Acoustic Model Discussions

How to add romanian language
User: bogdan
Date: 2/15/2009 4:58 am
Views: 7447
Rating: 4

Hi,

I would like to add the Romanian language to VoxForge, but I don't know where to start.

First, I want to say that Romanian is a very phonetic language, and I wonder whether there is an easy way to add it to VoxForge (Julian/Sphinx or anything else).
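To illustrate what "very phonetic" could mean in practice, here is a rough sketch of rule-based spelling-to-phoneme conversion, the kind of thing a first pronunciation dictionary might start from. The letter table and the `to_sampa` helper are my own simplified assumptions (SAMPA-style symbols, no diphthongs, no final-i palatalization), not anything VoxForge, Julius, or Sphinx provides.

```python
import unicodedata

# Simplified, hypothetical letter-to-SAMPA table for Romanian (an assumption,
# not an official mapping). Multi-phone values are space separated.
LETTERS = {
    "a": "a", "ă": "@", "â": "1", "î": "1", "b": "b", "c": "k", "d": "d",
    "e": "e", "f": "f", "g": "g", "h": "h", "i": "i", "j": "Z", "k": "k",
    "l": "l", "m": "m", "n": "n", "o": "o", "p": "p", "r": "r", "s": "s",
    "ș": "S", "t": "t", "ț": "ts", "u": "u", "v": "v", "x": "k s", "z": "z",
}
FRONT_VOWELS = "ei"


def to_sampa(word):
    """Return a list of SAMPA-style phones for one Romanian word (rough sketch)."""
    word = unicodedata.normalize("NFC", word.lower())
    # Accept the older cedilla forms as well as the comma-below letters.
    word = word.replace("ş", "ș").replace("ţ", "ț")
    phones = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1:i + 2]
        after = word[i + 2:i + 3]
        if ch in "cg" and nxt == "h" and after and after in FRONT_VOWELS:
            # "che/chi" and "ghe/ghi" keep the hard sound; the "h" is silent.
            phones.append("k" if ch == "c" else "g")
            i += 2
        elif ch in "cg" and nxt and nxt in FRONT_VOWELS:
            # "c" and "g" soften before "e" and "i".
            phones.append("tS" if ch == "c" else "dZ")
            i += 1
        elif ch in LETTERS:
            phones.extend(LETTERS[ch].split())
            i += 1
        else:
            i += 1  # letters outside the table (q, w, y, punctuation) are skipped
    return phones
```

For example, `to_sampa("țară")` gives `['ts', 'a', 'r', '@']`. A real dictionary would still need manual review for the exceptions this ignores.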

For more information about the Romanian language, please take a look here: http://en.wikibooks.org/wiki/Romanian/Pronunciation_and_alphabet

http://www.phon.ucl.ac.uk/home/sampa/romanian.htm

Thanks,

BogDan

--- (Edited on 2/15/2009 4:58 am [GMT-0600] by bogdan) ---

Re: How to add romanian language
User: nsh
Date: 2/18/2009 2:19 am
Views: 68
Rating: 5

Hi Bogdan

It's not a problem to add Romanian, but we need several hours of properly transcribed recordings. Record a book or a talk, and try to record as many speakers as you can. Once you have enough data, it will be easy to add everything else.

--- (Edited on 2/18/2009 2:19 am [GMT-0600] by nsh) ---

Re: How to add romanian language
User: bogdan
Date: 2/18/2009 4:17 am
Views: 71
Rating: 6

Hi

I want to thank you for your response. I can ask some friends from a radio station; they have many hours of recordings, and they have the transcriptions too. Will this help me add the Romanian language?

What audio format do I need?

Should the transcription be like subtitles (with timings), or only the plain text?

Thanks,

BogDan

--- (Edited on 2/18/2009 4:17 am [GMT-0600] by bogdan) ---

Re: How to add romanian language
User: nsh
Date: 2/18/2009 3:34 pm
Views: 2782
Rating: 6

> I want to thank you for your response. I can ask some friends from a radio station; they have many hours of recordings, and they have the transcriptions too. Will this help me add the Romanian language?

We accept only free, GPL-licensed audio. Recordings from a radio station could have license issues. Also, there might be issues with speech variability. If you have nothing else, it would still be nice to use the radio recordings.

> What audio format do I need?

Check the FAQ

http://www.voxforge.org/home/docs/faq/faq/what-kind-of-audio-formats-is-voxforge-looking-for#E95ZmEBb3zy2k4HutEWUxw

> Should the transcription be like subtitles (with timings), or only the plain text?

Check the FAQ

http://www.voxforge.org/home/docs/faq/faq/what-is-transcribed-or-annotated-speech-audio-file#o49C1POV3O54wFLNGupEhQ

And the examples

http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Original/48kHz_16bit/


Basically, the audio should be split into chunks of 5-20 words, each with a proper transcription.
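As a rough illustration of the 5-20 word guideline, a small check like the one below could flag chunks that are too short or too long before submission. It assumes one line per audio chunk, with an utterance id followed by the transcription; the ids (`ro_0001`, ...) and the `check_prompts` helper are hypothetical, so adapt them to whatever layout the FAQ and the example files actually use.

```python
def check_prompts(lines, min_words=5, max_words=20):
    """Return (utterance_id, word_count) for every prompt outside the range.

    Assumes each non-empty line looks like: "<utterance_id> WORD WORD ..."
    """
    problems = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        utt_id, words = parts[0], parts[1:]
        if not min_words <= len(words) <= max_words:
            problems.append((utt_id, len(words)))
    return problems


# Hypothetical sample prompts: the second one is too short.
sample = [
    "ro_0001 ACEASTA ESTE O PROPOZITIE DE TEST",
    "ro_0002 PREA SCURT",
]
print(check_prompts(sample))
```

With the sample above, only `ro_0002` is reported, since it has 2 words.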

--- (Edited on 2/18/2009 3:34 pm [GMT-0600] by nsh) ---
