VoxForge
Hi Timo,
Thanks for your updates to Richard's translations.
With respect to your comments about German copyright law and the impossibility of assigning copyright in Germany: my approach has been to ask users to assign the fullest possible 'title' (if that is even possible in a copyright context) in their submission to the Free Software Foundation, and to hope that, if it ever became an issue, a judge would look to intent and essentially grant the FSF an irrevocable license. Worst case, the assignment is ruled invalid and we fall back on the GPL.
BTW - the German version of the speech submission application is live. Please take a look at it and let me know if any changes are required. I just used some of Ralf's early prompts.
Ken
Hi Richard,
The German localization for Speech Submission is live ... let me know if anything needs to be changed.
thanks,
Ken
Hi Ralf,
Thanks.
Robin noticed this a few weeks ago ... I just have not had a chance to look into it.
It seems to occur only on Windows. Everything displays OK on Linux (FC6) ... so much for Java being write once, run anywhere :)
See ticket 321 - Windows: SpeechSubmission app for German - umlauts not displaying properly.
Ken
Hi Ralf,
One other thing ... are the prompts I used OK? Would there be another set that would be better (assuming the characters are fixed)?
thanks,
Ken
Hi Ralf,
thanks for the feedback.
>language model or acoustic model - I don't know the difference
From the VoxForge Tutorial:
All Speech Recognition Engines ("SRE"s) are made up of the following components:
- Language Model or Grammar - Language Models contain a very large list of words and their probability of occurrence in a given sequence. They are used in dictation applications. Grammars are a much smaller file containing sets of predefined combinations of words. Grammars are used in IVR or desktop Command and Control applications. Each word in a Language Model or Grammar has an associated list of phonemes (which correspond to the distinct sounds that make up a word).
- Acoustic Model - Contains a statistical representation of the distinct sounds that make up each word in the Language Model or Grammar. Each distinct sound corresponds to a phoneme.
- Decoder - Software program (like Sphinx, Julius, HTK's HVite) that takes the sounds spoken by a user and searches the Acoustic Model for the equivalent sounds. When a match is made, the Decoder determines the phoneme corresponding to the sound. It keeps track of the matching phonemes until it reaches a pause in the user's speech. It then searches the Language Model or Grammar file for the equivalent series of phonemes. If a match is made, it returns the text of the corresponding word or phrase to the calling program.
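To make the last step a bit more concrete, here is a toy Java sketch of the lookup the Decoder does once it has collected phonemes up to a pause. It is only an illustration: the class name ToyDecoder, its methods, and the phoneme symbols are all made up for this post, and a real decoder such as Julius or HVite scores audio statistically against the Acoustic Model rather than doing an exact match like this.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only (hypothetical class, not part of any real SRE).
final class ToyDecoder {
    // Stand-in for the Grammar's word list: each word with its phoneme list.
    // The phoneme symbols used here are invented for the example.
    private final Map<String, List<String>> lexicon = new LinkedHashMap<>();

    void addWord(String word, String... phonemes) {
        lexicon.put(word, Arrays.asList(phonemes));
    }

    // Given the phonemes collected up to a pause, return the word whose
    // phoneme list matches exactly, or null when nothing matches.
    String decode(List<String> heardPhonemes) {
        for (Map.Entry<String, List<String>> entry : lexicon.entrySet()) {
            if (entry.getValue().equals(heardPhonemes)) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyDecoder decoder = new ToyDecoder();
        decoder.addWord("ja", "j", "a");
        decoder.addWord("nein", "n", "ai", "n");
        // Pretend the Acoustic Model already turned the audio into phonemes.
        System.out.println(decoder.decode(Arrays.asList("n", "ai", "n")));
    }
}
```

The point is just the division of labour: the Acoustic Model gets you from sound to phonemes, and the Language Model or Grammar gets you from phonemes back to text.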
>So why not integrate all of them into the VoxForge speech submission application?
Unfortunately, we are getting to the point where I need to create separate builds of the SpeechSubmission app for each language; otherwise the downloadable application will get too big. I will add this as an RFE in Trac.
Ken
Hi Ralf,
>How is it possible to use this pronunciation lexicon to create a first edition of the German acoustic model?
The VoxForge Tutorial shows how to do it for English. You should be able to create a workable triphone acoustic model by following Steps 1-9, using German prompts and a German pronunciation dictionary.
To be able to complete Step 10 and create tied-state acoustic models you need a German tree.hed script. For more information on how to create a tree.hed file for a new language, see the following links:
>Is there any one who can do this job?
Unfortunately, I can't do this right now. My current focus is segmenting all the LibriVox audiobook submissions - some date back to June of last year :( , and squeezing in another release of the speech submission app (for Italian and Russian). So it will be a while before I can look at this.
> It could be a workaround to eliminate those sentences which contain
>special characters of the German language.
Thanks for the suggestion (I like easy workarounds ...) but there must be an easy way to address this in Java - some Unicode setting that I have missed ...
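One thing I want to rule out first is the platform default encoding. This is only a guess at the cause, not a confirmed diagnosis: if the prompts are stored as UTF-8 but read without naming a charset, Java falls back to the platform default, which on Windows is often a single-byte code page, and that would mangle umlauts. A minimal sketch of the idea follows; the class name PromptEncodingDemo and the method readFirstLine are made up for this post and are not taken from the SpeechSubmission source, and ISO-8859-1 is used only as a stand-in for a Western European default.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the SpeechSubmission app.
final class PromptEncodingDemo {
    // Decode the first line of a stream with an explicitly named charset
    // instead of relying on the platform default.
    static String readFirstLine(InputStream in, Charset charset) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "Gruesse" with u-umlaut and sharp s, written with escapes so the
        // source file's own encoding cannot interfere with the demo.
        String prompt = "Gr\u00fc\u00dfe";
        byte[] utf8Bytes = prompt.getBytes(StandardCharsets.UTF_8);

        String good = readFirstLine(
            new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        // Assumption: ISO-8859-1 stands in for a single-byte Windows default.
        String bad = readFirstLine(
            new ByteArrayInputStream(utf8Bytes), StandardCharsets.ISO_8859_1);

        System.out.println(prompt.equals(good)); // prints true
        System.out.println(prompt.equals(bad));  // prints false
    }
}
```

If the app turns out to read the prompts through a plain FileReader or an InputStreamReader without a charset argument, passing UTF-8 explicitly would be the first thing to try. If the text is decoded correctly and the problem is the display font on Windows, this will not help.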
Ken