German

Re: Localizing the SpeechSubmission App to German
User: kmaclean
Date: 1/11/2008 10:10 pm
Views: 369
Rating: 30

Hi Timo,

Thanks for your updates to Richard's translations.

With respect to your comments about German copyright law and the impossibility of assigning copyright in Germany: my approach has been to ask users to assign the fullest possible 'title' (if that is even possible in a copyright context) in their submission to the Free Software Foundation, in the hope that, if it ever became an issue, a judge would look to intent and essentially grant the FSF an irrevocable license.  Worst case, the assignment is ruled invalid and we fall back on the GPL.

BTW - the German version of the speech submission application is live.  Please take a look at it and let me know if any changes are required.  I just used some of Ralf's early prompts.

Ken 

 


Re: Localizing the SpeechSubmission App to German
User: kmaclean
Date: 1/13/2008 1:28 pm
Views: 281
Rating: 31

Hi Richard,

The German localization for Speech Submission is live ... let me know if anything needs to be changed.

thanks,

Ken 

Re: Localizing the SpeechSubmission App to German
User: atterer
Date: 1/13/2008 2:14 pm
Views: 521
Rating: 31
Hi Ken, it all looks fine to me! :-)
German "Sonderzeichen" (ä, ö, ü, ß)
User: ralfherzog
Date: 2/8/2008 1:40 pm
Views: 287
Rating: 26
Hello,

I have just listened to a recording that had been submitted using the speech submission application (German localization).  The quality of the recording was very good.

But there is one small thing: on my computer, the German "Sonderzeichen" (ä, ö, ü, ß) aren't displayed correctly.  I am using Windows XP Professional (English language).  It would be better if the German speech submission application displayed those graphemes correctly.

And by the way, thanks for using my prompts!  

Greetings, Ralf
Re: German "Sonderzeichen" (ä, ö, ü, ß)
User: kmaclean
Date: 2/8/2008 5:41 pm
Views: 314
Rating: 31

Hi Ralf,

Thanks.

Robin noticed this a few weeks ago ... I just have not had a chance to look into it.

It seems to occur only on Windows.  Everything seems to display OK on Linux (FC6) ... so much for Java being "write once, run anywhere" :)

See ticket 321 - Windows: SpeechSubmission app for German - umlauts not displaying properly.

Ken 

 

Re: German "Sonderzeichen" (ä, ö, ü, ß)
User: kmaclean
Date: 2/8/2008 5:44 pm
Views: 257
Rating: 25

Hi Ralf,

One other thing ... are the prompts I used OK?  Would there be another set that would be better (assuming the characters are fixed)?

thanks,

Ken 

 

integration of prompts (de1, de2, de3, ..., de100)
User: ralfherzog
Date: 2/9/2008 6:59 pm
Views: 1487
Rating: 32
Hi Ken,

OK, so you already knew about the problem with the special characters of the German language.

In my opinion, all of my prompts should be OK. So if you want, you can implement all of my prompts (de1, de2, de3, ..., de100).  At the moment, I am preparing to submit more prompts.  It should be possible to build a reasonable first statistical model (language model or acoustic model - I don't know the difference) of the German language, at least I hope so.

I try to submit normal sentences of the German language.  Most of those sentences should be of a medium level - not too easy and not too complicated. That means I'm trying to cover a lot of situations, and a lot of words.  And those words should have a distribution that is typical for the German language.  To achieve this goal, it is necessary to submit many more prompts than I have already submitted.  I will continue the work.  And I hope that other speakers will follow.  Dictating the prompts with Dragon NaturallySpeaking and editing them was a lot of work.

There shouldn't be major mistakes in my prompts.  It would be good if other people used my prompts, so that they don't have to create their own.  I have done the first steps.  So this should be a good basis.

My prompts are meant to form a coherent set.  So why not integrate all of them into the VoxForge speech submission application?

Greetings, Ralf
Re: integration of prompts (de1, de2, de3, ..., de100)
User: speechsubmission
Date: 2/11/2008 12:28 pm
Views: 404
Rating: 31

Hi Ralf,

thanks for the feedback. 

>language model or acoustic model - I don't know the difference

From the VoxForge Tutorial:

All Speech Recognition Engines ("SRE"s) are made up of the following components:

  • Language Model or Grammar - Language Models contain a very large list of words and their probability of occurrence in a given sequence.  They are used in dictation applications.  A Grammar is a much smaller file containing sets of predefined combinations of words.  Grammars are used in IVR or desktop Command and Control applications.   Each word in a Language Model or Grammar has an associated list of phonemes (which correspond to the distinct sounds that make up a word).
  • Acoustic Model - Contains a statistical representation of the distinct sounds that make up each word in the Language Model or Grammar.  Each distinct sound corresponds to a phoneme.
  • Decoder - Software program (like Sphinx, Julius, HTK's HVite) that takes the sounds spoken by a user and searches the Acoustic Model for the equivalent sounds.  When a match is made, the Decoder determines the phoneme corresponding to the sound.  It keeps track of the matching phonemes until it reaches a pause in the user's speech.  It then searches the Language Model or Grammar file for the equivalent series of phonemes.  If a match is made, it returns the text of the corresponding word or phrase to the calling program.
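
As a rough illustration of the last step only (looking up a series of phonemes in a Grammar), here is a minimal Java sketch.  The class name ToyDecoder, its methods, and the phoneme symbols are all made up for this example and are not taken from the tutorial or from any real engine.  A real decoder such as Julius scores sounds statistically against the Acoustic Model; it does not do the exact matching shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy illustration only: a "grammar" in which every word has a phoneme list,
// plus a lookup that maps a recognized phoneme sequence back to a word.
final class ToyDecoder {
    // Hypothetical grammar: word -> phonemes (symbols are invented for the example).
    private final Map<String, List<String>> grammar = new LinkedHashMap<>();

    void addWord(String word, String... phonemes) {
        grammar.put(word, Arrays.asList(phonemes));
    }

    // Returns the first word whose phoneme list equals the given sequence exactly.
    Optional<String> lookup(List<String> phonemes) {
        for (Map.Entry<String, List<String>> entry : grammar.entrySet()) {
            if (entry.getValue().equals(phonemes)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }
}
```

The Acoustic Model stage (turning audio into phonemes) is left out entirely; the sketch only shows why each word in the Grammar needs its phoneme list.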

>So why not integrate all of them into the VoxForge speech submission application?

Unfortunately, we are getting to the point where I need to create separate builds of the SpeechSubmission app for each language; otherwise the downloadable application will get too big.  I will add this as an RFE in Trac.

Ken 

eliminating prompts with special characters
User: ralfherzog
Date: 2/14/2008 5:47 pm
Views: 223
Rating: 21
Hello Ken,

A few weeks ago, Timo generated the first edition of the German pronunciation lexicon.  This was a very important step, as we all know.  How is it possible to use this pronunciation lexicon to create a first edition of the German acoustic model?  Is there anyone who can do this job?

Creating separate builds of the speech submission application for each language is probably a lot of work.  It could be a workaround to eliminate those sentences which contain special characters of the German language.
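
A minimal Java sketch of that workaround, assuming the prompts are available as a list of strings.  The class and method names (PromptFilter, isAsciiOnly, keepAsciiOnly) are hypothetical and not part of the speech submission application.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the workaround: drop every prompt that contains a character
// outside plain ASCII (which includes the umlauts and the sharp s).
final class PromptFilter {
    // True if every character of the prompt is in the 7-bit ASCII range.
    static boolean isAsciiOnly(String prompt) {
        for (int i = 0; i < prompt.length(); i++) {
            if (prompt.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    // Returns a new list with only the ASCII-only prompts, in their original order.
    static List<String> keepAsciiOnly(List<String> prompts) {
        List<String> kept = new ArrayList<>();
        for (String prompt : prompts) {
            if (isAsciiOnly(prompt)) {
                kept.add(prompt);
            }
        }
        return kept;
    }
}
```

The filter also removes any other non-ASCII character, so it would have to be checked against the real prompt files before use.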

Greetings, Ralf
Re: eliminating prompts with special characters
User: kmaclean
Date: 2/14/2008 9:47 pm
Views: 228
Rating: 23

Hi Ralf,

>How is it possible to use this pronunciation lexicon to create a first edition of the German acoustic model?

The VoxForge Tutorial shows how to do it for English.  You should be able to create a workable triphone acoustic model by following Steps 1-9, using German prompts and a German pronunciation dictionary.

To be able to complete Step 10 and create tied-state acoustic models you need a German tree.hed script.  For more information on how to create a tree.hed file for a new language, see the following links:

>Is there anyone who can do this job?

Unfortunately, I can't do this right now.  My current focus is segmenting all the LibriVox audiobook submissions -  some date back to June of last year  :(  , and squeezing in another release of the speech submission app (for Italian and Russian).  So it will be a while before I can look at this.

>It could be a workaround to eliminate those sentences which contain special characters of the German language.

Thanks for the suggestion (I like easy workarounds ...), but there must be an easy way to address this in Java - some Unicode setting that I have missed ...
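
One common cause of this kind of Windows-only problem is reading text with the platform default charset, which differs between Windows and Linux.  Whether that is what is behind ticket 321 has not been established in this thread, so the following is only a sketch under that assumption.  It decodes prompt text as UTF-8 explicitly; the class and method names (PromptReader, readPrompts) are hypothetical, and StandardCharsets requires Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch: read prompt lines with an explicit charset instead of relying on
// the platform default. Assumes the prompt data is stored as UTF-8.
final class PromptReader {
    static List<String> readPrompts(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers are not forced to handle IOException.
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

If the prompts are instead loaded from .properties files, or if the display font lacks the glyphs, a different fix would be needed.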

Ken 
