Re: Sphinx4 + German VoxForge AcousticModel - Maybe a newbie problem

Sphinx4 + German VoxForge AcousticModel - Maybe a newbie problem

User: Andreas
Date: 11/1/2013 6:16 am

Views: 11269
Rating: 0

Hello!

I just started using Sphinx4, and I want to use the German acoustic model from VoxForge (http://www.voxforge.org/de/Downloads).

As a first test, I want to get the LatticeDemo in Sphinx4 working with German.

After downloading the Sphinx4 source and the VoxForge acoustic model, I duplicated the file WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar and named the copy TEST_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar. I replaced the files inside the TEST_8gau_13... jar with those of the German VoxForge acoustic model (files like means, variances, mdef, cmudict).

Finally, I changed config.xml (replacing the WSJ_8gau.. file with the duplicated and modified TEST_8gau... file):

    <component name="dictionary" type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
        <property name="dictionaryPath" value="resource:/TEST_8gau_13dCep_16k_40mel_130Hz_6800Hz/dict/cmudict.0.6d"/>
        <property name="fillerPath" value="resource:/TEST_8gau_13dCep_16k_40mel_130Hz_6800Hz/noisedict"/>
        <property name="wordReplacement" value="&lt;sil&gt;"/>
        <property name="unitManager" value="unitManager"/>
    </component>
    <component name="trigramModel" type="edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel">
        <property name="unigramWeight" value=".5"/>
        <property name="maxDepth" value="1"/>
        <property name="logMath" value="logMath"/>
        <property name="dictionary" value="dictionary"/>
        <property name="location" value="./models/language/en-us.lm.dmp"/>
    </component>
    <component name="wsjLoader" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
        <property name="logMath" value="logMath"/>
        <property name="unitManager" value="unitManager"/>
        <property name="location" value="resource:/TEST_8gau_13dCep_16k_40mel_130Hz_6800Hz"/>
    </component>

I did not replace the trigramModel with a German version. Where can I find a German trigram model? Is that the problem? I could not remove that part of the configuration for testing purposes...
I am also unsure about the German dictionary (renamed to cmudict.0.6d). For example, it contains the following:

A	  qq aa: 
AB	  qq a p 
ABBEDINGUNG	  qq a p b @ d ii nn uu nn

Is the pronunciation format/syntax correct?
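One way to sanity-check a dictionary in this format is to confirm that each line holds a word followed by at least one phone, and that every phone belongs to the phone set the acoustic model was trained with. The following is only a minimal sketch: `DictionaryCheck` and `findProblems` are hypothetical names (not part of Sphinx4), and the phone set in `main` is an assumption built from the three entries quoted above, not the model's real phone list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Sphinx4.
final class DictionaryCheck {

    /** Describes every line that has no phones or uses a phone outside knownPhones. */
    static List<String> findProblems(List<String> dictLines, Set<String> knownPhones) {
        List<String> problems = new ArrayList<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            // First field is the word, the remaining fields are its phones.
            String[] fields = trimmed.split("\\s+");
            if (fields.length < 2) {
                problems.add(fields[0] + ": no phones");
                continue;
            }
            for (int i = 1; i < fields.length; i++) {
                if (!knownPhones.contains(fields[i])) {
                    problems.add(fields[0] + ": unknown phone '" + fields[i] + "'");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Entries copied from the post above (tab, then spaces, then phones).
        List<String> lines = List.of(
                "A\t  qq aa: ",
                "AB\t  qq a p ",
                "ABBEDINGUNG\t  qq a p b @ d ii nn uu nn");
        // Assumed phone set: just the phones seen in those three entries.
        Set<String> phones = Set.of("qq", "aa:", "a", "p", "b", "@", "d", "ii", "nn", "uu");
        System.out.println(findProblems(lines, phones));
    }
}
```

In practice the phone set would have to come from the model itself (for example from its mdef file), since a dictionary phone the model does not know cannot be matched to any acoustic unit.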

After starting LatticeDemo.java, I get a lot of warnings like the following:

The dictionary is missing a phonetic transcription for the word 'recommended'

OK, something is wrong. I get that, but what?

I tested my buggy configuration with a recorded audio file (16-bit, 16,000 Hz, mono), but it did not recognize a single word :(

I would be very happy if someone could help me...

Many thanks.

Re: Sphinx4 + German VoxForge AcousticModel - Maybe a newbie problem

User: nsh
Date: 11/1/2013 6:29 am

Views: 567
Rating: 1

> Is the pronunciation format/syntax correct?

These particular entries are correct, but the dictionary as a whole is very dirty: it has many broken pronunciations and must be fixed.

> I did not replace the trigramModel with a German version. Where can I find a German trigram model? Is that the problem? I could not remove that part of the configuration for testing purposes...

Yes, it is a problem. A test LM is in the model archive at etc/voxforge_de_sphinx.lm. You can convert it to DMP with sphinx_lm_convert. You can also create your own model by following the CMUSphinx tutorial.
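The "missing a phonetic transcription" warnings come from words that the language model knows but the dictionary does not, which is what happens when an English LM is paired with a German dictionary. A rough way to check a new LM against the dictionary before running the demo is sketched below. It assumes the LM is a text file in ARPA format (a `\1-grams:` section with one `logprob word [backoff]` entry per line) and that words are matched exactly, including case. `LmVocabularyCheck` and its methods are hypothetical names, not part of Sphinx4.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Sphinx4.
final class LmVocabularyCheck {

    /** Collects the words listed in the \1-grams: section of an ARPA-format LM. */
    static List<String> unigramWords(List<String> arpaLines) {
        List<String> words = new ArrayList<>();
        boolean inUnigrams = false;
        for (String line : arpaLines) {
            String t = line.trim();
            if (t.startsWith("\\")) {
                // Section markers: \data\, \1-grams:, \2-grams:, \end\ ...
                inUnigrams = t.equals("\\1-grams:");
                continue;
            }
            if (!inUnigrams || t.isEmpty()) {
                continue;
            }
            String[] fields = t.split("\\s+"); // logprob, word, optional backoff
            if (fields.length >= 2) {
                words.add(fields[1]);
            }
        }
        return words;
    }

    /** Returns LM words with no dictionary entry (exact, case-sensitive match). */
    static List<String> missingFromDictionary(List<String> arpaLines, List<String> dictLines) {
        Set<String> dictWords = new HashSet<>();
        for (String line : dictLines) {
            String t = line.trim();
            if (!t.isEmpty()) {
                dictWords.add(t.split("\\s+")[0]); // first field is the word
            }
        }
        List<String> missing = new ArrayList<>();
        for (String word : unigramWords(arpaLines)) {
            // Assumption: sentence markers are not expected in the main dictionary.
            if (word.equals("<s>") || word.equals("</s>")) {
                continue;
            }
            if (!dictWords.contains(word)) {
                missing.add(word);
            }
        }
        return missing;
    }
}
```

Reading both files with `java.nio.file.Files.readAllLines` and passing the results to `missingFromDictionary` should give an empty list when the LM and the dictionary belong together.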

Re: Sphinx4 + German VoxForge AcousticModel - Maybe a newbie problem

User: Andreas
Date: 11/17/2013 10:04 am

Views: 114
Rating: 0

Many thanks! I converted the file successfully!

Now the test.wav that is included in the acoustic model archive works like a charm.

After that, I started my first voice recording test.

I recorded the number nine (in German "neun"; the phonetic transcription is "N OY N") twice with my smartphone.

I uploaded the WAV file (http://www.xup.in/dl,21058232/neun-neun-g2-recorded.wav/). The file is recorded as 16000 Hz, mono, 16-bit.
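Since the model name (..._16k_...) points to 16 kHz audio, it can be worth confirming what the recorder really wrote rather than trusting its settings. A small sketch using the standard `javax.sound.sampled` API is shown below. `WavFormatCheck` and `mismatches` are hypothetical names, and the three expected values are simply the ones stated in this post.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper for illustration; not part of Sphinx4.
final class WavFormatCheck {

    /** Lists every property that differs from 16000 Hz, 16-bit, mono. */
    static List<String> mismatches(AudioFormat format) {
        List<String> problems = new ArrayList<>();
        if (format.getSampleRate() != 16000f) {
            problems.add("sample rate is " + format.getSampleRate() + " Hz");
        }
        if (format.getSampleSizeInBits() != 16) {
            problems.add("sample size is " + format.getSampleSizeInBits() + " bits");
        }
        if (format.getChannels() != 1) {
            problems.add("channel count is " + format.getChannels());
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: WavFormatCheck <file.wav>");
            return;
        }
        // Reads only the header information of the audio file.
        AudioFormat format = AudioSystem.getAudioFileFormat(new File(args[0])).getFormat();
        List<String> problems = mismatches(format);
        System.out.println(problems.isEmpty() ? "format OK: " + format : "mismatch: " + problems);
    }
}
```

A matching header only rules out a format mismatch. It says nothing about noise, clipping, or how well the recording suits the model.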

I used the LatticeDemo and replaced the test.wav with my recorded wav.

The expected result would be: neun (N OY N) neun (N OY N)
But the actual result is: neun (N OY N) ein (AI N)

Only the first word is recognized correctly.
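To compare attempts across words and microphones, a result like this can be turned into a number. The sketch below computes a word error rate as the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. `WordErrorRate` is a hypothetical helper written for this thread, not a Sphinx4 class.

```java
// Hypothetical helper for illustration; not part of Sphinx4.
final class WordErrorRate {

    /** Splits on whitespace; an empty or blank string yields no tokens. */
    static String[] tokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Word-level Levenshtein distance between reference and hypothesis. */
    static int wordErrors(String reference, String hypothesis) {
        String[] ref = tokens(reference);
        String[] hyp = tokens(hypothesis);
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) {
            d[i][0] = i; // delete all remaining reference words
        }
        for (int j = 0; j <= hyp.length; j++) {
            d[0][j] = j; // insert all remaining hypothesis words
        }
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int substitution = d[i - 1][j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int deletion = d[i - 1][j] + 1;
                int insertion = d[i][j - 1] + 1;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return d[ref.length][hyp.length];
    }

    /** Errors divided by the number of reference words. */
    static double wordErrorRate(String reference, String hypothesis) {
        int referenceLength = tokens(reference).length;
        if (referenceLength == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        return (double) wordErrors(reference, hypothesis) / referenceLength;
    }

    public static void main(String[] args) {
        // Expected vs. recognized words from the recording described above.
        System.out.println(wordErrorRate("neun neun", "neun ein")); // prints 0.5
    }
}
```

One substituted word out of two gives 0.5 here. A larger set of recordings with known transcripts would give a steadier figure than a single file.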

My question is: where should I start to improve the result?
I do not know what the next step is to "fix" the problem.

I also tried other words and different microphones, but the result is wrong most of the time :(

Is it a general problem with the German acoustic model?

I uploaded the reduced log file for the posted WAV file to Google Drive, with comments. I would be grateful if someone could help me:
https://docs.google.com/document/d/1LaQGg7iq7SRm-6ucZRYtd0I5PwsMAc4GNnRCWZDPNS0/edit?usp=sharing

Many thanks

Andreas

Re: Sphinx4 + German VoxForge AcousticModel - Maybe a newbie problem

User: nsh
Date: 11/17/2013 3:22 pm

Views: 143
Rating: 0

> My question is: where should I start to improve the result?

Start by reading the CMUSphinx tutorial and the FAQ:

http://cmusphinx.sourceforge.net/wiki/faq#qwhy_my_accuracy_is_poor

> I do not know what the next step is to "fix" the problem.

You need to retrain the German VoxForge model with a proper dictionary.

> Is it a general problem with the German acoustic model?

Yes, the German model is pretty inaccurate.

> I uploaded the reduced log file for the posted WAV file to Google Drive, with comments. I would be grateful if someone could help me.

Unfortunately, you didn't grant access to the file you shared.

Re: Sphinx4 + German VoxForge AcousticModel - Maybe a newbie problem

User: Andreas
Date: 12/8/2013 7:35 am

Views: 5684
Rating: 0

Thanks, that helped
