Adapting the german voxforge model
User: Martin112
Date: 4/3/2017 12:41 pm


I am trying to adapt the German VoxForge model (cmusphinx-de-voxforge-5.2.tar.gz, with the matching language model and dictionary).

I followed the adaptation guide on the CMU Sphinx homepage, using MLLR transformation.

Then I tested the result using pocketsphinx_batch and

Unfortunately, the recognition accuracy dropped sharply, from 59% to 19%, so I am now looking for my mistake.

I took the following steps to adapt the model:

1. I made 30 German recordings (16 kHz, mono) and created the corresponding .fileids and .transcription files.

2. Creating acoustic feature files:

    sphinx_fe -argfile de-de/feat.params -samprate 16000 -c adapt30.fileids -di . -do . -ei wav -eo mfc -mswav yes

   (I renamed the acoustic model directory to de-de)

3. Accumulating observation counts: 

    bw -hmmdir de-de -moddeffn de-de/mdef -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -dictfn voxforge.dic -ctlfn adapt30.fileids -lsnfn adapt30.transcription -accumdir .

   (I renamed the dictionary to voxforge.dic)

4. Estimating the MLLR transform:

    mllr_solve -meanfn de-de/means -varfn de-de/variances -outmllrfn mllr_matrix -accumdir .

5. Updating the means file:

    mllr_transform -inmeanfn de-de/means -outmeanfn de-de/means-new -mllrmat mllr_matrix 

   (and renamed means-new to means)

For testing I used the command:

    pocketsphinx_batch -adcin yes -cepdir wav -cepext .wav -ctl test.fileids -lm voxforge.lm.bin -dict voxforge.dic -hmm de-de -hyp test.hyp


Can you tell me if something is wrong with this approach? I'm aware that 30 recordings are not much, but as I understand it, the recognition rate should not drop this much.

I would be grateful for any hints.


Thanks in advance



Re: Adapting the german voxforge model
User: nsh
Date: 4/3/2017 6:15 pm

You forgot

    -lda de-de/feature_transform

in step 3.
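For reference, the corrected step 3 would presumably look like the sketch below: the same bw call as in the original post, with only the -lda option added. The paths (de-de, voxforge.dic, adapt30.*) are taken from the original post; the guard that prints the command instead of running it when bw or the transform file is missing is just an addition for illustration.

```shell
# Step 3 again, this time with the LDA feature transform that ships with the model.
# All paths are the ones used in the original post.
BW_CMD="bw -hmmdir de-de -moddeffn de-de/mdef -ts2cbfn .cont. -feat 1s_c_d_dd \
-cmn current -agc none -lda de-de/feature_transform \
-dictfn voxforge.dic -ctlfn adapt30.fileids -lsnfn adapt30.transcription -accumdir ."

# Run only if the tool and the transform file are present;
# otherwise just print the command that would be run.
if command -v bw >/dev/null 2>&1 && [ -f de-de/feature_transform ]; then
    $BW_CMD
else
    echo "$BW_CMD"
fi
```

Steps 4 and 5 then need to be rerun on the new accumulators. If de-de/means was already overwritten in step 5, it is probably safest to restore the original means file first.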


Re: Adapting the german voxforge model
User: Martin112
Date: 4/8/2017 10:11 am

Many thanks, the recognition accuracy has now increased to 65%.

I guess this is a normal result for such a small number of recordings?

Re: Adapting the german voxforge model
User: nsh
Date: 4/9/2017 1:19 pm

It is hard to give advice on accuracy without seeing the data and understanding the whole situation. You need to provide the test set.

You can check our tutorial on tuning the accuracy for details.

Re: Adapting the german voxforge model
User: Martin112
Date: 4/24/2017 12:18 pm

I added an attachment with my training data and all generated files, with the exception of the model.


The result:

TOTAL Words: 40 Correct: 27 Errors: 14

TOTAL Percent correct = 67.50% Error = 35.00% Accuracy = 65.00%

TOTAL Insertions: 1 Deletions: 0 Substitutions: 13

(see the attachment for the complete result)
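As a sanity check, the percentages in the report can be recomputed from the raw counts. The sketch below assumes the usual word-alignment definitions (correct = words - substitutions - deletions, errors = insertions + deletions + substitutions, accuracy = 100 - error rate); these reproduce the figures above, but the scoring tool's exact formulas are not shown in this thread.

```shell
export LC_ALL=C   # keep "." as the decimal separator in awk output

# Counts copied from the report above.
words=40; ins=1; del=0; sub=13

correct=$((words - sub - del))   # reference words recognized correctly
errors=$((ins + del + sub))      # insertions also count as errors

pct_correct=$(awk -v c="$correct" -v n="$words" 'BEGIN { printf "%.2f", 100 * c / n }')
pct_error=$(awk -v e="$errors" -v n="$words" 'BEGIN { printf "%.2f", 100 * e / n }')
accuracy=$(awk -v e="$errors" -v n="$words" 'BEGIN { printf "%.2f", 100 - 100 * e / n }')

echo "Correct: $correct Errors: $errors"
echo "Percent correct = $pct_correct% Error = $pct_error% Accuracy = $accuracy%"
```

With only 40 words, a single misrecognized word shifts the result by 2.5 percentage points, so small differences between runs on this test set say very little.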


I also created bigger test sets (around 100 words), but the accuracy does not improve. So I would like to estimate whether it is worth testing with a few thousand words.

I am grateful for any help :)

Re: Adapting the german voxforge model
User: nsh
Date: 4/24/2017 3:57 pm

If you have more adaptation data, you should use MAP adaptation rather than MLLR adaptation.
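A rough sketch of the MAP step, reusing the accumulators written by bw in step 3. It assumes the model directory de-de contains mixture_weights and transition_matrices files, and de-de-adapt is a hypothetical name for the copy that receives the adapted parameters; adjust both to your setup.

```shell
# MAP adaptation: reads the accumulators in the current directory and writes
# updated means, variances, mixture weights and transition matrices.
# de-de-adapt is a hypothetical output directory (a copy of de-de).
MAP_CMD="map_adapt -moddeffn de-de/mdef -ts2cbfn .cont. \
-meanfn de-de/means -varfn de-de/variances \
-mixwfn de-de/mixture_weights -tmatfn de-de/transition_matrices \
-accumdir . \
-mapmeanfn de-de-adapt/means -mapvarfn de-de-adapt/variances \
-mapmixwfn de-de-adapt/mixture_weights -maptmatfn de-de-adapt/transition_matrices"

# Run only if the tool and the model directory are present;
# otherwise just print the command that would be run.
if command -v map_adapt >/dev/null 2>&1 && [ -d de-de ]; then
    [ -d de-de-adapt ] || cp -a de-de de-de-adapt   # keep the original model untouched
    $MAP_CMD
else
    echo "$MAP_CMD"
fi
```

Testing would then point pocketsphinx_batch at the adapted copy with -hmm de-de-adapt instead of -hmm de-de.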

I would also use a smaller language model that is more specific to your application. With a generic language model, recognition is not going to be very accurate.