Test Acoustic Model Using HTK

1. You need to tell HTK where all your feature vector files are located (those are the mfcc files you created in the last step). You do this with a script file.

Therefore, create a file called: test.scp
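
If you have many feature files, the script file can be generated rather than typed by hand. The following is a minimal sketch, assuming that test.scp simply lists one feature-file path per line and that your feature files end in '.mfc'; the function names, the directory name, and the file extension are illustrative assumptions, so adjust them to match the previous step.

```python
from pathlib import Path


def build_scp_text(feature_files):
    """Return script-file text: one feature-file path per line, sorted."""
    paths = sorted(Path(p).as_posix() for p in feature_files)
    return "".join(path + "\n" for path in paths)


def write_scp(feature_dir, scp_path="test.scp", pattern="*.mfc"):
    """Write the paths of all files in feature_dir matching pattern to scp_path.

    The '*.mfc' pattern is an assumption; use whatever extension
    your feature files actually have.
    """
    text = build_scp_text(Path(feature_dir).glob(pattern))
    Path(scp_path).write_text(text)
    return text
```

For example, calling write_scp("mfcc") from your 'voxforge/test' directory would write test.scp there, assuming your feature files are in a subdirectory named 'mfcc'.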

2. You also need a configuration file.  Create a file called 'config' in your 'voxforge/test' directory and add the following data:

TARGETKIND = MFCC_0_D_N_Z
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
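
As a reading aid for the two timing values above: HTK expresses times in units of 100 nanoseconds, so TARGETRATE and WINDOWSIZE can be converted to milliseconds as sketched below. The helper name is hypothetical and is not part of HTK.

```python
# HTK time parameters are given in units of 100 ns,
# so 10,000 units make up one millisecond.
HTK_UNITS_PER_MS = 10000.0


def htk_units_to_ms(value):
    """Convert an HTK time value (100 ns units) to milliseconds."""
    return value / HTK_UNITS_PER_MS


frame_shift_ms = htk_units_to_ms(100000.0)  # TARGETRATE: one frame every 10 ms
window_ms = htk_units_to_ms(250000.0)       # WINDOWSIZE: 25 ms analysis window
print(frame_shift_ms, window_ms)
```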

3. Next use HTK's HVite command to recognize the test data as follows:

a) If you created your Acoustic Model using the How-to or the Tutorial, execute the following command:

$HVite -A -D -T 1 -H macros -H hmmdefs -C config -S test.scp -l '*' -i recout.mlf -w wdnet -p 0.0 -s 5.0 ../lexicon/voxforge_lexicon tiedlist

b) If you adapted the VoxForge Speaker Independent Acoustic Models to your voice using the adaptation tutorial, then execute this command (because you adapted using HTK version 3.2.1, you must use the HTK-3.2.1 version of HVite):
$/htk/htk-3.2.1/HVite -A -D -T 1 -H macros -H hmmdefs -C config -S test.scp -l '*' -i recout.mlf -w wdnet -p 0.0 -s 5.0 ../lexicon/voxforge_lexicon tiedlist

This will create the following file: recout.mlf

4. Finally, run the following command to determine the actual recognition performance of the Acoustic Model:
$HResults -I testref.mlf tiedlist recout.mlf

which will display output similar to this (note: these are results for the 8kHz:16-bit VoxForge Speaker Independent Acoustic Model - build 396):

====================== HTK Results Analysis =======================
  Date: Thu Sep 14 14:11:46 2006
  Ref : testref.mlf
  Rec : recout.mlf
------------------------ Overall Results --------------------------
SENT: %Correct=60.00 [H=30, S=20, N=50]
WORD: %Corr=96.83, Acc=76.19 [H=183, D=0, S=6, I=39, N=189]
===================================================================

What this means is that:

  • for the line starting with SENT, there were 50 test sentences and 60% were correctly recognized. 
  • for the line starting with WORD, there were 189 words in total, of which 96.83% were recognized correctly. But because HVite recognized words that are not in the audio file (i.e. insertion errors), it only gets a 76.19% accuracy rating.
  • Count definitions:
    • D - Deletion Error
    • S - Substitution Error
    • I - Insertion Error
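
The two WORD percentages can be reproduced from the counts in the brackets. The sketch below assumes the usual HResults definitions, %Corr = H/N and Acc = (H - I)/N with N = H + D + S; the function name is illustrative only.

```python
def htk_scores(hits, deletions, substitutions, insertions):
    """Return (N, %Corr, Acc) from HResults-style counts."""
    n = hits + deletions + substitutions              # number of reference labels
    percent_correct = 100.0 * hits / n                # ignores insertion errors
    accuracy = 100.0 * (hits - insertions) / n        # penalizes insertion errors
    return n, percent_correct, accuracy


# WORD line from the sample output: H=183, D=0, S=6, I=39
n, corr, acc = htk_scores(183, 0, 6, 39)
print(f"N={n}, %Corr={corr:.2f}, Acc={acc:.2f}")
```

With the sample counts this gives N=189, %Corr=96.83 and Acc=76.19, matching the output above: the 39 insertions do not affect %Corr but lower Acc.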


Comments

How to do recognition by phoneme with HTK?
By person - 9/21/2008 - 1 Replies

In this tutorial you are doing recognition by words.
Is it possible to do recognition by phoneme using the same steps, changing the grammar and the dictionary, of course?

I tried to do that, but the result (recout.mlf) contained only the phoneme "sil" for each sentence in the test file.
Is the problem that each sentence begins and ends with a "sil"? And the recognition rate was 2%!

How do I do phoneme decoding in HTK? Is there a forum for that on this web site?

Please help me, and thanks in advance.

Julius performance
By DRF - 8/8/2008 - 1 Replies

I went through the steps to test the tutorial but found that the performance of Julius was considerably inferior to what was posted here. I have not been able to get above 50% word error rate.

In particular, the recognizer seems to have lots of difficulty in identifying a long stream of digits.  It is only able to recognize the first few digits.  My guess is that the settings in the Julian.jconf file might need changing.  Any suggestions?