Step 8 - Realigning the Training Data

Background 

This operation is similar to the HLEd word-to-phone mapping performed in Step 4. In this case, however, the HVite command can consider all pronunciations of each word (where a word has more than one pronunciation) and then output the pronunciation that best matches the acoustic data.
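To see which words give HVite a choice to make, it can help to list the dictionary entries that carry more than one pronunciation. The following is a minimal sketch, not part of the tutorial's toolchain: it assumes the dictionary uses one line per pronunciation in the form `WORD [OUTPUT] phone phone ...`, and the function name `find_multi_pronunciation_words` and the sample entries are made up for illustration.

```python
from collections import defaultdict


def find_multi_pronunciation_words(dict_text):
    """Return {word: [pronunciation, ...]} for words listed more than once."""
    pronunciations = defaultdict(list)
    for line in dict_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, rest = fields[0], fields[1:]
        # Drop the optional output symbol, e.g. [READ] or [].
        if rest and rest[0].startswith("["):
            rest = rest[1:]
        phones = " ".join(rest)
        if phones not in pronunciations[word]:
            pronunciations[word].append(phones)
    # Keep only the words that have at least two distinct pronunciations.
    return {w: p for w, p in pronunciations.items() if len(p) > 1}


# Hypothetical sample entries (assumed layout, not taken from the real dict file).
sample_dict = """\
READ [READ] r eh d sp
READ [READ] r iy d sp
SENT-END [] sil
THE [THE] dh ax sp
"""
variants = find_multi_pronunciation_words(sample_dict)
print(variants)
```

To run it on a real dictionary, pass the file contents instead of the sample string, for example `find_multi_pronunciation_words(open("dict").read())`.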

Steps 

Execute the HVite command as follows:

Linux:

HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config -H hmm7/macros -H hmm7/hmmdefs -i aligned.mlf -m -t 250.0 150.0 1000.0 -y lab -a -I words.mlf -S train.scp dict monophones1 > HVite_log

Windows:

HVite -A -D -T 1 -l * -o SWT -b SENT-END -C config -H hmm7/macros -H hmm7/hmmdefs -i aligned.mlf -m -t 250.0 150.0 1000.0 -y lab -a -I words.mlf -S train.scp dict monophones1 > HVite_log

This creates the aligned.mlf file.

Review the output of the HVite command very carefully. Catching errors here will save a lot of headaches later, because seemingly minor problems at this step sometimes show up as major errors in later steps, where they are very difficult to trace back to this point. Here is the log output from the above command: hvite_log. It is time well spent to review the log and make sure that HVite recognized all the words on each line of your prompts file.
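Besides reading the log, one check that can be automated is whether every utterance listed in train.scp actually received an entry in aligned.mlf, since HVite can leave an utterance out without reporting an error when pruning stops every token before the end. The sketch below is an illustration under stated assumptions: it assumes the usual MLF layout (a `#!MLF!#` header, one quoted entry line such as `"*/sample1.lab"` per utterance, labels, and a closing `.`), and the helper names and sample data are hypothetical.

```python
import os


def utterance_id(path):
    """Strip directories and the extension: 'train/mfcc/sample1.mfc' -> 'sample1'."""
    name = path.strip().strip('"').replace("\\", "/").split("/")[-1]
    return os.path.splitext(name)[0]


def missing_from_mlf(scp_text, mlf_text):
    """Return utterance ids listed in the .scp text but absent from the MLF text."""
    expected = [utterance_id(line) for line in scp_text.splitlines() if line.strip()]
    aligned = {
        utterance_id(line)
        for line in mlf_text.splitlines()
        if line.startswith('"')  # entry headers look like "*/sample1.lab"
    }
    return [utt for utt in expected if utt not in aligned]


# Hypothetical sample data: sample2 was dropped during alignment.
sample_scp = "train/mfcc/sample1.mfc\ntrain/mfcc/sample2.mfc\n"
sample_mlf = '#!MLF!#\n"*/sample1.lab"\nsil\ndh\nax\nsp\nsil\n.\n'
missing = missing_from_mlf(sample_scp, sample_mlf)
print(missing)
```

On real data, the same function can be called with the contents of train.scp and aligned.mlf. Any utterance it reports would need to be fixed or removed from train.scp before running HERest.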

Next, run HERest two more times:

HERest -A -D -T 1 -C config -I aligned.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm7/macros -H hmm7/hmmdefs -M hmm8 monophones1

The files created by this command are hmm8/macros and hmm8/hmmdefs.

HERest -A -D -T 1 -C config -I aligned.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm8/macros -H hmm8/hmmdefs -M hmm9 monophones1

The files created by this command are hmm9/macros and hmm9/hmmdefs.
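The two HERest passes differ only in their source and target directories, so they can be generated from one template. This is a small sketch of that idea, assuming HTK is installed and the files from the earlier steps are in the working directory; the helper name `herest_command` is made up for illustration, and the flags are copied unchanged from the commands in this step.

```python
def herest_command(src_dir, dst_dir):
    """Build the HERest argument list used in this step for one re-estimation pass."""
    return [
        "HERest", "-A", "-D", "-T", "1",
        "-C", "config",
        "-I", "aligned.mlf",
        "-t", "250.0", "150.0", "3000.0",
        "-S", "train.scp",
        "-H", f"{src_dir}/macros",
        "-H", f"{src_dir}/hmmdefs",
        "-M", dst_dir,
        "monophones1",
    ]


# Each pass reads the models written by the previous one.
passes = [("hmm7", "hmm8"), ("hmm8", "hmm9")]
commands = [herest_command(src, dst) for src, dst in passes]
for cmd in commands:
    print(" ".join(cmd))
    # To actually run a pass (requires HTK on the PATH and existing hmm8/hmm9 folders):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one shell string avoids quoting differences between Linux and Windows.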

Note: the monophone models created in hmm9 could already be used with Julius for speech recognition, but recognition accuracy can be greatly improved by using tied-state triphones (see the next sections).

 

Comments

By doaa - 5/3/2019 How should the best beam pruning threshold be chosen to obtain higher accuracy?

By mrageshrajan - 12/29/2016 - 2 Replies ERROR [+7332] CreateInsts: Cannot have successive Tee models

HTK
By faha001 - 5/28/2015 The command for creating the dictionary executed without errors, but no words are in the dictionary. How can I fix this?

HTK
By faha001 - 5/28/2015 I got this error for my lexicon file: word three out of order in dict lexicon.txt. How can I fix this?

By Shipra - 10/27/2014 HTK HVite command error. I got it, but the problem is that I have created an HMM model for the short pause 'sp', but 'sp' is not in my dictionary. How can I add 'sp' to my dictionary for isolated word recognition at the word level?

By tomvdh - 4/29/2014 - 1 Replies Hi, I got everything working up to here, but I am curious what an error in the log file would look like and what causes it. For me, the log is exactly the same as what we produced in Step 2 and what we recorded later. Is it possible, for example, that a word is missing in the log file? Thanks!

By Amber Afshan - 5/17/2013 I am trying the tutorial example in the book. When I do the step for realigning the training data, HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config -H hmm7/macros -H hmm7/hmmdefs -i aligned.mlf -m -t 250.0 150.0 1000.0 -y lab -a -I words.mlf -S train.scp dict monophones1, I get the following error: Error[+5013] Read String : String too long. What might be the problem here? I followed the procedure given here for the same error, on the dict and monophones1 files: http://www.ling.ohio-state.edu/~bromberg/htk_problems.html

By Yulan - 8/1/2012 - 1 Replies Hi guys, my problem has been solved!

By Yulan - 7/31/2012 - 1 Replies I am using HTK 3.4.1 in Ubuntu 12.04. I followed the guide in Vox Forge. However I got this error in re-aligning:

By Nj - 5/2/2012 - 1 Replies When I run the command $HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config -H hmm7/macros -H hmm7/hmmdefs -i aligned.mlf -m -t 250.0 150.0 1000.0 -y lab -a -I words.mlf -S train.scp dict monophones1 > HVite_log in the console, there is a message: Cannot open file2873.vec from train.scp.

By sunny - 2/17/2012 Hi, I have got the following error in the HERest command to get hmm8:

By mmm - 6/6/2010 - 1 Replies HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config -H hmm7/macros -H hmm7/hmmdefs -i aligned.mlf -m -t 250.0 150.0 1000.0 -y lab -a -I words.mlf -S train.scp dict monophones1> HVite_log

By Philip - 3/10/2010 - 1 Replies As i executed HVite in step 8 , the error is as follow

By Saran - 2/15/2010 - 1 Replies While executing the HVite command of step 8. I get the following error

By RedCisc - 7/3/2009 - 2 Replies I get either +3219 Bad Switch END; must be single letter or +8220 LatticeFromLabels: Word SENT-END not defined in dictionary

By tpavelka - 4/23/2009 - 5 Replies It may happen (usually when there is a big mismatch between the transcript and the actual speech in the recording) that no tokens reach the end of the utterance due to pruning. In that case the sentence is not included in aligned.mlf, but HTK does not report any errors. When HERest is run again, it throws an error and stops because it cannot find the transcription of the sentence in aligned.mlf.

By vkb - 10/26/2008 - 2 Replies I get the following error while re-aligning using HVite command: