Step 7 - Fixing the Silence Models

Background 

In the last step you created HMM models that did not include an "sp" (short pause) silence model, which represents the short pauses that occur between words in normal speech.  However, you did create a "sil" silence model; sil models are typically of longer duration and represent the pauses that occur at the end of a sentence.

The HTK book says that the sp model needs to have its "emitting state tied to the centre state of the silence model".  This means that you need to create a new sp model in your hmmdefs that uses the centre state of sil, and then 'tie' the two models together so that they share that state.  For a bit of background on HMMs and states, see this example

This can be done by copying the centre state from the sil model in your hmmdefs file into a new sp model, and then running a special tool called HHEd to 'tie' the sp model to the sil model so that they share the same centre state.  The HTK book provides some background on what this means, but you need an understanding of the basics of Hidden Markov Modelling before tackling the HTK Book explanations (the University of Leeds HMM tutorial is a very good introduction to Hidden Markov Modelling).

Note: you do not need to understand HMMs to complete this tutorial.

Tutorial 

First copy the contents of the hmm3 folder to hmm4.  Then, using an editor, create a new "sp" model in hmm4/hmmdefs as follows:

  • copy and paste the “sil” model and rename the new one “sp”
  • remove states 2 and 4 from the new “sp” model (i.e. keep only the 'centre state' of the old “sil” model in the new “sp” model)
  • change <NUMSTATES> to 3
  • change the remaining <STATE> 3 to <STATE> 2
  • change <TRANSP> to 3
  • change the matrix in <TRANSP> to a 3 by 3 array
  • change the numbers in the matrix as follows:
 0.0 1.0 0.0
 0.0 0.9 0.1
 0.0 0.0 0.0

Your sp model should look something like this:

~h "sp"
<BEGINHMM>
<NUMSTATES> 3
<STATE> 2
<MEAN> 25
 -7.046570e+00 -3.262981e-01 -1.706483e+00 -1.080971e+00 -1.134529e+00 3.588506e+00 3.917166e+00 1.443405e+00 4.899211e+00 3.409961e+00 8.219168e-01 3.644213e+00 -7.641904e-02 -6.077167e-02 2.118241e-01 -8.631640e-02 3.686112e-02 8.506200e-02 -8.106526e-02 1.066912e-02 1.281262e-01 -1.437282e-01 -3.412217e-02 1.333326e-01 1.202221e-01
<VARIANCE> 25
 7.911258e+00 8.348815e+00 1.148870e+01 1.213321e+01 8.655976e+00 1.509970e+01 9.904381e+00 1.166922e+01 1.025182e+01 8.845907e+00 8.135198e+00 9.622693e+00 9.084668e-01 7.631339e-01 1.614822e+00 9.755048e-01 7.167343e-01 1.691362e+00 1.297928e+00 9.801642e-01 1.225108e+00 1.051384e+00 9.349809e-01 1.529028e+00 5.576642e-01
<GCONST> 7.411308e+01
<TRANSP> 3
 0.0 1.0 0.0
 0.0 0.9 0.1
 0.0 0.0 0.0
<ENDHMM>
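
The manual edit described above can also be scripted.  The following is a minimal sketch, not part of the original tutorial: make_sp is a hypothetical helper name, and it assumes that your hmmdefs has the layout HTK normally writes, with each keyword (~h, <STATE>, <TRANSP>, <ENDHMM>) at the start of its own line and the sil model having 5 states.

```shell
#!/bin/sh
# Sketch only: print an "sp" model built from the centre state (state 3)
# of the "sil" model found in the hmmdefs file given as $1.
# Assumption: every HTK keyword starts its own line in hmmdefs.
make_sp() {
  awk '
    # A new model starts: remember whether it is "sil".
    $1 == "~h"       { in_sil = ($2 == "\"sil\""); keep = 0; next }
    # Ignore every model except sil.
    !in_sil          { next }
    # Keep only the lines that belong to state 3.
    $1 == "<STATE>"  { keep = ($2 == 3); next }
    # The old 5 by 5 transition matrix is dropped.
    $1 == "<TRANSP>" { keep = 0; next }
    $1 == "<ENDHMM>" { in_sil = 0; next }
    keep             { body = body $0 "\n" }
    END {
      # Fail without output if no sil centre state was found.
      if (body == "") exit 1
      printf "~h \"sp\"\n<BEGINHMM>\n<NUMSTATES> 3\n<STATE> 2\n%s", body
      printf "<TRANSP> 3\n 0.0 1.0 0.0\n 0.0 0.9 0.1\n 0.0 0.0 0.0\n<ENDHMM>\n"
    }
  ' "$1"
}

# Hypothetical usage, run from the folder that holds hmm3 and hmm4:
#   cp hmm3/* hmm4/
#   make_sp hmm3/hmmdefs >> hmm4/hmmdefs
```

Reading from hmm3/hmmdefs and appending to hmm4/hmmdefs avoids reading and writing the same file at once.  Compare the result with the sp model shown above before moving on.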

Next, run the HMM editor called HHEd to "tie" the sp state to the sil centre state.  Tying means that two or more HMMs share the same set of parameters.  To do this, create the following HHEd command script, called sil.hed, in your voxforge/manual folder:

AT 2 4 0.2 {sil.transP}
AT 4 2 0.2 {sil.transP}
AT 1 3 0.3 {sp.transP}
TI silst {sil.state[3],sp.state[2]}

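The AT commands add transitions to the named transition matrices (for example, the first line adds a transition from state 2 to state 4 of sil with probability 0.2), and the last line is the "tie" command.  Next run HHEd as follows, but using the monophones1 file, which lists the sp model: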

$ HHEd -A -D -T 1 -H hmm4/macros -H hmm4/hmmdefs -M hmm5 sil.hed monophones1
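
If you prefer to create the edit script from the command line, the following is a minimal sketch under stated assumptions: write_sil_hed and has_sp are hypothetical helper names that do not come from HTK, and the check assumes that monophones1 lists one model name per line.

```shell
#!/bin/sh
# Sketch only: write the HHEd edit script from the tutorial to the path in $1.
write_sil_hed() {
  cat > "$1" <<'EOF'
AT 2 4 0.2 {sil.transP}
AT 4 2 0.2 {sil.transP}
AT 1 3 0.3 {sp.transP}
TI silst {sil.state[3],sp.state[2]}
EOF
}

# Succeed only if the model list in $1 has a line that is exactly "sp".
# Assumption: the list holds one model name per line.
has_sp() {
  grep -qx 'sp' "$1"
}

# Hypothetical usage from the voxforge/manual folder (HTK must be on PATH):
#   write_sil_hed sil.hed
#   has_sp monophones1 && HHEd -A -D -T 1 -H hmm4/macros -H hmm4/hmmdefs -M hmm5 sil.hed monophones1
```

Checking the model list first catches the case where the monophones file without sp is passed by mistake.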

The files created by this command are hmm5/macros and hmm5/hmmdefs.

Next run HERest 2 more times, this time using the monophones1 file: 

$ HERest -A -D -T 1 -C config -I phones1.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm5/macros -H hmm5/hmmdefs -M hmm6 monophones1

The files created by this command are hmm6/macros and hmm6/hmmdefs.

$ HERest -A -D -T 1 -C config -I phones1.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm6/macros -H hmm6/hmmdefs -M hmm7 monophones1

The files created by this command are hmm7/macros and hmm7/hmmdefs.
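
The two HERest passes differ only in their folder numbers, so they can be generated in a loop.  The following is a minimal sketch: herest_cmd is a hypothetical helper name, and the script only prints the commands so that you can check them before running anything.

```shell
#!/bin/sh
# Sketch only: print the HERest command for one re-estimation pass.
# $1 is the number of the input folder; output goes to the next folder.
herest_cmd() {
  src="hmm$1"
  dst="hmm$(( $1 + 1 ))"
  echo "HERest -A -D -T 1 -C config -I phones1.mlf -t 250.0 150.0 3000.0 -S train.scp -H $src/macros -H $src/hmmdefs -M $dst monophones1"
}

# Print both passes: hmm5 -> hmm6, then hmm6 -> hmm7.
for n in 5 6; do
  herest_cmd "$n"
done
```

Printing first makes it easy to confirm that each pass reads the folder written by the previous one.  To actually run the passes, pipe the output of the script to sh (HTK must be on your PATH).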

 

