Acoustic Model Discussions

Re: Error when compiling model - NewMacro: macro or model name ST_O_2_1 already exists
User: kmaclean
Date: 8/16/2007 4:16 pm

Hi Peter,

after a bit of research, here is what I found: 

1.  Creating clustered triphone "questions"

Found an excellent overview of question creation for HTK on the  ISLE (Illinois Speech and Language Engineering) site.  From Lecture 6 - HMM Refinement (Speech Recognition Tools mini-course taught by Mark Hasegawa-Johnson):

[...]  “Clustered triphones” are triphones that depend not on the phoneme labels of both neighbors, but instead, only on the class of the neighboring phones; for example, given that the neighboring phone is a vowel, /k/ might only be sensitive to whether it is a front vowel or a back vowel.

[HTK's] tree-based clustering algorithm accepts a long list of allowable phonetic class distinctions, or “questions” phrased in the following form:

QS "L_Nasal" { ng-*,n-*,m-* }

QS "R_Nasal" { *+ng,*+n,*+m }

The first “question” specifies that if the left phone is /m,n,ng/, then it is nasal, otherwise not. The second “question” specifies a similar distinction for the right phone. Obviously, there are an enormous number of possible phonetic distinctions that one might ask about. The HHEd command TB examines the statistics of the training corpus, in order to determine which of these possible questions is most useful for each phoneme.
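
To make the pattern notation concrete, here is a minimal Python sketch of how such a question splits a set of triphone names into "yes" and "no". It assumes HTK-style names of the form left-phone+right and uses Python's fnmatch module as a stand-in for HTK's own pattern matching; the function name answers_yes and the sample triphones are made up for illustration.

```python
from fnmatch import fnmatchcase

def answers_yes(triphone, patterns):
    """Return True if the triphone name matches any pattern of a question."""
    return any(fnmatchcase(triphone, p) for p in patterns)

# The two questions quoted above, written as pattern lists.
L_NASAL = ["ng-*", "n-*", "m-*"]   # left context is a nasal
R_NASAL = ["*+ng", "*+n", "*+m"]   # right context is a nasal

# Hypothetical triphone names, for illustration only.
samples = ["m-aa+t", "k-aa+n", "s-iy+k"]
for name in samples:
    print(name, answers_yes(name, L_NASAL), answers_yes(name, R_NASAL))
```

Each question therefore gives a yes/no split of the triphones; the TB command picks, at each node of the tree, whichever split the training statistics favour.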

He then goes on to describe the following steps in tree-based clustering (I am paraphrasing his description of the process ... see his original article for details):

Step 1 Generate a statistics file

Execute HERest to accumulate the statistics necessary for tree-based clustering.  This creates the statistics file mmf/triphone.stats, and a new set of models in mmf/t2/triphones:

HERest -A -s mmf/triphone.stats -I mlf/triphones.mlf -B  -C cfg/train.cfg -S scp/train.scp -H mmf/triphones -M mmf/t2 lists/triphones.txt;

Step 2 Create the HHEd edit file

Create a new file called cluster.hhed (the HHEd edit file). The first line in this new file reads in the statistics:

RO 20.0 "mmf/k01i1/triphone.stats"

Next you have many lines of questions:

QS "L_Fricative" { s-*,f-*,th-*,sh-*,z-*,v-*,dh-*,zh-* }


QS "R_Palatovelar" { *+sh,*+zh,*+y,*+k,*+g }
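
Since every phonetic class needs both a left-context and a right-context question, the QS lines can be generated rather than typed by hand. The following is a minimal Python sketch under that assumption; the function name make_questions and the CLASSES table are illustrative only, and for another language (German, say) you would have to supply your own class table.

```python
def make_questions(classes):
    """Build left- and right-context QS lines for each phonetic class."""
    lines = []
    for name, phones in classes.items():
        left = ",".join(f"{p}-*" for p in phones)    # left context:  p-*
        right = ",".join(f"*+{p}" for p in phones)   # right context: *+p
        lines.append(f'QS "L_{name}" {{ {left} }}')
        lines.append(f'QS "R_{name}" {{ {right} }}')
    return lines

# Illustrative class table using the phone sets quoted in this post.
CLASSES = {
    "Nasal": ["ng", "n", "m"],
    "Fricative": ["s", "f", "th", "sh", "z", "v", "dh", "zh"],
}

for line in make_questions(CLASSES):
    print(line)
```

Generating both lines from one phone list also keeps the "-" (left) and "+" (right) context markers from being mixed up.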

Next you have the particular states that you want to consider clustering.  This uses a series of "TB commands" to specify:

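  • a threshold controlling how far each tree is grown (the lecture describes this as the number of tokens per output leaf; in the HTK Book it is the minimum log-likelihood gain for a split, and RO sets the minimum leaf occupancy),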
  • the base name of the macros that will be tied together, and
  • the list of states or HMMs that may be tied together.

For example, the following command tells HTK to consider how best to cluster the second states of all of the different triphones based on the gender-dependent phone aa_m:

TB 100.0 "aa_mS2" {(aa_m,*-aa_m,aa_m+*,*-aa_m+*).state[2]}
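
You need one TB command per phone and per state, so these lines are usually generated by a script. Here is a minimal Python sketch of that idea. It assumes three emitting states per model (HTK states 2 to 4), which the lecture excerpt does not state explicitly; the function name make_tb_commands and the second phone iy_m are hypothetical.

```python
def make_tb_commands(phones, threshold=100.0, states=(2, 3, 4)):
    """Build one TB command per (phone, state) pair.

    Assumes 3 emitting states (HTK states 2..4); adjust `states`
    if your model topology differs.
    """
    lines = []
    for p in phones:
        for s in states:
            lines.append(
                f'TB {threshold} "{p}S{s}" '
                f'{{({p},*-{p},{p}+*,*-{p}+*).state[{s}]}}'
            )
    return lines

# "aa_m" is the phone from the example above; "iy_m" is made up.
for line in make_tb_commands(["aa_m", "iy_m"]):
    print(line)
```

The first generated line is the same as the aa_m example above; the macro base name ("aa_mS2") just has to be unique for each phone/state pair.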

The TB commands "grow the trees".

The AU command actually creates the new clustered-triphone acoustic models, by merging models from the input triphone list (in this case lists/triphones.txt).

Finally, the ST command writes out the trees, and the CO command is used to write out a list of the new clustered-triphone models:

ST lists/triphone_trees.txt

CO lists/clustered.txt
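
Putting the pieces of the edit file together in the order described above (RO, then QS, then TB, then AU, ST and CO) could look like the following Python sketch. This is only an illustration: the function name build_edit_script is made up, the stats path in the usage lines is the one written by the HERest command in Step 1 (an assumption, since the RO example above uses a different path), and the AU argument is assumed to be the triphone list mentioned above.

```python
def build_edit_script(stats, questions, tb_commands, hmm_list, trees, out_list):
    """Assemble the text of an HHEd edit file from its parts."""
    lines = [f'RO 20.0 "{stats}"']      # read the statistics file
    lines += questions                   # QS lines
    lines += tb_commands                 # TB lines ("grow the trees")
    lines.append(f"AU {hmm_list}")       # assumed argument: the triphone list
    lines.append(f"ST {trees}")          # write out the trees
    lines.append(f"CO {out_list}")       # write out the clustered model list
    return "\n".join(lines) + "\n"

script = build_edit_script(
    stats="mmf/triphone.stats",          # assumption: path from Step 1
    questions=['QS "L_Nasal" { ng-*,n-*,m-* }',
               'QS "R_Nasal" { *+ng,*+n,*+m }'],
    tb_commands=['TB 100.0 "aa_mS2" {(aa_m,*-aa_m,aa_m+*,*-aa_m+*).state[2]}'],
    hmm_list="lists/triphones.txt",
    trees="lists/triphone_trees.txt",
    out_list="lists/clustered.txt",
)
print(script, end="")
```

A real edit file would of course contain the full question set and a TB line for every phone and state; you would write the returned text to the edit file you pass to HHEd.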

Step 3 Run HHEd

This step executes the commands in your edit file and writes the resulting models to the file mmf/clustered:

HHEd -A -B -T 1 -H mmf/triphones -w mmf/clustered ed/cluster.hhed lists/triphones.txt

2. German Pronunciation Lexicon

The Phonetik BAS (Bavarian Archive for Speech Signals) has a pronunciation lexicon called PHONOLEX.  The web page has a link to Bas-Sampa, which includes phoneme groupings.  I think these could serve as a starting point for creating HTK "questions" (tree clusters) for German.

Hope this helps, 


--- (Edited on 8/16/2007 5:16 pm [GMT-0400] by kmaclean) ---

--- (Edited on 5/29/2015 3:25 pm [GMT-0400] by kmaclean) ---

Re: Error when compiling model - NewMacro: macro or model name ST_O_2_1 already exists
User: Peter Grasch
Date: 8/20/2007 4:14 am
Hi Ken!

First of all: thanks for putting all that effort into it! I had already read that the HTK questions are language-specific, but I was hoping someone else had already done the work :)

PHONOLEX seems interesting, but as it seems to be commercial I'd rather stick with the BOMP dict. if possible. We'll look into creating the questions ourselves, but it seems like a difficult and time-consuming task...

Thanks again for your time!

-- Peter

--- (Edited on 8/20/2007 4:14 am [GMT-0500] by Visitor) ---