Step 2 - Pronunciation Dictionary

Background - Phonetically Balanced Dictionary

Usually, the first step in building the Pronunciation Dictionary is to create a sorted list of the words contained in your Grammar, one per line, with their pronunciations (the phonemes that make up each word).  With our current example, it is easy to create an initial one by hand (see Initial Pronunciation Dictionary).

However, for HTK to be able to compile your speech audio and transcriptions into an Acoustic Model, it requires a phonetically balanced Pronunciation Dictionary with at the very least 30-40 'sentences' of 8-10 words each.  If your Grammar has fewer sentences or words than this (as ours does in this tutorial), or if your Grammar is not phonetically balanced (that is, some phonemes occur only once or twice), then you need to add words until every phoneme occurs at least 3-5 times in the Pronunciation Dictionary.

Therefore, for this tutorial, we will add words to our Pronunciation Dictionary so that HTK can compile an Acoustic Model.  Remember, we are only aiming for the minimum number of pronunciation dictionary entries that will let HTK compile; an Acoustic Model that produces consistent recognition results requires many more entries, and the corresponding speech audio.

Tutorial 

To create a pronunciation dictionary in HTK we will follow these steps:

  • create a prompts file, which is the list of words we will record in the next Step;
  • derive a wlist file from the prompts file; the wlist file is a sorted list of the unique words that appear in the prompts file; and
  • create the pronunciation dictionary, which is done by adding pronunciation information to the words in wlist.

prompts file 

First we need to create a prompts file that includes our Grammar words and the additional words required to create a phonetically balanced dictionary.  Each line of this file contains the name of an audio file and the words to be recorded in it.  You will make these recordings in Step 3.

Go to the 'voxforge/manual' folder you created in your home folder and create a file called 'prompts' containing the following:

*/sample1 DIAL ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE OH ZERO
*/sample2 DIAL ONE THREE FIVE SEVEN NINE ZERO TWO FOUR SIX EIGHT OH
*/sample3 DIAL ZERO NINE SEVEN FIVE THREE ONE OH EIGHT SIX FOUR TWO
*/sample4 DIAL ONE ONE TWO TWO THREE THREE FOUR FOUR FIVE FIVE
*/sample5 DIAL SIX SIX SEVEN SEVEN EIGHT EIGHT NINE NINE OH OH ZERO ZERO
*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG
*/sample7 PHONE STEVE CALL STEVE PHONE YOUNG CALL YOUNG
*/sample8 PHONE PHONE STEVE STEVE  CALL CALL YOUNG YOUNG
*/sample9 MEASURE LEISURE AND LEISURE MEASURE
*/sample10 COMPLAIN CHAMPLAIN AIRPLANE ELAINE EXPLAIN
*/sample11 BOOKENDS KENNEL KENNETH KENYA WEEKEND
*/sample12 BELT BELOW BEND AEROBIC DASHBOARD DATABASE
*/sample13 GATEWAY GATORADE GAZEBO AFGHAN AGAINST AGATHA
*/sample14 ABALON ABDOMINALS BODY ABOLISH
*/sample15 ABOUNDING ABOUT ACCOUNT ALLENTOWN
*/sample16 ACHIEVE ACTUAL ACUPUNCTURE ADVENTURE
*/sample17 ALGORITHM ALTHOUGH ALTOGETHER ANOTHER
*/sample18 BATTLE BEATLE LITTLE METAL
*/sample19 BITTEN BLATANT BRIGHTEN BRITAIN
*/sample20 BROOKHAVEN HOOD BROUHAHA BULLHEADS
*/sample21 BUSBOYS CHOICE COILS COIN
*/sample22 COLLECTION COLORATION COMBINATION COMMERCIAL
*/sample23 MIDDLE NEEDLE POODLE SADDLE
*/sample24 ALRIGHT ARTHRITIS BRIGHT COPYRIGHT CRITERIA RIGHT
*/sample25 COUPLE CRADLE CRUMBLE
*/sample26 CUBA CUBE CUMULATIVE
*/sample27 CURING CURLING CYCLING
*/sample28 CYNTHIA DANFORTH DEPTH
*/sample29 DIGEST DIGITAL DILIGENT
*/sample30 AMNESIA ASIA AVERSION BEIGE BEIJING
*/sample31 HELP HELLO HELMET HELPLESS AHEAD HELP

The first column of the prompts file contains the name of the audio file to be created, and the following columns contain the text transcription of what is to be recorded in that audio file.

wlist file 

The HTK Perl script prompts2wlist takes the prompts file you just created, removes the file name in the first column, and prints each word on its own line into a word list file (wlist).  You should already have a folder in your 'voxforge' directory called 'HTK_scripts'.  Confirm that the prompts2wlist script exists there.  Then, from your 'voxforge/manual' directory, execute the following:

$ perl ../HTK_scripts/prompts2wlist prompts wlist

This will create the wlist file.
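
As a rough illustration of what this step produces, the Python sketch below mimics the behaviour described above: it drops the first column of each prompts line and collects the remaining words into a sorted list of unique entries. It is not the prompts2wlist script itself; the function name prompts_to_wlist is hypothetical, and the plain character-order sort is an assumption about how the real script orders words.

```python
def prompts_to_wlist(prompt_lines):
    """Return the sorted unique words found after the first column of each line."""
    words = set()
    for line in prompt_lines:
        fields = line.split()
        # fields[0] is the audio file name (e.g. */sample6);
        # everything after it is the transcription.
        words.update(fields[1:])
    return sorted(words)


# Two lines taken from the prompts file above.
sample = [
    "*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG",
    "*/sample25 COUPLE CRADLE CRUMBLE",
]
for word in prompts_to_wlist(sample):
    print(word)
```

Repeated words such as STEVE and YOUNG appear only once in the result, which is the property the wlist file needs.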

Next, you need to manually add the following entries to your wlist file (in sorted order):

SENT-END
SENT-START  

These are HTK internal entries required for the creation of the Acoustic Model, and for processing of the Acoustic Model by Julius.  Your file should look like this: wlist
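
If you are unsure where "in sorted order" places these two entries, the following minimal sketch may help. It assumes that a plain character-order sort (as done by Python's sorted) matches the order expected here; the function name add_sentence_markers is hypothetical.

```python
def add_sentence_markers(wlist):
    """Return a sorted word list that also contains SENT-END and SENT-START."""
    return sorted(set(wlist) | {"SENT-END", "SENT-START"})


# A few words from the tutorial's wlist, to show where the markers land.
for word in add_sentence_markers(["NINE", "SEVEN", "SIX", "STEVE"]):
    print(word)
```

Under that assumption both entries fall between NINE and SEVEN, because "SENT" sorts before "SEVE".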

pronunciation dictionary

The next step is to add pronunciation information (i.e. the phonemes that make up the word) to each of the words in the wlist file, thus creating a Pronunciation Dictionary.  HTK's HDMan command goes through the wlist file, looks up the pronunciation of each word in a separate lexicon file, and writes the result to a Pronunciation Dictionary.

First you need to create the global.ded script (the default edit script used by HDMan) in your 'voxforge/manual' folder, containing:

AS sp
RS cmu
MP sil sil sp

These commands append a short pause (sp) to the end of every pronunciation, remove CMU-style stress markings, and merge each 'sil sp' sequence into a single 'sil'.  See the HTK book for details of what these commands mean.

Create a new directory called 'lexicon' in your 'voxforge' folder.  Create a new file called voxforge_lexicon in your 'voxforge/lexicon' folder, and copy the contents of the following file into it: voxforge_lexicon (this is a modified version of the Pronunciation Dictionary included with the ISIP Switchboard corpus).  Execute the HDMan command from your 'voxforge/manual' directory as follows:

$ HDMan -A -D -T 1 -m -w wlist -n monophones1 -i -l dlog dict ../lexicon/voxforge_lexicon

The HDMan command above produces two files:

  • dict - the pronunciation dictionary for your Grammar and the additional words required to create a phonetically balanced Acoustic Model; and
  • monophones1 - a list of the phones used in dict.

Confirming Phonetically Balanced Dictionary

To help you determine whether your dictionary is phonetically balanced, review the output from your HDMan command in the 'dlog' log file:

WARNING: no script file ../lexicon/voxforge_lexicon.ded

Dictionary Usage Statistics
---------------------------
  Dictionary    TotalWords WordsUsed  TotalProns PronsUsed
voxforge_lex     27380        114      27431        114
        dict       114        114        114        114

114 words required, 0 missing

New Phone Usage Counts
---------------------
  1. ae    :    18
  2. b     :    32
  3. ax    :    44
  4. l     :    42
  5. aa    :     9
  6. n     :    39
  7. sp    :   112
  8. d     :    26
  9. m     :    13
 10. ih    :    33
 11. z     :     7
 12. sh    :     7
 13. aw    :     4
 14. ng    :     7
 15. t     :    32
 16. k     :    32
 17. ch    :     5
 18. iy    :    14
 19. v     :     8
 20. uw    :     8
 21. y     :     8
 22. p     :    11
 23. ah    :     8
 24. er    :     9
 25. eh    :    23
 26. r     :    25
 27. ow    :    11
 28. f     :     5
 29. g     :     8
 30. s     :    15
 31. th    :     7
 32. hh    :    10
 33. ey    :    20
 34. dh    :     4
 35. ao    :     4
 36. ay    :    12
 37. zh    :     6
 38. el    :     6
 39. jh    :     4
 40. en    :     4
 41. uh    :     5
 42. oy    :     4
 43. w     :     3
 44. sil   :     2

Dictionary dict created


Reviewing this log will not conclusively determine whether your pronunciation dictionary is phonetically balanced (a very small Grammar may be missing certain phones altogether, and missing phones do not appear in the counts), but it is a good place to start.

For HTK to compile your Acoustic Model, you need (at the very least) 3 to 5 usage counts for each phone.  If any phone falls short of this (for example, a phone with only one occurrence), you must add words that use it to your prompts file, and then repeat the wlist and HDMan steps above.  You can search the lexicon file for the phones you need, and then add words that contain them.
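
The check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the count lines are assumed to look like those in the dlog listing above, and each lexicon line is assumed to hold a word, an optional bracketed output symbol, and then the phones. The three lexicon lines in the example are made up to show that assumed format; they are not copied from voxforge_lexicon.

```python
import re

# Matches lines such as "  13. aw    :     4" in the
# "New Phone Usage Counts" section of dlog.
COUNT_LINE = re.compile(r"^\s*\d+\.\s+(\S+)\s*:\s*(\d+)\s*$")


def phone_counts(dlog_lines):
    """Map each phone name to its usage count."""
    counts = {}
    for line in dlog_lines:
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def rare_phones(counts, minimum=3):
    """Return the phones used fewer than `minimum` times, in log order."""
    return [phone for phone, n in counts.items() if n < minimum]


def words_with_phone(lexicon_lines, phone):
    """Yield lexicon words whose pronunciation contains `phone`.

    Assumed line format: WORD, an optional [OUTPUT] symbol, then phones.
    """
    for line in lexicon_lines:
        fields = line.split()
        if not fields:
            continue
        phones = [f for f in fields[1:] if not f.startswith("[")]
        if phone in phones:
            yield fields[0]


# Last three count lines of the dlog listing above.
dlog = [
    " 42. oy    :     4",
    " 43. w     :     3",
    " 44. sil   :     2",
]
print(rare_phones(phone_counts(dlog)))

# Made-up lexicon lines in the assumed format.
lexicon = ["BEE [BEE] b iy", "BOY [BOY] b oy", "TOY [TOY] t oy"]
print(list(words_with_phone(lexicon, "oy")))
```

The minimum is a parameter because the tutorial gives a range (3 to 5) rather than a single value; raise it if you want a wider safety margin.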

Creating Monophones0 File

You also need another monophones file for a later Step.  Simply copy the "monophones1" file to a new "monophones0" file in your 'voxforge/manual' directory and then remove the short-pause "sp" entry from monophones0.
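
The same copy-and-remove step can be expressed as a short Python sketch, assuming monophones1 holds one phone per line. The function name make_monophones0 is hypothetical, and the list in the example is a shortened stand-in for the real file contents.

```python
def make_monophones0(monophones1):
    """Return the phone list without the short-pause entry "sp"."""
    return [phone for phone in monophones1 if phone != "sp"]


# Shortened stand-in for the contents of monophones1, one phone per entry.
phones = ["ae", "b", "ax", "sp", "d", "sil"]
for phone in make_monophones0(phones):
    print(phone)
```

Only an entry that is exactly "sp" is dropped, so phones that merely contain those letters, and the "sil" entry, are left untouched.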


Comments


HDMAN
By erica - 4/15/2008 - 3 Replies

Hi all,

I'm not able to use the HDMan tool. When I try to run the command, I receive a message saying "bash: -A: command not found". I get the same for all the other commands.

What did I miss?

Thanks,

Fetching voxforge_lexicon
By colbec - 3/3/2008

From Firefox, right-click the link and save the file locally. If you left-click, let the text fill your browser window, and then cut and paste it, the next step fails with '114 words missing'.

How to create the prompts?
By Shawn - 10/8/2007 - 1 Replies

Step 2 gives the contents of the prompts file, so in this tutorial we can just copy them to create the required file. But if we start a new project, how do we create the prompts? Manually, or with some HTK command?

I have now created a new project. When I execute the HDMan command, the dlog file shows that many phones have ONLY one occurrence. The tutorial gives this tip: "if there are phones that only have one occurrence, you must add words that use these phones to your pronunciation dictionary.  You can search through the lexicon file for the phones you need, and then include the word that contains that phone."

Does this mean that we should add the additional words to the prompts? Or add them to the generated "dict" file? I missed that.

In my opinion, if the "prompts" file contains enough words beyond those that appear in the grammar, then the Pronunciation Dictionary may meet the requirement.