Usually, the first step in building the Pronunciation Dictionary is to create a sorted list of the words contained in your Grammar, one per line, with pronunciations (the phonemes that make up a word). With our current example, it is easy to create an initial one by hand (see Initial Pronunciation Dictionary).
However, for HTK to be able to compile your speech audio and transcriptions into an Acoustic Model, it requires a phonetically balanced Pronunciation Dictionary with at the very least 30-40 'sentences' of 8-10 words each. If your Grammar has fewer sentences/words than this (as ours does in this tutorial), or if your Grammar is not phonetically balanced (some phonemes occur only once or twice), then you need to add words to make sure there are at least 3-5 occurrences of each phoneme in your Pronunciation Dictionary.
Therefore, for this tutorial, we will add extra words to our Pronunciation Dictionary so that HTK can compile an Acoustic Model. Remember, we are only trying to reach the minimum number of pronunciation dictionary entries that will permit HTK to compile - creating an Acoustic Model that produces consistent recognition results requires many more entries, and corresponding speech audio.
Tutorial
To create a pronunciation dictionary in HTK we will follow these steps:
create a prompts file - the list of words we will record in the next Step;
derive a wlist file from the prompts file - a sorted list of the unique words that appear in the prompts file;
create the pronunciation dictionary - by adding pronunciation information to the words in wlist.
prompts file
First we need to create a prompts file that includes our Grammar words and the additional dictionary words required to create a phonetically balanced dictionary. Each line of this file gives the name of an audio file and the words that will be recorded into it. You will make these recordings in Step 3.
Go to the 'voxforge/manual' folder you created in your home folder and create a file called 'prompts' containing the following:
*/sample1 DIAL ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE OH ZERO
*/sample2 DIAL ONE THREE FIVE SEVEN NINE ZERO TWO FOUR SIX EIGHT OH
*/sample3 DIAL ZERO NINE SEVEN FIVE THREE ONE OH EIGHT SIX FOUR TWO
*/sample4 DIAL ONE ONE TWO TWO THREE THREE FOUR FOUR FIVE FIVE
*/sample5 DIAL SIX SIX SEVEN SEVEN EIGHT EIGHT NINE NINE OH OH ZERO ZERO
*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG
*/sample7 PHONE STEVE CALL STEVE PHONE YOUNG CALL YOUNG
*/sample8 PHONE PHONE STEVE STEVE CALL CALL YOUNG YOUNG
*/sample9 MEASURE LEISURE AND LEISURE MEASURE
*/sample10 COMPLAIN CHAMPLAIN AIRPLANE ELAINE EXPLAIN
*/sample11 BOOKENDS KENNEL KENNETH KENYA WEEKEND
*/sample12 BELT BELOW BEND AEROBIC DASHBOARD DATABASE
*/sample13 GATEWAY GATORADE GAZEBO AFGHAN AGAINST AGATHA
*/sample14 ABALON ABDOMINALS BODY ABOLISH
*/sample15 ABOUNDING ABOUT ACCOUNT ALLENTOWN
*/sample16 ACHIEVE ACTUAL ACUPUNCTURE ADVENTURE
*/sample17 ALGORITHM ALTHOUGH ALTOGETHER ANOTHER
*/sample18 BATTLE BEATLE LITTLE METAL
*/sample19 BITTEN BLATANT BRIGHTEN BRITAIN
*/sample20 BROOKHAVEN HOOD BROUHAHA BULLHEADS
*/sample21 BUSBOYS CHOICE COILS COIN
*/sample22 COLLECTION COLORATION COMBINATION COMMERCIAL
*/sample23 MIDDLE NEEDLE POODLE SADDLE
*/sample24 ALRIGHT ARTHRITIS BRIGHT COPYRIGHT CRITERIA RIGHT
*/sample25 COUPLE CRADLE CRUMBLE
*/sample26 CUBA CUBE CUMULATIVE
*/sample27 CURING CURLING CYCLING
*/sample28 CYNTHIA DANFORTH DEPTH
*/sample29 DIGEST DIGITAL DILIGENT
*/sample30 AMNESIA ASIA AVERSION BEIGE BEIJING
*/sample31 HELP HELLO HELMET HELPLESS AHEAD HELP
The first column of the prompts file contains the name of the audio file to be created, and the following columns contain the text transcription of what is to be recorded in that audio file.
wlist file
The HTK Perl script prompts2wlist takes the prompts file you just created, removes the file name in the first column, and prints each word on its own line into a word list file (wlist).
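As a rough illustration of what this step produces, the following Python sketch derives a sorted list of unique words from prompts-style lines. It is not the prompts2wlist script itself; the function name `prompts_to_wlist` and the two sample lines are chosen here for illustration, and the sort order is assumed to be plain ASCII ordering.

```python
def prompts_to_wlist(prompt_lines):
    """Return the sorted unique words found after the first column of each line."""
    words = set()
    for line in prompt_lines:
        fields = line.split()
        # fields[0] is the audio file name (e.g. "*/sample6"); the rest are words.
        words.update(fields[1:])
    return sorted(words)


# Two lines taken from the prompts file above.
sample = [
    "*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG",
    "*/sample9 MEASURE LEISURE AND LEISURE MEASURE",
]
wlist = prompts_to_wlist(sample)
print("\n".join(wlist))
```

To run this over the real file, the lines could be read from 'prompts' and the result written to 'wlist', one word per line.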
You should already have a folder called 'HTK_scripts' in your 'voxforge' directory. Confirm that the prompts2wlist script exists there. Then, from your 'voxforge/manual' directory, run the script with prompts as its input and wlist as its output.
Next, you need to manually add the following entries to your wlist file (in sorted order):
SENT-END
SENT-START
These are HTK internal entries required for creation of the Acoustic
Model, and for processing of the Acoustic Model by Julius.
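If you prefer not to edit the file by hand, the insertion can be sketched in Python with the standard `bisect` module, which keeps a sorted list sorted. The helper name `add_sentence_markers` and the short word list are illustrative only, and the sketch assumes the wlist is sorted in plain ASCII order.

```python
import bisect


def add_sentence_markers(wlist):
    """Return a copy of a sorted word list with SENT-END and SENT-START inserted in order."""
    result = list(wlist)
    for marker in ("SENT-END", "SENT-START"):
        if marker not in result:
            # insort places the marker at the position that keeps the list sorted.
            bisect.insort(result, marker)
    return result


words = ["SADDLE", "SEVEN", "SIX", "STEVE"]
with_markers = add_sentence_markers(words)
print(with_markers)
```

Both entries land between SADDLE and SEVEN, because 'N' sorts before 'V'.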
Your file should look like this:
wlist
pronunciation dictionary
The next step is to add pronunciation information (i.e. the phonemes that make up the word) to each of the words in the wlist file, thus creating a Pronunciation Dictionary. HTK uses the HDMan command to go through the wlist file, look up the pronunciation of each word in a separate lexicon file, and write the result to a Pronunciation Dictionary.
First you need to create the global.ded script in your 'voxforge/manual' folder (default script used by HDMan), which contains:
AS sp
RS cmu
MP sil sil sp
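This script appends the short-pause phone 'sp' to each pronunciation (AS sp), removes CMU-style stress markings (RS cmu), and merges any 'sil sp' sequence into a single 'sil' (MP sil sil sp). See the HTK book for full details of these commands.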
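To make the effect of these three edit commands more concrete, here is a small Python sketch that mimics them on a single pronunciation. It only imitates the behaviour described in the HTK book and is not how HDMan is implemented; the function name `apply_global_ded` and the stress-marked pronunciation used as input are made up for illustration.

```python
import re


def apply_global_ded(phones):
    """Mimic AS sp, RS cmu and MP sil sil sp on one list of phones."""
    # AS sp: append the short-pause phone to the pronunciation.
    phones = list(phones) + ["sp"]
    # RS cmu: strip CMU-style stress digits (e.g. "ay1" -> "ay").
    phones = [re.sub(r"\d+$", "", p) for p in phones]
    # MP sil sil sp: replace each "sil sp" sequence with a single "sil".
    merged = []
    for p in phones:
        if p == "sp" and merged and merged[-1] == "sil":
            continue
        merged.append(p)
    return merged


# Hypothetical stress-marked input, for illustration only.
print(apply_global_ded(["d", "ay1", "ax0", "l"]))
# An entry whose only phone is "sil" keeps just "sil".
print(apply_global_ded(["sil"]))
```

The second call shows why a silence-only entry does not end up as 'sil sp': the appended 'sp' is merged away again.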
Create a new directory called 'lexicon' in your 'voxforge' folder. Create a new file called voxforge_lexicon in your 'voxforge/lexicon' folder, and copy the following into it:
voxforge_lexicon
(this is a modified version of the Pronunciation Dictionary included with the ISIP Switchboard corpus). Execute the HDMan command from your 'voxforge/manual' directory. Its log reports phone usage counts such as the following:
New Phone Usage Counts
---------------------
1. ae : 18
2. b : 32
3. ax : 44
4. l : 42
5. aa : 9
6. n : 39
7. sp : 112
8. d : 26
9. m : 13
10. ih : 33
11. z : 7
12. sh : 7
13. aw : 4
14. ng : 7
15. t : 32
16. k : 32
17. ch : 5
18. iy : 14
19. v : 8
20. uw : 8
21. y : 8
22. p : 11
23. ah : 8
24. er : 9
25. eh : 23
26. r : 25
27. ow : 11
28. f : 5
29. g : 8
30. s : 15
31. th : 7
32. hh : 10
33. ey : 20
34. dh : 4
35. ao : 4
36. ay : 12
37. zh : 6
38. el : 6
39. jh : 4
40. en : 4
41. uh : 5
42. oy : 4
43. w : 3
44. sil : 2
Dictionary dict created
Reviewing this log will not conclusively determine whether you have a phonetically balanced pronunciation dictionary (with a grammar this small, certain phones may be missing altogether), but it is a good place to start.
For HTK to compile your Acoustic Model, you need to make sure that you have (at the very least) 3 to 5 usage counts for each phone. If there are phones that have only one or two occurrences, you must add words that use these phones to your prompts file. You can search through the lexicon file for the phones you need, and then include words that contain them.
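The check described above can be sketched in Python. The sketch parses count lines in the format shown in the log, lists phones below a chosen minimum, and searches lexicon lines for words containing a given phone. The function names are illustrative. Two things are assumptions and not something this tutorial states: that 'sil' and 'sp' should be left out of the check, and that each lexicon line has the layout WORD [WORD] phone phone ... (the two lexicon lines below are hypothetical).

```python
import re

# Matches log lines such as "43. w : 3".
COUNT_LINE = re.compile(r"^\s*\d+\.\s+(\S+)\s*:\s*(\d+)\s*$")


def parse_phone_counts(log_text):
    """Return {phone: count} for every usage-count line in the log text."""
    counts = {}
    for line in log_text.splitlines():
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def scarce_phones(counts, minimum=3, ignore=("sil", "sp")):
    """Return phones used fewer than `minimum` times (ignoring sil/sp is an assumption)."""
    return sorted(p for p, n in counts.items() if n < minimum and p not in ignore)


def words_with_phone(lexicon_lines, phone):
    """Return words whose pronunciation contains `phone` (assumed layout: WORD [WORD] phones)."""
    hits = []
    for line in lexicon_lines:
        fields = line.split()
        if len(fields) > 2 and phone in fields[2:]:
            hits.append(fields[0])
    return hits


log_excerpt = """\
New Phone Usage Counts
---------------------
41. uh : 5
42. oy : 4
43. w : 3
44. sil : 2
"""
counts = parse_phone_counts(log_excerpt)
print(scarce_phones(counts))             # phones below 3 uses
print(scarce_phones(counts, minimum=5))  # phones below 5 uses

# Hypothetical lexicon lines in the assumed layout.
lexicon = ["COIN [COIN] k oy n", "CUBE [CUBE] k y uw b"]
print(words_with_phone(lexicon, "oy"))
```

Raising `minimum` from 3 to 5 gives a stricter check, in line with the 3 to 5 range mentioned above.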
Creating Monophones0 File
You also need another monophones file for a later Step. Simply copy the "monophones1" file to a new "monophones0" file in your 'voxforge/manual' directory and then remove the short-pause "sp" entry from monophones0.
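The same copy-and-remove step can be expressed in a few lines of Python. This is only a sketch: the function name `make_monophones0` is illustrative, and it assumes monophones1 holds one phone per line.

```python
def make_monophones0(monophones1_lines):
    """Return the phone list without the short-pause 'sp' entry (and without blank lines)."""
    phones = (line.strip() for line in monophones1_lines)
    return [p for p in phones if p and p != "sp"]


# A shortened phone list, for illustration.
monophones1 = ["ae", "b", "sp", "sil"]
monophones0 = make_monophones0(monophones1)
print(monophones0)
```

Only an entry that is exactly 'sp' is removed, so phones that merely contain those letters are kept.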