Step 4 - Creating the Transcription Files

Word Level Transcriptions

The HTK toolkit cannot process your prompts.txt file directly. You have two options. You can create a separate 'label' file for each line of your prompts.txt file, in the following format:

*/sample1
DIAL
ONE
TWO
THREE
FOUR
FIVE
SIX
SEVEN
EIGHT
NINE
OH
ZERO

Or you can create a Master Label File (MLF) - which is a single file that contains a label entry for each line in your prompts.txt file. This is the easiest approach, and the one we will use for this tutorial.

Download the Julia script prompts2mlf.jl to your 'voxforge/bin' directory to generate the MLF file from your prompts.txt file. Execute the prompts2mlf script from your 'voxforge/tutorial' folder as follows:

julia ../bin/prompts2mlf.jl prompts.txt words.mlf

This script generates a words.mlf file.
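
For readers curious about what this conversion involves, here is a minimal Python sketch of the same idea. It is not the prompts2mlf.jl script itself. It assumes that each prompts.txt line starts with an identifier such as */sample1 followed by the words, and that the MLF consists of a #!MLF!# header, a quoted .lab name per entry, one word per line, and a lone period closing each entry. The function name prompts_to_mlf is hypothetical.

```python
def prompts_to_mlf(prompt_lines):
    """Convert prompts.txt-style lines to the text of a word-level MLF.

    Assumed input line format:  */sample1 DIAL ONE TWO
    """
    out = ["#!MLF!#"]                    # MLF header line
    for line in prompt_lines:
        fields = line.split()
        if not fields:                   # skip blank lines
            continue
        name, words = fields[0], fields[1:]
        out.append(f'"{name}.lab"')      # quoted label-file pattern
        out.extend(words)                # one word per line
        out.append(".")                  # a lone period ends the entry
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(prompts_to_mlf(["*/sample1 DIAL ONE TWO THREE"]), end="")
```

Comparing the output of a sketch like this with the words.mlf produced by the real script is one way to confirm that your prompts.txt is laid out as expected.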

Phone Level Transcriptions

Next you need to execute the HLEd command to expand the Word Level Transcriptions to Phone Level Transcriptions - i.e. replace each word with its phonemes, and put the result in a new Phone Level Master Label File. HLEd does this by reviewing each word in the MLF file, looking up the phones that make up that word in the dict file you created earlier, and outputting the result to a file called phones0.mlf (which will not have short pauses ("sp"s) after each word phone group).
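
Because every word in the MLF has to be looked up in the dict file, it can help to check for missing words before running HLEd. The following Python sketch shows one way to do that. It assumes the dict file has one entry per line with the word as the first whitespace-separated field, and that the MLF contains only the header, quoted entry names, words, and closing periods. The function names are hypothetical and this is not part of HTK.

```python
def load_dict_words(dict_lines):
    """Return the set of head words in an HTK-style dictionary."""
    words = set()
    for line in dict_lines:
        fields = line.split()
        if fields:
            words.add(fields[0])         # first field is the word itself
    return words


def words_in_mlf(mlf_lines):
    """Collect word labels, skipping the header, entry names and '.' lines."""
    found = set()
    for line in mlf_lines:
        token = line.strip()
        if not token or token == "." or token == "#!MLF!#":
            continue
        if token.startswith('"'):        # quoted entry name, not a word
            continue
        found.add(token)
    return found


def missing_words(mlf_lines, dict_lines):
    """Return the sorted list of MLF words that have no dictionary entry."""
    return sorted(words_in_mlf(mlf_lines) - load_dict_words(dict_lines))
```

For example, calling missing_words(open("words.mlf"), open("dict")) from the 'voxforge/tutorial' folder would list any words that still need a pronunciation.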

First, create the mkphones0.led edit script in your 'voxforge/tutorial' folder:

EX
IS sil sil
DE sp

(note: remember to include a blank line at the end of this script)
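
The three commands in this edit script can be pictured with a small Python sketch: EX expands each word to the phones of its first dictionary pronunciation, IS sil sil inserts a sil label at the start and end of the utterance, and DE sp deletes every sp label. This is only an illustration of the effect, not how HLEd is implemented. The function name apply_edit_script and the two lexicon entries (including their trailing sp) are assumptions made for the example.

```python
# Illustrative lexicon: word -> phones of its first pronunciation.
# These entries are assumed for the example, not copied from a real dict.
LEXICON = {
    "DIAL": ["d", "ay", "ah", "l", "sp"],
    "ONE": ["w", "ah", "n", "sp"],
}


def apply_edit_script(words, lexicon, delete_sp=True):
    """Imitate EX, IS sil sil and (optionally) DE sp on one utterance."""
    phones = []
    for word in words:                       # EX: word -> its phones
        phones.extend(lexicon[word])
    phones = ["sil"] + phones + ["sil"]      # IS sil sil
    if delete_sp:                            # DE sp
        phones = [p for p in phones if p != "sp"]
    return phones


if __name__ == "__main__":
    print(" ".join(apply_edit_script(["DIAL", "ONE"], LEXICON)))
```

Calling the same function with delete_sp=False leaves the sp labels in place, which is the effect of an edit script without the DE sp line.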

Then execute the following HLEd command from your 'voxforge/tutorial' folder:

Linux:

$ HLEd -A -D -T 1 -l '*' -d dict -i phones0.mlf mkphones0.led words.mlf

Windows:

C:>HLEd -A -D -T 1 -l * -d dict -i phones0.mlf mkphones0.led words.mlf

This creates the phones0.mlf file.

Next, we need to create a second file, phones1.mlf (which will include short pauses ("sp") after each word phone group). First create the mkphones1.led edit script in your 'voxforge/tutorial' folder as follows:

EX
IS sil sil

(note: remember to include a blank line at the end of this script)

Then run the HLEd command again from your 'voxforge/tutorial' folder as follows:

Linux:

$ HLEd -A -D -T 1 -l '*' -d dict -i phones1.mlf mkphones1.led words.mlf

Windows:

C:>HLEd -A -D -T 1 -l * -d dict -i phones1.mlf mkphones1.led words.mlf

This creates the phones1.mlf file.
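
As a sanity check, phones1.mlf should match phones0.mlf once its sp labels are ignored, and it should contain at least one sp. The Python sketch below tests exactly that. It assumes both files hold plain label names with no time stamps, in the usual MLF layout of a header, quoted entry names, labels, and closing periods. The function names parse_mlf and differs_only_by_sp are hypothetical.

```python
def parse_mlf(text):
    """Parse MLF text into a dict of {entry name: [labels]}."""
    entries, name = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue
        if line.startswith('"'):             # start of a new entry
            name = line.strip('"')
            entries[name] = []
        elif line == ".":                    # end of the current entry
            name = None
        elif name is not None:
            entries[name].append(line)
    return entries


def differs_only_by_sp(phones0_text, phones1_text):
    """True if phones1 has sp labels and equals phones0 without them."""
    p0 = parse_mlf(phones0_text)
    p1 = parse_mlf(phones1_text)
    stripped = {n: [p for p in labels if p != "sp"]
                for n, labels in p1.items()}
    has_sp = any("sp" in labels for labels in p1.values())
    return p0 == stripped and has_sp
```

If the check returns False because no sp labels were found, a likely cause is that the dictionary pronunciations do not end in sp, or that the wrong edit script was passed to HLEd.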

