Step 2 - Create Test Prompts

Julian Grammar

Copy these Julian grammar files to your test directory (you will need them to test your Acoustic Model using Julius):

  • sample.grammar
  • the '.voca' file to use depends on which dictionary the Acoustic Model was trained with:
    • If you are testing an Acoustic Model created using the How-to or Tutorial (using the ISIP Switchboard dictionary), use this file: sample.voca
    • If you are testing the VoxForge Acoustic Model adapted to your voice (using the CMU dictionary), use this file: sample.voca

(These are similar to the ones you created in the VoxForge How-to or Tutorial.)

HTK Grammar

1. Create a new file called 'gram' in your test directory, and add the following to it:

$digit= ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT | NINE | OH | ZERO;
$name = [ STEVE ] YOUNG;
( SENT-START (DIAL<$digit> | (PHONE|CALL) $name) SENT-END )

This single file is functionally the same as the Julian sample.grammar and sample.voca files listed above. HTK's format is much more compact because it does not need to include pronunciation information.
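As a rough guide to the notation used in the grammar above: `|` separates alternatives, `[ ]` marks an optional item, `{ }` means zero or more repetitions, `< >` means one or more repetitions, `( )` groups items, and `$name = ...;` defines a variable. The sketch below writes a small illustrative grammar to a scratch file; the file name `gram_notation_example` is hypothetical and chosen only so that your real 'gram' file is not overwritten. It spells out `<$digit>` as `$digit { $digit }` to show that the two forms describe the same digit strings.

```shell
# Write an illustrative grammar to a scratch file (hypothetical name),
# leaving the real 'gram' file untouched.
#   [ STEVE ]            -> STEVE is optional
#   $digit { $digit }    -> one digit, then zero or more digits
#                           (equivalent to <$digit>)
#   (PHONE|CALL)         -> either PHONE or CALL
cat > gram_notation_example <<'EOF'
$digit = ONE | TWO | THREE;
$name  = [ STEVE ] YOUNG;
( SENT-START ( DIAL $digit { $digit } | (PHONE|CALL) $name ) SENT-END )
EOF
```

If HTK is installed, this file could be passed to HParse in the same way as 'gram' to check that it parses.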

2. Run the following command to create a word network file:
$ HParse gram wdnet

This creates a wdnet file.

Next, you have 2 options:

  • use the VoxForge predefined test prompts; or
  • create your own test prompts.

Use predefined test prompts

1. Copy this file: testprompts to your 'voxforge/test' directory.

2. Convert this testprompts file to an MLF file that HTK can process. Execute the prompts2mlf script from your 'voxforge/manual' folder as follows:

$ perl ../HTK_scripts/prompts2mlf testref.mlf testprompts

This script generates a testref.mlf file.
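To make the conversion less opaque, here is a minimal awk sketch of the kind of transformation involved. It is an approximation for illustration, not the actual prompts2mlf script. It assumes each prompt line consists of a label name followed by the words, and the file names `demo_prompts` and `demo_ref.mlf` and the `*/test1` labels are hypothetical.

```shell
# Two sample prompt lines in the assumed format: label name, then words.
cat > demo_prompts <<'EOF'
*/test1 DIAL ONE TWO
*/test2 PHONE YOUNG
EOF

# Emit an MLF: the "#!MLF!#" header, then for every prompt a quoted
# ".lab" name, one word per line, and a terminating period.
awk 'BEGIN { print "#!MLF!#" }
     NF    { printf "\"%s.lab\"\n", $1
             for (i = 2; i <= NF; i++) print $i
             print "." }' demo_prompts > demo_ref.mlf

cat demo_ref.mlf
```

Each utterance in the resulting file starts with its quoted label name and ends with a line containing only a period, which is the layout HTK expects in a word-level MLF.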

3. Then go to the next step.

- or -

Create your own test prompts 

Generate Test Prompts

1. Use the HTK command HSGen to generate random test prompts as follows:

$ HSGen -l -n 50 wdnet ../lexicon/voxforge_lexicon > testprompts

This creates a file called testprompts (note: because the prompts are generated randomly, yours will differ from the sample file).

Note:
  • Any numeric prompts were manually truncated to seven digits (this is because we need much more speech audio data to make our acoustic models more robust).
  • Julius has a "Generate" command that is similar to the HTK HSGen command, but HSGen seems to generate prompts that have better coverage.

2. Add the fixtestprompts.pl script to your voxforge/test directory (note: the file downloads as 'fixtestprompts_pl.txt', so you need to rename it to 'fixtestprompts.pl').

3. Run the fixtestprompts.pl script to add a file name to the beginning of each line in the testprompts file you just created, as follows (note: you may need to make this script executable - see Cheat Sheet on the Docs page): 

$ ./fixtestprompts.pl testprompts testpromptsout

The script outputs to the testpromptsout file. 
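For readers curious about what this kind of fix-up involves, the following awk sketch prefixes each prompt with a file name. It is not the fixtestprompts.pl script itself. It assumes that `HSGen -l` starts each line with a sequence number such as `1.`, and the `*/test` prefix and the `demo_` file names are hypothetical choices for illustration.

```shell
# Sample input in the assumed HSGen -l format: a line number, then the words.
cat > demo_hsgen_out <<'EOF'
1. PHONE YOUNG
2. DIAL ONE TWO
EOF

# Strip the leading "N." (if present) and prepend a per-line file name
# built from the line number.
awk '{ sub(/^[0-9]+\.[ \t]*/, "")
       printf "*/test%d %s\n", NR, $0 }' demo_hsgen_out > demo_prompts_named

cat demo_prompts_named
```

The real script may use a different naming scheme, so compare your own testpromptsout file against the sample prompts file rather than against this sketch.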

4. Rename the testpromptsout file back to testprompts as follows:

$ mv testpromptsout testprompts

Your prompts file should look like this: testprompts.

5. Convert your testprompts file to an MLF file that HTK can process. Execute the prompts2mlf script from your 'voxforge/manual' folder as follows:

$ perl ../HTK_scripts/prompts2mlf testref.mlf testprompts

This script generates a testref.mlf file.


Comments

By mmm - 8/2/2010 - 579 Replies

Hi,

Can anyone explain to me how to create a gram file? What I do not understand are the symbols such as

()

{}

<>

I tried to understand them by reading the HTK Book, but I am still confused.

For example, say I have three words:

word1 word2 word3

word1 and word3 may occur zero or more times, and word2 always occurs.

How can I create a gram file for this?

Please help.