I am going to try to adapt the VoxForge Acoustic Model and create my own English Language Model, but I have some questions.
Step 1: Language Model
Obtain pre-compiled corpora, mainly from NLTK
The NLTK book also references additional corpora that I'm going to try and hunt down, and the HTK book ships with 50 volumes of Sherlock Holmes. Any recommendations for useful pre-compiled corpora?
I also want to attempt compiling my own corpora from unstructured text.
Lastly, is accuracy ultimately important? Can a relatively small error (say 0.005) have a noticeable impact on speech recognition?
My most important concern in creating a Language Model is dealing with words not contained in the vocabulary file. Should I delete any sentences containing unknown words? Is an open vocabulary language model (containing unknown words) acceptable to Julius, or does a closed vocabulary language model (no unknown words) perform better?
For the record, I am following the walkthrough in chapter 15 of the HTK Book.
Step 2: expand the VoxForge Speaker Independent Acoustic Model Dictionary, using NLTK to retrieve sentences containing the most common words not included in the VoxForge dictionary file.
Several problems, however:
Step 3: adapt the VoxForge Speaker Independent Acoustic Model to my voice
--- (Edited on 6/1/2012 12:43 am [GMT-0500] by Visitor) ---
> Any recommendations for useful pre-compiled corpora?
> Lastly, is accuracy ultimately important?
> Can a relatively small error (say 0.005) have a noticeable impact on speech recognition?
> Is an open vocabulary language model (containing unknown words) acceptable to julius
From the decoder's point of view it doesn't matter; it only affects the size of the language model. The decoder only considers words from the dictionary, even if the language model contains other words.
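To make the two options concrete, here is a minimal Python sketch of preparing training sentences either way. It is only an illustration: the function names and the `<UNK>` token are placeholders I made up, not something Julius or HTK defines, so check your LM toolkit for the unknown-word symbol it actually expects.

```python
def tokens(sentence):
    # Upper-case and split on whitespace; real text would need better tokenizing.
    return sentence.upper().split()

def closed_vocab(sentences, vocab):
    # Closed vocabulary: keep only sentences whose words are all in the vocabulary.
    return [" ".join(tokens(s)) for s in sentences
            if all(w in vocab for w in tokens(s))]

def open_vocab(sentences, vocab, unk="<UNK>"):
    # Open vocabulary: keep every sentence, mapping unknown words to a placeholder.
    return [" ".join(w if w in vocab else unk for w in tokens(s))
            for s in sentences]

vocab = {"THE", "CAT", "SAT"}
sentences = ["the cat sat", "the dog sat"]
closed = closed_vocab(sentences, vocab)
opened = open_vocab(sentences, vocab)
```

With the sample data, the closed variant drops the sentence containing "dog", while the open variant keeps it with the unknown word replaced by the placeholder.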
> The walk-through linked from GRAMMAR_NOTES of the quickstart does not exist: http://www.voxforge.org/home/acousticmodels. How do I go about adding additional words to an existing acoustic model?
This link from the latest quickstart works perfectly
> What is the difference between "lexicon/voxforge/VoxForgeDict" and "HTK_AcousticModel/dict" and which one should I use?
Dict includes only the words from the training prompts. VoxForgeDict is larger and includes all the words. You need to select the one that fits your usage pattern. Usually VoxForgeDict is better simply because it's bigger.
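If you want to check which file covers your own text better, something along these lines might help. It is a rough sketch that assumes each dictionary line starts with the word followed by whitespace, as HTK-style dictionaries generally do; the function names and the sample entries are made up for illustration.

```python
import re
from collections import Counter

def load_dict_words(lines):
    # Take the first whitespace-separated field of each non-empty line as the word.
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0].upper())
    return words

def coverage(text, dict_words):
    # Return the fraction of tokens found in the dictionary and the missing
    # words with their counts, most frequent first.
    toks = re.findall(r"[A-Za-z']+", text.upper())
    if not toks:
        return 1.0, []
    missing = Counter(t for t in toks if t not in dict_words)
    covered = 1 - sum(missing.values()) / len(toks)
    return covered, missing.most_common()

# Hypothetical sample entries standing in for the two dictionary files.
small = load_dict_words(["THE [THE] dh ax", "CAT [CAT] k ae t"])
large = load_dict_words(["THE [THE] dh ax", "CAT [CAT] k ae t", "DOG [DOG] d ao g"])

text = "the dog saw the cat"
small_cov, small_missing = coverage(text, small)
large_cov, large_missing = coverage(text, large)
```

In practice you would pass the lines of each real file, for example `load_dict_words(open(path))`, and the list of missing words tells you which entries you would have to add by hand.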
> Is recording entire sentences (non-voxforge prompts) allowed?
What do you mean by "allowed" here? There is no law about it; it's just software.
> What phonetic alphabet is used when adding new words?
The one that is already used in the dictionary. It doesn't have a specific name.
> Which chapter of the HTK Book explains the HHEd edit script commands?
> Any pointers for using HTS (a patched version of HTK) 3.4.1?
--- (Edited on 6/3/2012 16:15 [GMT+0400] by nsh) ---