VoxForge
Re: Librivox contributions and dates/numbers
Thanks for that, Dr Tony. Of course the topic of the thread is building an acoustic model, in this case using Librivox data. As I understand it, we are giving the model builder as many different opportunities to hear triphones as possible, and the triphone patterns are defined by the pronunciations in the lexicon.
But the lexicon is also the palette from which you build a grammar. So any decision about the lexicon has implications for a later process: recognition.
If my lexicon contains SAY s eh, ONE w uh n, WON w uh n, TO t uw, TWO t uw, and I try to use the prompts SAY ONE, SAY WON, SAY TO, SAY TWO, I find it hard to imagine that, even given an infinite amount of data, a recognizer would ever be able to do better than 50-50 on SAY ONE, since ONE and WON have identical phone strings and so look the same to the acoustic model. Perhaps I am wrong here?
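To make the point concrete, here is a rough Python sketch that groups the toy lexicon above by pronunciation and reports the words that share a phone string. The function name homophone_groups and the dictionary layout are my own for illustration, not part of any VoxForge or HTK tool, and the phone strings are copied as written in this post.

```python
from collections import defaultdict

# Toy lexicon from this post (word -> phone string); not a real dictionary file.
LEXICON = {
    "SAY": "s eh",
    "ONE": "w uh n",
    "WON": "w uh n",
    "TO": "t uw",
    "TWO": "t uw",
}


def homophone_groups(lexicon):
    """Hypothetical helper: map each shared pronunciation to the words using it."""
    by_pron = defaultdict(list)
    for word, pron in lexicon.items():
        # Split into phones so stray whitespace does not hide a match.
        by_pron[tuple(pron.split())].append(word)
    # Keep only pronunciations claimed by more than one word.
    return {pron: sorted(words) for pron, words in by_pron.items() if len(words) > 1}


groups = homophone_groups(LEXICON)
for pron, words in sorted(groups.items()):
    print(" ".join(pron), "->", ", ".join(words))
```

Any group it reports is a set of words that the acoustic model alone cannot tell apart, so the grammar or some later stage has to do the choosing.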
My example is a bit academic. You can design your grammar with the lexicon's weaknesses in mind, or ask the recognizer for its top two possibilities and deal with the outcome in the dialogue manager (DM).
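As a sketch of the second option, assuming the recognizer hands back an ordered list of hypothesis strings: the DM could take the first hypothesis whose words all make sense in the current dialogue state. The names pick_hypothesis, nbest and expected_words are made up for this example and do not come from any particular recognizer's API.

```python
def pick_hypothesis(nbest, expected_words):
    """Hypothetical DM step: choose the first hypothesis that fits the context.

    nbest is the recognizer's ordered list of hypotheses (best first);
    expected_words is the set of words that are valid in the current state.
    """
    if not nbest:
        raise ValueError("recognizer returned no hypotheses")
    for hyp in nbest:
        if all(word in expected_words for word in hyp.split()):
            return hyp
    # Nothing fits the context, so fall back to the recognizer's first choice.
    return nbest[0]


# Assumed example: the dialogue expects a digit, so WON is not a valid word here.
nbest = ["SAY WON", "SAY ONE"]
expected = {"SAY", "ONE", "TWO"}
choice = pick_hypothesis(nbest, expected)
print(choice)
```

The acoustic ambiguity is still there. This just moves the decision to the place that knows what the user is likely to have meant.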
--- (Edited on 5/20/2012 6:03 am [GMT-0500] by colbec) ---