The Spice Project provides anyone with the ability to create an Acoustic Model, Language Model and Dictionary in the language of their choice, for use with the Janus Speech Recognition Engine. The site is still under construction, but shows a novel approach to addressing the problem of creating speech recognition systems for the under-serviced languages of the world. Many thanks to Udhay Nallasamy for bringing this site to my attention.
From the Spice Project website:
Speech technology potentially allows everyone to participate in today's information revolution and can bridge the language barrier gap. Unfortunately, construction of speech processing systems requires significant resources. With some 4500-6000 languages in the world, traditionally speech processing is prohibitive to all but the most economically viable languages. In spite of recent improvements in speech processing, supporting new languages is a skilled job requiring significant effort from trained individuals. This project overcomes both limitations by providing innovative methods and tools for [native] users to develop speech processing models, collect appropriate data to build these models, and evaluate the results allowing iterative improvements.
The steps you take are as follows:
1. IPA Phone Selection for your Language:
The website provides a tool that displays the IPA phonemes. You pick the phonemes your language needs and choose a name for each one, for use in your speech engine.
2. Janus Acoustic Model Creation for your Language:
Once you have specified all your phones, you click the Acoustic Model tab to create an Acoustic Model for the Janus Speech Recognition Toolkit (Janus is a speech-to-speech translation research project). The system uses audio from the GlobalPhone and FestVox projects to generate the Acoustic Models. The FestVox audio is freely available (the VoxForge corpus actually includes some audio from the FestVox project), but the audio from the GlobalPhone corpus might not be: I could not find a way to download it from their main site.
3. Create Language Model for your Language:
Basically you provide a link to a web page, and the SPICE web spider crawls the site to gather words and phrases and calculate the probabilities required for a Language Model. You can also supply a text file to the site to generate a language model.
4. Dictionary for your Language:
Once you have your Acoustic Model and Language Model completed, you then generate a Dictionary.
Once you have all these files, you can then use them with the Janus Speech Recognition Toolkit. I'm not familiar with the Janus Speech Recognition Engine, but it seems like the AM, LM and dict can then be used in applications for the translation of spontaneous speech.
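To make step 3 a bit more concrete, here is a minimal sketch of how word probabilities for a Language Model can be estimated from a plain text file. I don't know how SPICE actually does this, so treat everything here as an assumption: the function names (tokenize, bigram_model), the sentence markers <s> and </s>, and the choice of a simple bigram model with relative-frequency estimates and no smoothing are all mine, for illustration only.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return one list of word tokens per sentence."""
    sentences = re.split(r"[.!?]+", text.lower())
    return [re.findall(r"\w+", s) for s in sentences if s.strip()]


def bigram_model(text):
    """Estimate P(next_word | word) by relative frequency (no smoothing)."""
    context_counts = Counter()          # how often each word appears as a context
    pair_counts = defaultdict(Counter)  # how often each word follows a context
    for words in tokenize(text):
        # Hypothetical start/end markers so sentence boundaries are modelled too.
        padded = ["<s>"] + words + ["</s>"]
        for prev, nxt in zip(padded, padded[1:]):
            context_counts[prev] += 1
            pair_counts[prev][nxt] += 1
    return {
        prev: {nxt: n / context_counts[prev] for nxt, n in followers.items()}
        for prev, followers in pair_counts.items()
    }


model = bigram_model("The cat sat. The cat ran. The dog sat.")
print(model["the"])  # probabilities of the words seen after "the"
```

A real Language Model would also need smoothing, so that word pairs never seen in the crawled text do not end up with a probability of zero, but the counting step above is the core of the idea.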
This site provides a bit of history of the Janus toolkit, but I have not found a site to download the software yet. I'll keep looking, but if anyone finds it please let me know.
--- (Edited on 2/13/2007 10:42 am [GMT-0500] by kmaclean) ---
Got an email from Tanja Schultz (InterACT: Advanced Communication Technologies, Language Technologies Institute, Carnegie Mellon University).
She says that Janus is NOT an open source toolkit.
--- (Edited on 3/ 1/2007 10:10 pm [GMT-0500] by kmaclean) ---