It seems to me this project might be a great vehicle to help user groups who are usually ignored by commercial vendors. Here I am thinking of people who speak uncommon (or unprofitable) languages and dialects, and also of people with speech impairments such as dysarthric speech (http://www.cita.uiuc.edu/research/asr/abstract.php).
The long-term plan is definitely to branch out to other
languages. We need to get some successes under our belt before we
do this, so English is the focus for now.
Remember, hundreds of hours of speech are required to create Speaker Independent Acoustic Models. You can create reasonably good Speaker Dependent Acoustic
Models for desktop command and control type applications (not
dictation) with much less training by using a 'so-so' Speaker
Independent Acoustic Model and adapting it using 20-30 minutes
worth of speech audio (HTK says as little as 30-40 sentences, but more
is better) - but a model can only be adapted to the same language it
was trained with (there is some research where they are working on
adapting to other languages, but my recollection was that they were not
In the case of uncommon languages, the issue will likely be to
get enough people to submit the required speech data. But time is
on our side, and if we collect good quality speech, then sooner or
later, we should have enough audio to create decent Acoustic
Models. Then the problem becomes how to model the language -
which is a linguistics/technical issue that can be addressed when the
With respect to people with speech impairments, we would need
some research into understanding whether there are enough commonalities
between speakers that will permit the creation of Speaker Independent Acoustic Models. My sense here is that the impairments may be so specific to the individual that only Speaker Dependent
Models would be accurate enough. That is not to say we can't help
once we have stable English Acoustic Models - we would need a test case
to work out the details.
I've seen specific reference to work being done on dysarthric speech by Turner Rentz in this submission to OSSRI. You might want to contact him for details on their approach to see if we can implement it here.
Ken
--- (Edited on 10/12/2006 12:35 am [GMT-0400] by kmaclean) ---
Thanks for your reply. I think your idea of outreach to researchers in the area of recognizing impaired speech is a very good one. I won't be involving myself, but I appreciate your invitation.
--- (Edited on 10/12/2006 12:18 pm [GMT-0500] by Visitor) ---