Just a thought:
I was reading the information about speaker dependent and speaker independent models on:
http://www.voxforge.org/home/dev
and it occurred to me that people who want to train the model to better recognise their own voices are prime candidates to donate speech. If an interface already collects the samples needed to train the model to an individual's voice, the hard part is done, and a large number of users would likely submit those samples if asked.
I realise that this isn't immediately useful, but the idea is that speech-recognition/desktop-control applications will eventually be derived from this project. A person installing a speech-recognition program is likely to expect to spend a decent amount of time (10 minutes? 30?) training it to their voice. It would be worth keeping in mind that we want to collect the raw audio in a useful format and ask the user to submit it to VoxForge.
--- (Edited on 4/10/2008 4:33 am [GMT-0500] by Luna-Tick) ---
Well, that is all correct. Moreover, with 30 minutes of a user's speech, it's better to adapt a generic model to their voice and get a very good speaker-dependent model. Such a service is indeed very promising, and we could do it right now. The only problem is the processing resources needed for model adaptation.
--- (Edited on 4/10/2008 11:54 pm [GMT-0500] by nsh) ---
Hi Luna-Tick,
Excellent point - this is kind of the "Holy Grail" (Monty Python's version) of what we are trying to do.
We need to give users something that they need (i.e. speech recognition), allow them to tailor it to their environment (adapt general acoustic models to make them work better with their voice), and if they want to give back to the community, allow them to easily upload their adaptation speech recordings to VoxForge, or any other similar community.
Ken
--- (Edited on 4/15/2008 1:06 pm [GMT-0400] by kmaclean) ---