Simple command & control app - general question
User: ssp
Date: 4/25/2017 9:30 am


I'm trying to build a command and control app with Sphinx4 that should recognize numbers and yes/no (for many speakers, not only for me). Before I started the project, I was convinced that this task would not be that hard. At present, I achieve an accuracy of ~70% (for my voice only), which I think is very bad for a vocabulary of 12 words.
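To make a figure like ~70% comparable between runs, here is a minimal sketch of how accuracy could be measured for isolated commands, assuming exactly one expected word per utterance. The class name `AccuracyCheck`, the method `accuracy`, and the sample data are made up for illustration; nothing here is part of the Sphinx4 API, and the recognizer output would have to be collected separately.

```java
import java.util.List;

class AccuracyCheck {
    /** Fraction of utterances whose recognized word equals the expected word. */
    static double accuracy(List<String> expected, List<String> recognized) {
        if (expected.isEmpty() || expected.size() != recognized.size()) {
            throw new IllegalArgumentException("lists must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < expected.size(); i++) {
            // Compare case-insensitively and ignore surrounding whitespace.
            if (expected.get(i).equalsIgnoreCase(recognized.get(i).trim())) {
                correct++;
            }
        }
        return (double) correct / expected.size();
    }

    public static void main(String[] args) {
        // Hypothetical test data: what was spoken vs. what the recognizer returned.
        List<String> spoken = List.of("yes", "no", "one", "two", "three");
        List<String> heard = List.of("yes", "no", "nine", "two", "three");
        System.out.printf("accuracy = %.0f %%%n", 100 * accuracy(spoken, heard));
    }
}
```

Running the same fixed set of recordings through this after every configuration change would show whether a change actually helps, rather than relying on impressions from live tests.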

Could someone explain to me what I can actually expect? OK, the wiki mentions that one cannot expect great accuracy, but does that apply to my small use case, too?

Where exactly is the bottleneck?

Maybe someone with more experience could give me some hints?

I already posted a more specific question on StackOverflow [1], and the suggestions were helpful, but overall not what I was hoping for.

What is the point of 200 hours of speech when the model fails for such simple cases? Again, maybe I have the wrong expectations, because I haven't worked in the field of speech recognition before.

I would be very grateful for some enlightening words :)