I see that there is not much action in this sub-forum.
Is anyone working on Greek language support these days? What is the status, and what is missing in general (besides tons more speech, of course)?
E.g., are we missing any training scripts or auxiliary files like a phoneme list, pronunciation dictionary, etc.?
In short, how could I help the project go forward (preferably without spending a huge amount of time)?
Hello! Today, I created a Greek speech model with more than 1000 words. You can download this speech model, and use it with Simon 0.3.80. Unfortunately, you have to compile Simon 0.3.80 from source (git clone git://anongit.kde.org/simon simonsource). This speech model won't work with Simon 0.3.
I trained this speech model with my own voice (German accent). It is working well when I dictate with my specific German accent. A native speaker probably will get bad results from my speech model. But at least, you can expect some results.
Thanks! Thanks also for the very detailed description on your blog of the steps you followed :)
I'm really pressed for time right now, but this weekend I will try your model, follow the steps you describe (using my own voice this time), and get back to you :)
Btw, do you think the prompts should be somehow optimized for diphone/triphone coverage? I've seen an algorithm somewhere that proposes a "quick & dirty" way to do this, in festvox IIRC.
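To make the question concrete, here is roughly what I mean, as a minimal Python sketch. It is not the festvox algorithm (I don't remember its details), just a plain greedy selection: keep picking the candidate prompt that adds the most diphones not covered yet. It assumes every candidate prompt already has a phoneme transcription, e.g. from a pronunciation dictionary. The function names and the toy transcriptions are made up for illustration.

```python
def diphones(phonemes):
    """Return the set of adjacent phoneme pairs in one transcription."""
    return set(zip(phonemes, phonemes[1:]))


def select_prompts(candidates, max_prompts):
    """Greedily choose prompts that add the most uncovered diphones.

    candidates: dict mapping prompt text -> list of phoneme symbols
    (hypothetical input format). Returns (chosen prompts, covered diphones).
    """
    covered = set()
    chosen = []
    remaining = dict(candidates)
    while remaining and len(chosen) < max_prompts:
        # Candidate whose transcription contributes the most new diphones.
        best = max(remaining, key=lambda p: len(diphones(remaining[p]) - covered))
        gain = diphones(remaining[best]) - covered
        if not gain:
            break  # nothing left that improves coverage
        covered |= gain
        chosen.append(best)
        del remaining[best]
    return chosen, covered


# Toy example with rough, illustrative transcriptions (not from a real dictionary).
toy = {
    "καλημέρα": ["k", "a", "l", "i", "m", "e", "r", "a"],
    "καλά": ["k", "a", "l", "a"],
    "μέρα": ["m", "e", "r", "a"],
}
chosen, covered = select_prompts(toy, max_prompts=10)
print(chosen, len(covered))
```

The same loop would work for triphones by collecting triples instead of pairs, and the gain could be weighted by how frequent each diphone is in Greek text rather than counting every new one equally.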
I don't think the texts would be a problem: it is easy to just write a few new prompts myself (so there are no copyright issues).
I'm also puzzled about how to go about building a corpus to use for a language model. It would seem that for the Greek language a hybrid rules + statistics approach would work well, but I would need to read a few papers on how this could be implemented.
Also, since the prompts in the applet still don't seem to have the latest updates, would it be better to just copy the applet and build our own site for Greek (so that we can iterate faster on prompt selection and approvals) and submit the generated speech "wholesale" here? Or is it better to use this site and send patches / ask for commit access to the applet / scripts?
Finally, syllable-based speech recognition seems to work well for some languages / training sets and my intuition is that this approach could also work well for modern Greek. Are there any OSS systems using this approach?
(I realize this should probably be 3 threads, you will have to excuse my enthusiasm) :)
do you think the prompts should be somehow optimized for diphone/triphone coverage?
I am not sure. It may be a better idea to make sure the prompts cover frequent Greek words.
for Greek language a hybrid rules + statistics approach would work well
This sounds very theoretical.
copy the applet and build our own site for Greek
Personally, I wouldn't copy the applet (too much work). But building your own site for Greek might be useful.
syllable-based speech recognition seems to work well for some languages
I have never heard of that. I think that the triphone approach is a good one - should work for a lot of languages.
My opinion: You don't need that much theory to make it work. Without much theory, I was able to build a 36,000-word German speech model and a 16,000-word General American speech model. I suggest that you build your own Greek speech model and publish the result here at VoxForge (or on your own future Greek website).