How can I help most efficiently? (Beginner)
User: Manu
Date: 3/18/2014 7:46 am
Views: 6624
Rating: 0

Hi fellow forum members,

I just started to experiment with speech recognition and I think this project is a very good idea.


My personal goal is to use my Linux PC with Blather. I was very impressed by the English demo here: https://www.youtube.com/watch?v=gr1FZ2F7KYA and by the German voice model. But I'm a bit confused by all the threads here in the forum, so I will ask my beginner questions here.


1. I understand that Blather uses Sphinx, and that Sphinx can use the German language model from here. Is there a tutorial somewhere on how to set up Sphinx with the German language model in Blather? I found some things on the forum, but they do not cover the step of integrating the new language into Blather, and it seems quite complicated too.

Is there a step-by-step howto somewhere on how to achieve this?

2. Is the German model even complete enough for that intended use, or is it so basic that even voice commands would not work properly?

3. I also want to help improve the German model, speech files, etc.

How can I do this in the most efficient way?

As far as I understand it, the process is as follows:

- Speech gets recorded (human)

- The speech is listened to and tagged according to some criteria (human)

- The files are analysed and a speech model is computed (PC); a dict file and a pronunciation file are created from it.

- These files can be used in Sphinx

Edit: I submitted some recorded text with the Java app and will continue to do so. When/how does this become available for the German language model?


Where could I help most efficiently here?


Thanks for your time :-D

Manu

Re: How can I help most efficiently? (Beginner)
User: nsh
Date: 3/19/2014 7:22 am
Views: 81
Rating: 0

> I found some things on the forum, but they do not cover the step of integrating the new language into Blather, and it seems quite complicated too.

Integration is pretty simple: as the Blather README describes, you just need to put the required files in ~/.config/blather. You can download the recently trained German model and dictionary here:

http://www.voxforge.org/home/forums/other-languages/german/updated-12k-word-16khz-sphinx-model-available

You can create the language model with offline tools or with the web service. All you need to do is put the files into the required folder.
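
As a rough illustration of the "put the files into the required folder" step, here is a minimal Python sketch. It assumes Blather looks for the language model and dictionary in a `language` subfolder of ~/.config/blather under the names `lm` and `dic`; those names and the helper `install_model_files` are assumptions made for illustration, so check the Blather README for the layout your version expects.

```python
import shutil
from pathlib import Path


def install_model_files(lm_file, dic_file, config_dir):
    """Copy a language model and a dictionary into Blather's config folder.

    The "language" subfolder and the target names "lm" and "dic" are
    assumptions; adjust them to whatever your Blather README specifies.
    """
    language_dir = Path(config_dir) / "language"
    language_dir.mkdir(parents=True, exist_ok=True)

    lm_target = language_dir / "lm"
    dic_target = language_dir / "dic"

    # Plain file copies: nothing is converted or compiled here.
    shutil.copyfile(lm_file, lm_target)
    shutil.copyfile(dic_file, dic_target)
    return lm_target, dic_target
```

A call might look like `install_model_files("1234.lm", "1234.dic", Path.home() / ".config" / "blather")`, where the two input file names are placeholders for whatever files you downloaded or generated.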

 

> 2. Is the German model even complete enough for that intended use, or is it so basic that even voice commands would not work properly?

The German model is pretty complete; it should work for you.

> I also want to help improve the German model, speech files, etc. How can I do this in the most efficient way?

It depends on your skills and wishes. If you have software development skills in Linux, C and Java, there are a number of CMUSphinx programming tasks you can handle. If you don't have those, you can help by finding transcribed podcasts in German. We need a lot of them, probably 100 or 200 hours of transcribed podcasts by many speakers.
If you are not willing to do the above, you can record some audio too; that would be helpful.

> Edit: I submitted some recorded text with the Java app and will continue to do so. When/how does this become available for the German language model?

Since the process is manual now, it will take some time to integrate your speech. You shouldn't worry about it overall; the model should be quite good already.
Re: How can I help most efficiently? (Beginner)
User: Manu
Date: 3/19/2014 1:47 pm
Views: 71
Rating: 0

> Integration is pretty simple: as the Blather README describes, you just need to put the required files in ~/.config/blather. You can download the recently trained German model and dictionary here:

> http://www.voxforge.org/home/forums/other-languages/german/updated-12k-word-16khz-sphinx-model-available

> You can create the language model with offline tools or with the web service. All you need to do is put the files into the required folder.

Thank you for answering :-D


Hmm, the instructions here are quite hard for me to understand: https://github.com/gooofy/voxforge/blob/master/README.md

Like, do I need to do all the steps? etc.

For now I am trying to mimic what I do with the English language files: I downloaded the German Sphinx model and extracted the German dict. Then I converted the lm.DMP into lm.arpa format, as Blather seems to use that. I did not create or compile anything, as the two files seem to be the finished language model. Am I correct here?

However, I have not yet tested this approach or these files.

If I am not correct and I do indeed have to build the language model, I think I have to do these steps (just the headlines):

Dictionary

Compute Sphinx Model

Running pocketsphinx (for test)

If there is something missing, please be so kind and fill me in.

> It depends on your skills and wishes. If you have software development skills in Linux, C and Java, there are a number of CMUSphinx programming tasks you can handle. If you don't have those, you can help by finding transcribed podcasts in German. We need a lot of them, probably 100 or 200 hours of transcribed podcasts by many speakers.

> If you are not willing to do the above, you can record some audio too; that would be helpful.

Well, I have some skills in Linux but none in C or Java :-(

I will try my luck with the podcasts. Do I need to look for special licences or something? Or can I just send in what people send me? Do they need to make the podcast public domain so you can use it?

I have another idea: we could write up a nice text about all the benefits of speech recognition, e.g. for the disabled community, and start a petition on change.org so this project gets more widespread attention. Even if only every tenth person who sees the petition adds one voice bundle, it would greatly extend the variety of speech in the model.


I also think I will upload some spoken text from a book. Do you have any recommendations, like a book that has many words that are not yet properly detected or something?


In addition, I was thinking that I use KDE and there are a lot of special words, like kwrite, kiten, ktorrent, etc. Has anybody spoken these words yet, or even PC words like Root, /etc, etc.? Or would it be helpful to upload a bunch of these?


Thanks

Re: How can I help most efficiently? (Beginner)
User: nsh
Date: 3/19/2014 2:09 pm
Views: 49
Rating: 0

> Hmm, the instructions here are quite hard for me to understand: https://github.com/gooofy/voxforge/blob/master/README.md Like, do I need to do all the steps? etc.

No, you do not need those steps.

> For now I am trying to mimic what I do with the English language files: I downloaded the German Sphinx model and extracted the German dict. Then I converted the lm.DMP into lm.arpa format, as Blather seems to use that. I did not create or compile anything, as the two files seem to be the finished language model. Am I correct here?

You need to build the language model with a service, as the Blather howto says, and not convert it from the DMP file. Otherwise you are correct.
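
On the dictionary side, one possible way to get German pronunciations for just the words in your command sentences is to pick them out of the extracted German dictionary. The Python sketch below assumes the usual Sphinx dictionary layout (a word followed by its phones on each line, with alternate pronunciations written as `word(2)`, `word(3)`, ...). The function names and the phone symbols in the usage note are made up for illustration and are not taken from the actual German dictionary.

```python
import re


def parse_dictionary(lines):
    """Map each word to a list of pronunciations (each a list of phones)."""
    entries = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # Strip the "(2)", "(3)", ... suffix that marks alternate pronunciations.
        word = re.sub(r"\(\d+\)$", "", parts[0])
        entries.setdefault(word, []).append(parts[1:])
    return entries


def select_entries(corpus_sentences, dictionary):
    """Return the dictionary entries used by the corpus, plus missing words."""
    words = {word for sentence in corpus_sentences for word in sentence.split()}
    found = {word: dictionary[word] for word in sorted(words) if word in dictionary}
    missing = sorted(word for word in words if word not in dictionary)
    return found, missing
```

For example, with the placeholder lines `"öffne OE F N E"`, `"fenster F E N S T ER"` and `"fenster(2) F E N S T A"` as the dictionary and the sentences `"öffne fenster"` and `"öffne kwrite"`, `select_entries` returns two pronunciations for `fenster` and reports `kwrite` as missing, so you would know which words still need a pronunciation.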

> I will try my luck with the podcasts. Do i need to look for special licences or something ?

No

> Or can i just send in what ppl would send me ?

People will not send you anything

> Do they need to make the podcast public domain so you could use them ?

Doesn't matter, but why not.

 

> I have another idea: we could write up a nice text about all the benefits of speech recognition, e.g. for the disabled community, and start a petition on change.org so this project gets more widespread attention. Even if only every tenth person who sees the petition adds one voice bundle, it would greatly extend the variety of speech in the model.

This is a great idea

> I also think I will upload some spoken text from a book. Do you have any recommendations, like a book that has many words that are not yet properly detected or something?

It doesn't matter if words are properly detected or not.

> In addition, I was thinking that I use KDE and there are a lot of special words, like kwrite, kiten, ktorrent, etc. Has anybody spoken these words yet, or even PC words like Root, /etc, etc.? Or would it be helpful to upload a bunch of these?

KDE has a native speech recognition tool, Simon (http://simon.kde.org/). You should probably try it instead; it has everything included and it's much more advanced than Blather.

Re: How can I help most efficiently? (Beginner)
User: Manu
Date: 3/19/2014 2:43 pm
Views: 78
Rating: 0

Hi
Thanks for answering so quickly :-D

I thought Blather would use Sphinx for that. When I use Blather in English, I just upload the signal words to the lmtool and I get a dict and an lm file back. The lm file looks like the one I created from the DMP file.

However, how do I create a language model then?

I will try out Simon :-D But I also like the "hand-made" approach with Blather :-D

As for the petition, I don't really know all the areas in which speech recognition could be helpful, so I'm not sure if I can help much.

A question out of curiosity: why does it not matter if the words are properly detected? I thought it would split them up into phonemes and then match them against a dictionary.

Thanks a lot :-D

 

Re: How can I help most efficiently? (Beginner)
User: nsh
Date: 3/22/2014 3:32 pm
Views: 2622
Rating: 0

> I thought Blather would use Sphinx for that. When I use Blather in English, I just upload the signal words to the lmtool and I get a dict and an lm file back. The lm file looks like the one I created from the DMP file.

You can read the tutorial on that:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
The language model is created from sample sentences with the web service or a command-line tool.
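
To give a feel for what "created from sample sentences" means, here is a toy Python sketch that counts word pairs in a few example commands and turns the counts into unsmoothed bigram probabilities. It is only an illustration: the real tools covered in the tutorial also apply smoothing and write the model out in their own file format, and the sample sentences and function names here are made up.

```python
from collections import Counter


def count_ngrams(sentences, n):
    """Count n-grams over sentences, with <s> and </s> as sentence markers."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


def bigram_probability(sentences, first, second):
    """Maximum-likelihood estimate of P(second | first), without smoothing."""
    unigrams = count_ngrams(sentences, 1)
    bigrams = count_ngrams(sentences, 2)
    if unigrams[(first,)] == 0:
        return 0.0  # the history word never occurs in the sample sentences
    return bigrams[(first, second)] / unigrams[(first,)]


# Hypothetical voice commands, standing in for a sentences file.
sample = ["öffne firefox", "öffne terminal", "schließe fenster"]
print(bigram_probability(sample, "öffne", "firefox"))  # 0.5
```

Here "öffne" occurs twice and is followed by "firefox" once, hence 0.5. A model built this way only knows the word sequences in your sample sentences, which is why a small command set needs its own language model rather than one converted from a large general-purpose file.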

> A question out of curiosity: why does it not matter if the words are properly detected? I thought it would split them up into phonemes and then match them against a dictionary.

No, it's not like that. You can read about the process in the tutorial too:
http://cmusphinx.sourceforge.net/wiki/tutorial

 
