up-to-date review of ready-to-use linux speech recognition front-ends

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: JohanLingen
Date: 3/2/2008 8:47 am

Views: 1624
Rating: 27

Hi Ken,

also sorry for the delay :-)

I finally got around to submitting my speech (I caught a cold a few times in a row), and so did my girlfriend.

Here are a few remarks on the site:

1. it is not very clear from the Home page how to get to the Dutch submission page.

2. could you make it clearer how many speech models one should submit? I guess the more the better, but there are two things that restrict one in this:

a) one is not invited to submit another 10 phrases

b) the phrases are picked in random order, and the submission page offers no overview from which you can select the phrases you haven't submitted yet.

Also in this respect: is it useful if one submits the same phrases multiple times?

3. It might encourage potential donors if there were some insight into the importance of their donation. Is there some educated guess about how many donations are needed to create a fairly accurate model? Maybe this could be displayed in a graph or something similar. Maybe this should even be on the front page for the different languages.

I hope this is of help. I greatly appreciate this project and I hope there will be usable applications soon! (like simon, I think I will really like that application when it's 'finished').

If the importance were clearer, I might try posting on some forums to recruit speech donors!

God Bless,

Johan

PS: When I follow the instructions from the submission page and try to figure out how to adjust my microphone volume level, I still only find instructions for GNOME, not for KMix or Windows. These might be good to add!

--- (Edited on 3/2/2008 8:47 am [GMT-0600] by JohanLingen) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: JohanLingen
Date: 3/2/2008 11:28 am

Views: 1100
Rating: 16

And something I forgot to say: great job on the Dutch version of the submission application!

Johan

--- (Edited on 3/2/2008 11:28 am [GMT-0600] by JohanLingen) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: kmaclean
Date: 3/2/2008 2:43 pm

Views: 204
Rating: 23

Hi Johan,

Thanks!

Although Robin is the one you should thank - he did all the Dutch translations. I just plugged them into variables in the speech submission app.

Ken

--- (Edited on 3/2/2008 3:44 pm [GMT-0500] by kmaclean) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: kmaclean
Date: 3/2/2008 2:37 pm

Views: 228
Rating: 22

Hi Johan,

Thanks for your comments (and submissions :) ). My replies follow:

>1. it is not very clear from the Home page how to get to the Dutch submission page.

Good point, created a new RFE (ticket 345) to track this request.

>2. could you make it clearer how many speech models one should submit?

I think at least 10-15 minutes of speech from each submitter would go a long way to helping create good acoustic models.

>I guess the more the better, but there are two things that restrict one in this:

> a) one is not invited to submit another 10 phrases

Actually, users are invited to do so ... the read page says the following:

Repeat the process (multiple submissions are encouraged!)

> b) the phrases are picked in random order, from the submission page

>there is no overview in which you can select those phrases you haven't

>submitted yet.

It was much easier to code the way it is currently set up. Java is not my primary coding language. It is more of an issue where there are fewer prompts.
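For what it's worth, one way to avoid handing a donor prompts they have already read is to remember what they have recorded and sample from the rest. A minimal sketch in Python (this is not the Java code of the actual Speech Submission app; `pick_prompts`, `already_read`, and the prompt names are made up for illustration):

```python
import random

def pick_prompts(all_prompts, already_read, count=10, rng=random):
    """Pick `count` prompts, preferring ones the donor has not read yet."""
    unread = [p for p in all_prompts if p not in already_read]
    if len(unread) >= count:
        # Enough fresh prompts: draw a random set without repeats.
        return rng.sample(unread, count)
    # Not enough unread prompts left: use them all and top up with repeats.
    seen = [p for p in all_prompts if p in already_read]
    top_up = rng.sample(seen, min(count - len(unread), len(seen)))
    return unread + top_up

# Hypothetical data: 200 prompts, of which the donor has already read 195.
prompts = [f"prompt {i}" for i in range(200)]
read = set(prompts[:195])
chosen = pick_prompts(prompts, read)
print(len(chosen))  # 10
```

With close to 200 prompts and 10 per submission, a donor would run out of unread prompts after roughly 20 submissions, at which point the sketch falls back to repeats.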

>Also in this respect: is it useful if one submits the same phrases multiple times?

Yes, since a person never really says the same sentence exactly the same way twice. There are over 1200 English prompts and close to 200 Dutch prompts. However, adding more prompts would make the Speech Submission app a bit unwieldy to download, so I will be looking at creating separate builds of the speech submission app for each language (see ticket #335).

>3. It might encourage potential donors if there were some insight into the

>importance of their donation. Is there some educated guess about how

>many donations are needed to create a fairly accurate model?

The Release 1.0 target for the English Speech Recognition corpus is 140 hours (which is the number of hours used by the Sphinx WSJ corpus).

>Maybe this could be displayed in some graph or so. Maybe this should even

>be on the front page for different languages.

We have a metrics page ... for now. I will add a ticket for some sort of front-page display of goals and metrics (see ticket 346).

>I hope this is of help.

Yes, very much, but remember this is a volunteer effort, so things might not get implemented as soon as we would like... :)

Hope that helps,

Ken

P.S. I added another ticket for KMix instructions (ticket 344).

--- (Edited on 3/2/2008 3:37 pm [GMT-0500] by kmaclean) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: JohanLingen
Date: 3/2/2008 3:37 pm

Views: 1025
Rating: 19

Hi Ken,

thank you for your reply and your voluntary efforts. I realise I should have been a bit more specific in my post.

The text you cite does encourage donors to submit more speech, but this only follows from the text (and note that the Dutch translation does not contain this invitation!).

After I read my first 10 phrases, I see a page displaying "Thank you!" and some information about unprocessed speech samples.

My suggestion is that this site should invite donors to submit again, BECAUSE (I composed a sample text for this site from the comments you gave; these are arguments for donors to submit again):

Sample text for webpage displayed after submitting speech:

"Please consider submitting another speech sample. VoxForge would ideally need at least 15 minutes of speech of each donor and a total of 140 hours for a first release of an acoustic model. To have good coverage of the language we made hundreds of different speech assignments. Don't worry if you already submitted an assignment, that also is of great use to us because you are never able to speak it out exactly the same way again!

Click here to submit again!

Below are the statistics for your submission and the total submission. Would you be so kind to donate at least another xx minutes?

And please... don't forget to tell your family and friends to become a speech donor!" (maybe even add an e-mail form to invite others)

(and of course it would be great if this page is also translated to all languages)

For 140 hours you only need 560 people to submit 15 minutes of speech. I think that with these suggestions that goal could be realised within months (apart from processing all these speech samples :) ).
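The arithmetic behind that 560 figure can be checked with a quick sketch (the function name `donors_needed` is just illustrative, and it assumes every donor contributes the same amount of speech):

```python
import math

def donors_needed(target_hours: float, minutes_per_donor: float) -> int:
    """Number of donors required if each contributes the same amount of speech."""
    total_minutes = target_hours * 60
    return math.ceil(total_minutes / minutes_per_donor)

print(donors_needed(140, 15))  # 560
print(donors_needed(140, 10))  # 840
```

So at the lower end of the 10-15 minute range, the same 140 hours would take 840 donors.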

PS: thank you, too, Robin!! I know already that many people are going to be very happy with the efforts you make right now!

--- (Edited on 3/2/2008 3:37 pm [GMT-0600] by JohanLingen) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: kmaclean
Date: 3/3/2008 8:13 am

Views: 221
Rating: 32

Hi Johan,

>The text you cite does encourage donors to submit more speech, but this only

>follows from the text (and note that the dutch translation does not contain this

>invitation!).

To be fair to Robin, I likely added this only after he had completed his translations, and never notified him of the update :)

>My suggestion is that this site should invite donors to submit again,

>BECAUSE (I composed a sample text for this site from the comments you

>gave; these are arguments for donors to submit again):

Aahh! Excellent! Most of this can be done fairly quickly/easily ... though I can't implement all of this immediately.

Thanks again for your valuable feedback!

Ken

--- (Edited on 3/3/2008 9:13 am [GMT-0500] by kmaclean) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: JohanLingen
Date: 3/3/2008 9:14 am

Views: 241
Rating: 27

> To be fair to Robin, I likely added this only after he had completed his translations, and never notified him of the update :)

Haha, caught you

>Aahh! Excellent! Most of this can be done fairly quickly/easily ... though I can't implement all of this immediately.

I am a total n00b on acoustic models or whatever these things are called, but please post again once the site workflow (or whatever you want to call it) is adjusted to reflect your wish that users submit more than once. Then I'll try to mobilize the Netherlands to submit their voices (I hope Robin will be ready for that; is he?).

> Thanks again for your valuable feedback!

Of course, for the greater good! I really can't wait for good and free speech recognition tools (especially for Linux). Do you have any idea how long it would take to implement such a thing once you have these 140 hours of speech? (or maybe you need 500 hours)

--- (Edited on 3/3/2008 9:14 am [GMT-0600] by JohanLingen) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: kmaclean
Date: 3/3/2008 8:59 am

Views: 232
Rating: 21

Hi Johan,

Here is what the endpage (the page that appears after a user submits speech) now looks like,

thanks for the help,

Ken

--- (Edited on 3/3/2008 9:59 am [GMT-0500] by kmaclean) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: JohanLingen
Date: 3/3/2008 10:11 am

Views: 1325
Rating: 22

It's getting really off-topic, but we're making great progress here :-)

I thought you might be interested in a Dutch translation (would you be so kind as to replace the links so they point to the Dutch pages?).

[ENGLISH]

Thank you for your submission!

See below for a current list of speech submissions waiting to be incorporated into the VoxForge Corpus.

Your speech will be processed and incorporated into the VoxForge corpus in tonight's nightly build process. To keep track of how much you have submitted, and how close we are to our goal, see the speech submission metrics page.

Please consider donating another speech sample.

VoxForge would ideally like to have at least 10-15 minutes of speech from each donor so that we can meet our target of 140 hours for the first release of the VoxForge speech corpus and acoustic models. To ensure that we have good coverage of the language we have created hundreds of different speech prompts. Don't worry if you already submitted a particular set of prompts, that also is of great use to us since a person never says the same sentence exactly the same way twice!

Click here to submit again!

And please... don't forget to tell your family and friends about VoxForge and ask them to become a speech donor!

Thanks again from the VoxForge team

Free Speech... Recognition http://www.voxforge.org

[/ENGLISH]

[DUTCH]

"Bedankt voor je donatie!

Zie hieronder voor de huidige lijst met spraakdonaties die wachten op goedkeuring om te worden toegevoegd aan de VoxForge Corpus.

Je spraak wordt verwerkt en 's nachts toegevoegd aan de VoxForge corpus. Om te zien hoeveel spraak je hebt gedoneerd en hoe dicht we bij ons doel zijn, zie deze spraakdonatiemeter."

Zou je willen overwegen om nog een spraakmodel te doneren?

VoxForge zou in het meest ideale geval beschikken over 10-15 minuten spraak van iedere donor, zodat we ons doel van 140 uur spraak voor de eerste VoxForge spraakcorpus en akoestische modellen kunnen halen. Om ervoor te zorgen dat we een goede dekking hebben van de taal, hebben we honderden verschillende spraakopdrachten samengesteld. Het geeft niet als je een bepaalde verzameling spraakopdrachten al eerder hebt gedoneerd. Ook dat is voor ons van grote waarde, want niemand kan dezelfde zin twee keer exact op dezelfde manier inspreken!

Klik hier om nogmaals te doneren!

En... vergeet natuurlijk niet je familie en vrienden te vertellen over VoxForge en vraag hen om óók spraakdonor te worden!

Nogmaals bedankt namens het VoxForge team

Free Speech... Recognition http://www.voxforge.org

[/DUTCH] (the wordplay in the slogan is untranslatable, so I left it in English)

[ENGLISH]

Please Note:

1. It may take a few minutes for your submission to appear in the listing below. If your submission still does not appear, try clearing your browser cache.

2. Each submission is reviewed and then added to the VoxForge corpus. A link to your submission will be put in the appropriate "Speech Files" forum after it has been processed.

3. For security reasons, the files in the listing below are not downloadable.

[/ENGLISH]

[DUTCH]

N.B.:

1. Het kan een paar minuten duren voordat uw donatie in de lijst hieronder verschijnt. Als uw donatie daarna nog niet verschijnt, probeer dan de cache van uw browser te legen.

2. Iedere donatie wordt geëvalueerd en daarna toegevoegd aan de VoxForge corpus. Nadat de donatie is verwerkt, wordt er een link naar uw donatie geplaatst in het betreffende "Speech files" forum (kies "Listen" in het menu).

3. Om veiligheidsredenen zijn de bestanden hieronder niet te downloaden.

[/DUTCH]

PS: the metric stats look a bit weird as they are not completely alphabetic?

--- (Edited on 3/3/2008 10:11 am [GMT-0600] by JohanLingen) ---

Re: up-to-date review of ready-to-use linux speech recognition front-ends

User: kmaclean
Date: 3/4/2008 1:18 pm

Views: 289
Rating: 23

Hi Johan,

Thanks for the translations! I have to take a look at the Speech Submission app to make sure I can point to a different endpage - it might be easy to implement, or it might have to wait for the next release of the Speech Submission app (when I implement Italian and fix the Russian translations).

> the metric stats look a bit weird as they are not completely alphabetic?

I use the default Perl sort and it treats caps and lowercase as different characters with different sort values (created ticket 352).
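The effect is easy to reproduce. The sketch below uses Python rather than Perl, but its default string sort likewise compares character codes, so every uppercase letter sorts ahead of every lowercase one; the list of names is made up for illustration and is not the real metrics data:

```python
# Hypothetical entries with mixed capitalisation.
names = ["english", "Dutch", "german", "Russian", "italian"]

# Default sort: capitalised entries come first, then the lowercase ones.
print(sorted(names))
# ['Dutch', 'Russian', 'english', 'german', 'italian']

# Case-insensitive sort: compare a case-folded copy of each entry instead.
print(sorted(names, key=str.casefold))
# ['Dutch', 'english', 'german', 'italian', 'Russian']
```

In Perl, the usual equivalent is a custom comparison that lowercases both sides, e.g. `sort { lc($a) cmp lc($b) } @names`.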

Ken

--- (Edited on 3/4/2008 2:18 pm [GMT-0500] by kmaclean) ---
