Acoustic Model Discussions

Re: Packing of the audio files
User: kmaclean
Date: 2/14/2010 9:16 pm
Views: 92
Rating: 3

> I just get 403 Error.

Oops, that's permissions related... Which svn instance are you trying to access?

--- (Edited on 2/14/2010 10:16 pm [GMT-0500] by kmaclean) ---

Re: Packing of the audio files
User: kmaclean
Date: 2/14/2010 9:59 pm
Views: 100
Rating: 0

OK, from the httpd logs it looks like you want to put your Sphinx acoustic models in:

http://www.dev.voxforge.org/svn/SpeechCorpus/Trunk/AcousticModels/Sphinx/

Which makes sense since the Sphinx link on the Downloads page points to:

http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/AcousticModels/Sphinx/

However, the best place to put them would be in:

http://www.dev.voxforge.org/svn/Main/Trunk/AcousticModels/Sphinx/

And I will correct the link on the Downloads page.

You don't want to be committing to (or checking out from) the SpeechCorpus svn repository instance since it is so big...

Ken

--- (Edited on 2/14/2010 10:59 pm [GMT-0500] by kmaclean) ---

Re: Packing of the audio files
User: nsh
Date: 2/15/2010 7:20 pm
Views: 128
Rating: 0

> However, the best place to put them would be in:

> http://www.dev.voxforge.org/svn/Main/Trunk/AcousticModels/Sphinx/

> And I will correct the link on the Downloads page.


Ok, this is done

> You don't want to be committing to (or checking out from) the SpeechCorpus svn repository instance since it is so big.

Ok, but someone except me should correct a lot of prompts mistakes. Otherwise about 10 hours from 70 get unusable. There is no problem to checkout just etc, it just needs to be accessible.

As for size, there is definitely no need to store the MFCC files or the converted files in SVN. They are derived work and only create problems with synchronization.


Huge thanks Ken :)

 

--- (Edited on 2/16/2010 04:25 [GMT+0300] by nsh) ---

Re: Packing of the audio files
User: kmaclean
Date: 2/18/2010 10:20 pm
Views: 85
Rating: 0

>Ok, but someone except me should correct a lot of prompts mistakes.

>Otherwise about 10 hours from 70 get unusable. There is no problem to

>checkout just etc, it just needs to be accessible.

Are these the ones listed in the PROBLEMS file of the Sphinx acoustic model that you just created? 

I just listened to all the prompts in the first one (atterer-01202007-a) and it seems to be OK...

Already fixed in ticket #21.

Ken

 

--- (Edited on 2/18/2010 11:20 pm [GMT-0500] by kmaclean) ---

--- (Edited on 2/24/2010 10:35 pm [GMT-0500] by kmaclean) ---

Re: Packing of the audio files
User: kmaclean
Date: 2/19/2010 1:26 pm
Views: 190
Rating: 0

OK, I've been looking at forced alignment issues listed in the PROBLEMS file of your latest Sphinx acoustic model...

The approach I have been taking in processing speech for the VoxForge corpus (so far) is to include a submission (English) in the corpus, unless it has really bad audio quality.

For those with marginal quality (some concerns, but not enough to exclude them), I include them in the corpus, and then remove them from the master_prompts files.  I also manually note in the Readme for such a submission, under the "Quality" heading, the type of problem that might concern me (line noise, non-speech noise, line hiss, audio clipping,...).

For example, the "mjmm-20080526-hca" submission has a Quality description that says "extreme line noise", and for this reason I did not include it in the Master Prompts file. See this file for a list of all the submissions with forced alignment issues.

This differs from your approach, which uses the prompts files in each submission to create the new Sphinx acoustic model (and explains why I have not gotten around to fixing the prompts files in the submissions until now...).

Possible fix:

Each submission has a prompts-original file, which contains the prompts in their unaltered format, and a PROMPTS file, which has had some cleanup done and has all words capitalized.

A. Ignore all those submissions with anything in their 'Quality' field.

B. Add a "Clean_speech" field/tag to all ReadMe files for submissions that have clean speech.

C. If a submission has problem prompts, then remove them from the PROMPTS file. That way, if someone needs noisy speech, they can still get the prompts (in prompts-original), but any script that encounters a blank PROMPTS file would just skip it and not include those prompts in the creation of acoustic models.
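
To make options A and C a bit more concrete, here is a rough Python sketch of how a training script might filter submissions. It is only an illustration under assumptions that are not from this thread: that each submission keeps its files under an etc/ directory as README and PROMPTS, and that the quality note sits on a single line of the form "Quality: <note>". The function names are made up for the example.

```python
from pathlib import Path


def quality_note(readme_text):
    """Return the text after a 'Quality:' label, or '' if absent or empty.

    Assumption: the note is on one line, e.g. 'Quality: extreme line noise'.
    """
    for line in readme_text.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "quality":
            return value.strip()
    return ""


def usable_prompts(submission_dir):
    """Return the prompt lines to train on, or [] if the submission is skipped."""
    etc = Path(submission_dir) / "etc"  # assumed layout, see note above
    readme = etc / "README"
    prompts = etc / "PROMPTS"

    # Option A: skip any submission whose README has a non-empty Quality note.
    if readme.is_file():
        text = readme.read_text(encoding="utf-8", errors="replace")
        if quality_note(text):
            return []

    # Option C: a missing or blank PROMPTS file contributes no prompts.
    if not prompts.is_file():
        return []
    text = prompts.read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines() if line.strip()]
```

With something like this, a script could loop over all submission directories and simply concatenate whatever usable_prompts returns. Submissions with a Quality note or a blank PROMPTS file drop out without any special casing.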

Which of these (or any other approach), from your standpoint, should we use?

thanks,

Ken

--- (Edited on 2/19/2010 2:26 pm [GMT-0500] by kmaclean) ---

Re: Packing of the audio files
User: kmaclean
Date: 2/24/2010 9:30 pm
Views: 88
Rating: 1

>I just listened to all the prompts in the first one (atterer-01202007-a) and it

>seems to be OK...

Oops, brain fart... I already fixed them in Ticket #21

--- (Edited on 2/24/2010 10:30 pm [GMT-0500] by kmaclean) ---

Re: Packing of the audio files
User: kmaclean
Date: 2/24/2010 9:59 pm
Views: 1366
Rating: 0

>Ok, but someone except me should correct a lot of prompts mistakes.

>Otherwise about 10 hours from 70 get unusable. There is no problem to

>checkout just etc, it just needs to be accessible.

Like those listed in this thread (summarized in ticket #376)?

--- (Edited on 2/24/2010 10:59 pm [GMT-0500] by kmaclean) ---

--- (Edited on 2/24/2010 11:01 pm [GMT-0500] by kmaclean) ---

Re: Packing of the audio files
User: kmaclean
Date: 2/18/2010 10:21 pm
Views: 77
Rating: 1

>As for size, there is definitely no need to store the MFCC as well as

>converted files in SVN. They are derived work and only create problems

>with synchronization.

Agree.

Design error on my part.

Ken

--- (Edited on 2/18/2010 11:21 pm [GMT-0500] by kmaclean) ---

Re: Packing of the audio files
User: kmaclean
Date: 2/18/2010 10:00 pm
Views: 64
Rating: 0

>For script simplicity and consistency it would be nice to convert them to

>upper case. I wanted to do it myself but gave up to checkout few gigs

>from svn.

Fixed.

See ticket #22 for details.

--- (Edited on 2/18/2010 11:00 pm [GMT-0500] by kmaclean) ---
