Italian

Re: Italian Language
User: kmaclean
Date: 7/30/2007 1:20 pm
Views: 462
Rating: 37

>>>First of all you need a large collection of Italian texts (100 Mb for example). Do you have such a big collection?

>>What do you mean exactly for italian texts?

>In theory they should be free but copyrighted texts are also acceptable.

Two points to discuss here:

1. Distinction between dictation-based speech recognition and command and control or IVR speech recognition

For dictation applications you need to create a language model and an acoustic model. Language models require very large amounts of text (for example, the 100 MB that nsh was referring to).
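To make the "large amounts of text" point concrete, here is a minimal sketch of what a language model estimates from that text: the probability of a word given the word before it (a bigram model). This is an illustration only, not how HTK, Julius or Sphinx build their models; the function name and the three-sentence corpus are made up for the example. With so little text most word pairs are never seen at all, which is why real models need far more data.

```python
from collections import Counter

def bigram_probabilities(sentences):
    """Estimate P(word | previous word) by simple counting (maximum likelihood)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # <s> and </s> mark the start and end of each sentence.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }

# Hypothetical toy corpus; a real one would be many megabytes of text.
corpus = ["il gatto dorme", "il cane dorme", "il gatto mangia"]
probs = bigram_probabilities(corpus)
print(probs[("il", "gatto")])            # "gatto" follows "il" in 2 of 3 cases
print(probs.get(("cane", "mangia"), 0))  # unseen pair: no estimate from this data
```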

For command and control or IVR apps, you use a grammar file (rather than a language model) and an acoustic model. You can have as few as 5 to 10 words in your grammar, but your system will only recognize those words. You do *not* need a large amount of text to do this type of speech recognition. This is the type of system that the VoxForge Tutorial helps you create.
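As a rough illustration of the grammar approach, the sketch below models a tiny command grammar as a fixed sequence of slots, each with a handful of allowed words. It is not the Julius or HTK grammar file format; the words, the slot layout and the `accepts` function are all hypothetical. It only shows why such a system needs no large text collection and why it cannot recognize anything outside the grammar.

```python
# Hypothetical command grammar: each slot lists the words allowed at that position.
GRAMMAR = [
    {"accendi", "spegni"},  # action: "turn on" / "turn off"
    {"la"},                 # article
    {"luce", "radio"},      # device: "light" / "radio"
]

def accepts(utterance, grammar=GRAMMAR):
    """Return True only if every word fits the slot at its position."""
    words = utterance.lower().split()
    if len(words) != len(grammar):
        return False
    return all(word in slot for word, slot in zip(words, grammar))

print(accepts("accendi la luce"))  # inside the grammar
print(accepts("apri la porta"))    # words the grammar does not contain
```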

This is based on my knowledge of HTK/Julius.  Sphinx might do (and likely does ...)  things differently.

2.  Use of Copyrighted texts

Although we have not (yet ...) created a VoxForge language model, which would require 100 MB or more of text (used to estimate the probabilities of words occurring in different contexts), we do use public domain texts for our prompts, i.e. the prompts that tell users what to say when they want to submit their speech to VoxForge.

In this particular case, VoxForge tries to ensure that we only use non-copyrighted texts (or copyrighted texts with permissive licenses). We want to avoid a situation where we might be required to remove speech from our corpus because of Copyright issues.

Creating a recording creates a "derivative work" of the original Copyrighted work.  Therefore, if the text you are recording is still covered under Copyright, the Copyright holder retains rights to any "derived" works (in this case the recording you made of their work), and can prevent you from making copies or distributing such derived works.

In addition, the original Copyright holder's rights might apply to any acoustic models created from speech recordings made from the reading of their Copyrighted work, since these might be considered a derivative work.

Copyright might not apply to the text used in the creation of language models since all you are doing is creating a list of the probabilities of the words in different contexts.  However, if you want to include the source text with your language model (as the use of a GPL license would require), and distribute it, then you would be limited to using out-of-copyright texts.

Only a court can say for certain one way or another, so the approach we are taking at VoxForge is a conservative one, and thus we try to avoid using copyrighted works.

There are other options:

  • create your own texts and assign them to the public domain (this would make sense for the creation of new prompts), or

  • go to the Project Gutenberg site and use some of the out-of-copyright texts they have on their Italian page.

Having said all this, if you don't plan to distribute your texts, then there is not much a Copyright holder can do to stop you.  

>it seems easier to train sphinx model than htk one,

I don't really know ... Sphinx has come a long way since I last looked at Acoustic Model creation with them.

Regardless, since you will have the source speech audio, you will always have the option of creating HTK acoustic models at a later date.

hope this helps, 

Ken 





Re: Italian Language
User: nsh
Date: 7/30/2007 5:57 pm
Views: 522
Rating: 35

Well, I created a simple Italian structure for sphinxtrain. You can download it from:

http://nshmyrev.narod.ru/temp/voxforge_it_sphinx.tar.gz 

The remaining work is straightforward:

1. Put more wave files in the wav subdirectory

2. Update the fileids

3. Update the transcription

4. Update the dictionary if required

5. Run ./scripts.pl/make_feats -ctl etc/voxforge_it_train.fileids

6. Run ./scripts.pl/RunAll.pl 
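Steps 2 to 4 are easy to get out of sync, so here is a minimal sketch of a helper for them. It assumes the usual SphinxTrain conventions (one utterance ID per line in the fileids file, and transcription lines of the form `<s> text </s> (utterance_id)`), a flat wav directory, and a lowercase dictionary; check these against the files in the archive before relying on it. The function names and the sample prompts are made up for the example.

```python
from pathlib import PurePosixPath

def make_fileids(wav_paths):
    """Assumed fileids format: the wav path without its extension, one per line."""
    return sorted(str(PurePosixPath(p).with_suffix("")) for p in wav_paths)

def make_transcription(prompts):
    """Assumed transcription format: <s> text </s> (utterance_id)."""
    return [
        f"<s> {text.lower()} </s> ({utt_id})"
        for utt_id, text in sorted(prompts.items())
    ]

def missing_words(prompts, dictionary_words):
    """Words used in the prompts that have no entry in the dictionary."""
    used = {w for text in prompts.values() for w in text.lower().split()}
    return sorted(used - set(dictionary_words))

# Hypothetical data standing in for the wav subdir, prompts and dictionary.
wavs = ["it_0002.wav", "it_0001.wav"]
prompts = {"it_0001": "buongiorno a tutti", "it_0002": "il gatto dorme"}
dictionary = {"buongiorno", "a", "tutti", "il", "dorme"}

print("\n".join(make_fileids(wavs)))
print("\n".join(make_transcription(prompts)))
print(missing_words(prompts, dictionary))  # words to add in step 4
```

Any word reported by `missing_words` would need a pronunciation added to the dictionary before training.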

Re: Italian Language
User: Manuel
Date: 8/1/2007 10:03 am
Views: 450
Rating: 33

Hi, I'm trying to install Sphinx 3 to use your Italian acoustic model, but when I start the Perl script decode/slave.pl, it gives me an error because it can't find script_pl/util/utils.pl

This file isn't in my directory. Can you help me?

Tks

Manuel 

Re: Italian Language
User: nsh
Date: 8/1/2007 11:11 am
Views: 373
Rating: 30

Hm, what version are you using exactly? I prefer the latest one from the nightly build or from an svn checkout.

Where did you get that slave.pl? It's not part of the archive. Actually, model testing is not so easy right now, partly because of insufficient data and partly because there is no language model yet, since we have no text collection.

It's possible to test the acoustic model with a finite state grammar, but it's much better to build a language model first.

 

Re: Italian Language
User: nsh
Date: 8/1/2007 11:58 am
Views: 413
Rating: 29

Btw, I've found a site where you can download texts:

  http://www.liberliber.it/biblioteca/

Re: Italian Language
User: nsh
Date: 8/2/2007 4:09 pm
Views: 373
Rating: 31

Well, I've updated the scripts with a language model and updated the prompts for the full phoneset. Now you can test recognition; it even has non-zero WER, but of course quality is _very_ poor.
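For readers unfamiliar with WER (word error rate): it is normally computed as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal standalone version of that calculation, not the scoring code shipped with Sphinx; the function name and the sample sentences are made up for the example.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)

print(word_error_rate("il gatto dorme", "il cane dorme"))  # one substitution in three words
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words.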

New files are available at:

 http://www.gigasize.com/get.php/-1099746482/voxforge_it_sphinx.tar.gz

It probably makes sense to commit them to VoxForge and start updating the audio. We can create a similar bootstrap structure for Spanish, French, Welsh, Dutch, Polish and Czech, btw.

Re: Italian Language
User: kmaclean
Date: 8/3/2007 4:22 pm
Views: 431
Rating: 30

hi nsh,

thanks, this is great!

I will look at how we should set up the infrastructure for Italian and the other languages when I get back from vacation (around Aug 11).

Ken

Re: Italian Language
User: kmaclean
Date: 8/15/2007 8:10 pm
Views: 576
Rating: 28

Hi nsh,

I've created a dev site for Italian at:

http://www.dev.voxforge.org/projects/it

http://www.dev.voxforge.org/svn/it

I've also uploaded the Sphinx Acoustic Model creation scripts to the svn trunk for this repository.

If you create the same for the other languages you mentioned (Spanish, French, Welsh, Polish and Czech), I'll create repositories for them. Dutch and English already have repositories, but whenever you have a chance, Sphinx AM creation scripts for those too would be greatly appreciated.

thanks, 

Ken 

 

 

Re: Italian Language
User: Manuel
Date: 9/9/2007 8:39 am
Views: 386
Rating: 26

I haven't found any phoneme list.

I downloaded Festival, but where can I find the list of phonemes for the Italian language?

Tks

Manuel

 

Re: Italian Language
User: kmaclean
Date: 9/9/2007 3:44 pm
Views: 534
Rating: 31

Hi Manuel,

The phone list used in Festival is here.

The Italian dictionary is here.

Ken 
