Dynamic phoneme generation
User: SpecialA10
Date: 8/24/2009 2:51 pm
Views: 6024
Rating: 2

Hi,

In an application I am building, I need to dynamically generate grammars (for Julius/Julian) from a set of words that is only known at runtime. There is a good chance that several of these words won't be in any existing dictionary, so I will need to create their phoneme representations on the fly. Are there any tools that can do this and that work well with the VoxForge AM and Julius? I found this online tool, which seems to be exactly what I'm looking for, but I need an offline version:

http://www.speech.cs.cmu.edu/tools/lextool.html

I just started with ASR/VoxForge/Julius yesterday, so if I am missing a key concept, do tell me!

Thanks!

-Avery

--- (Edited on 8/24/2009 2:51 pm [GMT-0500] by SpecialA10) ---

Re: Dynamic phoneme generation
User: ralfherzog
Date: 8/25/2009 2:44 pm
Views: 114
Rating: 1

Hi!

"these words won't be in any existing dictionary"

Maybe Sequitur G2P is something for you. It makes use of the expectation-maximization (EM) algorithm (= key concept).

By the way, thanks for the link to the lexicon tool. I think that I can use this tool.

I would like to use Sequitur G2P for my own purposes as well, because it is probably a good piece of software for phoneme generation.

--- (Edited on 2009-08-25 2:44 pm [GMT-0500] by ralfherzog) ---

Re: Dynamic phoneme generation
User: paradocs
Date: 8/25/2009 9:47 pm
Views: 1944
Rating: 4

Greetings,

In addition to using Sequitur G2P,
text-to-speech engines can
give a guess at the phonemes.
Festival is best for English, while eSpeak
may be of more help in other languages.

espeak -x Sequitur
 s'EkwItS3

festival
festival> (lex.lookup "Sequitur")
("sequitur" nil (((s eh k) 1) ((w ih t) 0) ((er) 0)))

For the CMU-40 reduced phoneme set,
change the ax phoneme to ah.
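
As a rough sketch of that conversion (not the actual script linked in this thread), a small shell function can strip the Festival list syntax and stress digits, uppercase the phones, and map ax to ah. The function name festival_to_cmu is made up for illustration, and it assumes the lex.lookup result has the shape shown above, with a single-token part-of-speech field such as nil:

```shell
# Hypothetical helper: convert one Festival lex.lookup result line
# into a space-separated, uppercase phone string with AX mapped to AH.
festival_to_cmu() {
    # Step 1: drop the leading ("word" pos part of the entry.
    # Step 2: delete parentheses, leaving phones and stress digits.
    # Step 3: put one token per line and discard digits and empty lines.
    # Step 4: uppercase, replace a whole-token AX with AH, rejoin with spaces.
    printf '%s\n' "$1" \
        | sed -e 's/^("[^"]*" [^ ]* //' \
        | tr -d '()' \
        | tr -s ' ' '\n' \
        | grep -v '^[0-9]*$' \
        | tr 'a-z' 'A-Z' \
        | sed -e 's/^AX$/AH/' \
        | paste -sd ' ' -
}

# Example with the entry from this post:
festival_to_cmu '("sequitur" nil (((s eh k) 1) ((w ih t) 0) ((er) 0)))'
```

For the entry above this prints S EH K W IH T ER. The output still deserves a check by hand, since the TTS lexicon is only guessing for unknown words.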

A script can automate this.
I use scripts to machine-generate the phonemes
and then edit them by hand:
http://g2p4j.sourceforge.net/

Of course, this is only a starting point
on the way to a dynamically generated grammar.
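
One possible next step, sketched here under assumptions, is to write the generated word/phoneme lines into a Julius .voca category. The sketch assumes input lines of the form "WORD  PH PH ..." and an acoustic model that expects lowercase phone names (an assumption; check your model's phone list). make_voca_category and the category name NAME are hypothetical:

```shell
# Hypothetical helper: read "WORD  PH PH ..." lines on stdin and print
# a Julius .voca category block ("% CATEGORY" followed by word entries).
make_voca_category() {
    category=$1
    # Category header line, e.g. "% NAME".
    printf '%% %s\n' "$category"
    # The first field is the word; the rest of the line is its phone string.
    while read -r word phones; do
        # Skip blank lines.
        [ -n "$word" ] || continue
        # Assumption: lowercase phones; remove the tr call if your model differs.
        printf '%s\t%s\n' "$word" "$(printf '%s' "$phones" | tr 'A-Z' 'a-z')"
    done
}

# Example: one machine-generated entry placed into a NAME category.
printf 'KWEEZLEBOTTER  K W IY Z AH L B AA T ER\n' | make_voca_category NAME
```

The resulting .voca still needs a matching .grammar file and a compile step (mkdfa.pl in the Julius distribution) before Julius can load it.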

Best Wishes
paradocs

--- (Edited on 8/25/2009 9:47 pm [GMT-0500] by paradocs) ---

For words not in a dictionary, this bash script works with Festival and gives CMU-40 format output.

wget http://g2p4j.sourceforge.net/sounditout

chmod +x sounditout

use:

sounditout kweezlebotter

KWEEZLEBOTTER  K W IY Z AH L B AA T ER

--- (Edited on 8/28/2009 4:23 am [GMT-0500] by paradocs) ---
