German

Txt2Pho
User: kmaclean
Date: 4/16/2008 9:33 pm
Views: 24370
Rating: 30

This might be useful for the creation of a German pronunciation dictionary:

TXT2PHO - a TTS front end for the German inventories of the MBROLA project.

However, the software has some restrictive licensing provisions:

Permission is granted to use this software for non-commercial, non-military purposes, with and only with the lexicon and prosody files made available by the author from the HADIFIX for MBROLA project ...

Not sure if that would apply to pronunciations generated with the toolkit. 

Ken 

Re: Txt2Pho
User: timobaumann
Date: 4/18/2008 10:03 am
Views: 3076
Rating: 30

I don't think we can use it.
Using TXT2PHO to create a dictionary comes close to reading the dictionary it uses (BOMP) directly. Both BOMP and TXT2PHO itself clearly state that they are restricted to non-military use, and the GPL, unfortunately, carries no such restriction.

Anyway, if we could use it, then we could just as well use BOMP directly.

I've had a first look at Sequitur G2P (which is a trainable g2p-tool) and it's likely that I will be allowed to use another trainable g2p-tool (without name, published in [1]). Thus, I will be able to compare the two and see which performs better. 

So, we need some data to bootstrap these trainable systems. I just checked in some tools that extract pronunciations from the German Wiktionary.
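To give an idea of what such an extraction step involves, here is a minimal sketch. It is not the checked-in tooling; it assumes that a page's wikitext marks pronunciations with the `{{Lautschrift|...}}` template, and the function name `extract_pronunciations` and the sample line are made up for illustration.

```python
import re

# Assumption: pronunciations appear as {{Lautschrift|<IPA>}} in the wikitext.
LAUTSCHRIFT = re.compile(r"\{\{Lautschrift\|([^}|]+)\}\}")


def extract_pronunciations(wikitext):
    """Return every distinct, non-empty IPA string found in the wikitext."""
    found = []
    for match in LAUTSCHRIFT.finditer(wikitext):
        ipa = match.group(1).strip()
        # Skip unfilled placeholders ("…") and repeated transcriptions.
        if ipa and ipa != "…" and ipa not in found:
            found.append(ipa)
    return found


# Hypothetical sample line, modelled on an entry for "Vater".
sample = ":{{IPA}} {{Lautschrift|ˈfaːtɐ}}, {{Pl.}} {{Lautschrift|ˈfɛːtɐ}}"
print(extract_pronunciations(sample))
```

Anything extracted this way still needs the manual review described below, since the transcriptions themselves may be wrong.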

The resulting data has to be post-processed before we can use it for bootstrapping. To prioritize that work, we could use the word frequency information from the Wortschatz project, for which a Perl module (EDIT: newer version with fixed frequency extraction) is available.
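The prioritization idea can be sketched in a few lines: review the most frequent words first. The counts below are invented for illustration and are not real Wortschatz data; the function name `prioritize` is likewise hypothetical.

```python
def prioritize(words, frequency):
    """Sort words so the most frequent come first; unknown words go last.

    Ties are broken alphabetically so the order is deterministic.
    """
    return sorted(words, key=lambda w: (-frequency.get(w, 0), w))


# Hypothetical counts, for illustration only.
frequency = {"ist": 900, "Vater": 120, "festesten": 2}

todo = ["festesten", "Vater", "Zwiebelkuchen", "ist"]
print(prioritize(todo, frequency))
# ['ist', 'Vater', 'festesten', 'Zwiebelkuchen']
```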

I hope to be able to set up a web tool that helps to post-process the Wiktionary output. Would anyone volunteer to actually use that web tool and help in creating the dictionary? Ralf, would you be willing (and able) to help?

Cheers!
Timo

 

[1]: Phonological Constraints and Morphological Preprocessing for Grapheme-to-phoneme Conversion
Vera Demberg, Helmut Schmid and Gregor Möhler, 2007
In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL-07), Prague, Czech Republic, June 2007

Re: Txt2Pho
User: kmaclean
Date: 4/18/2008 11:25 am
Views: 297
Rating: 27

Hi Timo,

Good work!

thanks,

Ken 

help to create the German dictionary
User: ralfherzog
Date: 4/20/2008 12:03 am
Views: 452
Rating: 29

Hello Timo,

I would use this web tool and help you with the creation of the dictionary.

Greetings, Ralf

Re: help to create the German dictionary
User: Visitor
Date: 5/14/2008 4:12 pm
Views: 253
Rating: 26

Hi Ralf,

sorry for not getting back to you any earlier.

I've set up a dictionary tool at http://www.ling.uni-potsdam.de/~timo/projekte/voxforge.html . The main task is to paste the entries in the first row on the right ("Aussprachen", i.e. pronunciations) into the corresponding field on the left.

Now, if it was just that, it would be too easy and too boring...

Often, there are far more variants of the word on the left than there are transcriptions. In these cases it would be nice if you could add the missing transcriptions (often it is just a matter of appending -? or -? or whatever).

Sometimes the list on the left contains ridiculous word forms. Just leave the corresponding field empty (or press "Wort entfernen" (remove word); the result will be the same). It may also happen that you are asked for the same word more than once (there are different entries for "bin", "ist", "sind" in the Wiktionary, and each entry will ask about all the different forms of "sein"). If you are sure you've entered a transcription already, then just ignore it the second time.

Sometimes there are actually more transcribed word forms than words on the left. (Or they are different.) Then you can add a word form on the left with "Wort hinzufügen" (add word). Note: Often there are different transcriptions for the same word form (?v?ltn?, ?v?lt?n). Usually you would want to pick the form that is used most in colloquial speech (here: v?ltn?).

Also, there may just be erroneous transcriptions (quite often), where people just guessed how IPA works. It's important that we catch most of these errors. So you might actually want to start out with the Wiktionary Transcription Guideline, which shows how the transcription *should* be.

To enter IPA symbols into the text fields directly, just type the keys listed on the right (for ? type N) and they will automagically be transformed to IPA. (This works in Firefox; I don't have Windows, so I can't check Internet Explorer.)

Please enter your e-mail address or another kind of ID in the first text field. This way we can later compare who's the hardest-working transcriber!

Cheers, Timo

Re: help to create the German dictionary
User: timobaumann
Date: 5/14/2008 4:14 pm
Views: 477
Rating: 25

clickable link: http://www.ling.uni-potsdam.de/~timo/projekte/voxforge.html

UPDATE: It's important that you transcribe how something would be spoken in colloquial standard German. By the way, what region of Germany are you from? ;-)

German dictionary acquisition project
User: ralfherzog
Date: 5/28/2008 2:09 am
Views: 307
Rating: 19

Hello Timo,

Congratulations on the great work that you have done.  I have just submitted my first entry for the dictionary acquisition project.  It was the German word "Vater."

I would say that I speak standard German.  I live in Bonn, but I don't speak "Bönnsch" or "Kölsch."

I am often online on the IRC channel #cmusphinx.  Feel free to join.

Greetings, Ralf

retroflex nasal; dictionary acquisition project crashes
User: ralfherzog
Date: 5/30/2008 5:08 pm
Views: 365
Rating: 19

Hello Timo,

Here are three additional remarks.

1. In the transcription key of the dictionary acquisition project, the last entry is "= n̩".  What is the meaning of this sign?  I assume that it indicates the retroflex nasal (IPA number 117; Unicode: U+006E (n), U+0329).  But do we need this sign in the German language? I can't find it in your proposal for the German phone set.
In the entry for the German word "festesten" (superlative of "fest"), the pronunciation is given as "'fɛstəstn̩".  Are you sure that we need the retroflex nasal? I would prefer "fɛstəstn".

2. Sometimes the dictionary acquisition project crashes.  This happened yesterday and today on my computer (Win XP, Firefox; both current versions), so I had to restart the Firefox browser.  Apparently, the dictionary acquisition project doesn't like it when I enter signs that are not allowed.

3.  Do you plan to release the results coming from the dictionary acquisition project under the Pronunciation Lexicon Specification?
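On remark 1, the code points quoted there can be looked up in the Unicode database. A minimal sketch using Python's standard `unicodedata` module follows; the helper name `describe` is made up for illustration. In IPA, the combining vertical line below marks a syllabic consonant, whereas the retroflex nasal is a separate letter (U+0273).

```python
import unicodedata


def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The two-character sequence from the transcription key: n + U+0329.
for code, name in describe("n\u0329"):
    print(code, name)

# The single-character retroflex nasal, for comparison.
for code, name in describe("\u0273"):
    print(code, name)
```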

Greetings, Ralf

Re: retroflex nasal; dictionary acquisition project crashes
User: kmaclean
Date: 5/31/2008 12:47 pm
Views: 432
Rating: 73

Hi Ralf,

> 3.  Do you plan to release the results coming from the dictionary acquisition project under the Pronunciation Lexicon Specification?

You certainly are persistent with respect to PLS :)

Ken 

Your goals: HTK, Sphinx, Julius. My goals: PLS, SSML
User: ralfherzog
Date: 5/31/2008 10:56 pm
Views: 317
Rating: 22

Hello Ken,

Persistent - that is true.  But I do know that you have different priorities.  XML-related standards are nearly everywhere.  For example, OpenOffice.org uses the OpenDocument format, which is based on XML.  And I am convinced that in the long term XML-related standards are the way to go.  Maybe not today, but in a few years.  At least, I don't have a better idea at the moment.

The success of the Internet is based on standards that are coming from the World Wide Web Consortium.  The "application" Internet is a very complicated task.  Speech recognition is a very complicated task, too.  So why not follow the path of success?  The whole world is about standards.  Standards, standards, standards.  Standards are everywhere.  And the W3C provides us with standards.  So, the result is that I don't have to care about difficult applications like HTK.  I just have to care about the W3C-standards.

You did the right thing to collect speech under the GPL.  But what are our next targets?  I would like to submit many more speech samples (prompts) in English and German using SSML, even if there isn't any demand at the moment.

Maybe HTK, Sphinx, Julius do have their own standards.  But where should I start? Until now, I haven't found the time to get involved in the details of those tools.  I started to read the HTK book.  It is really not easy.  I do understand the concept of XML.  But HTK etc. are really very complicated.  So why go the complicated way, if there is an easy one?  The alternative would be for someone to release screencasts that explain HTK or Sphinx.  If it were easier to understand HTK or Sphinx, maybe I would think differently.

I would like to see some results.  And one goal could be to develop a German pronunciation lexicon (PLS, GPL; possibly IPA).  This goal is achievable (thanks to Timo) and understandable.  I understand the concept of the GPL.  And I understand the value of XML.  But difficult tools like Sphinx are something for specialists like nsh.
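To make the PLS idea concrete, here is a minimal sketch of how a word list with IPA transcriptions could be written out as a PLS 1.0 lexicon using Python's standard library. The function name `build_lexicon` and the single sample entry are illustrative assumptions, not output of the dictionary acquisition project.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C Pronunciation Lexicon Specification 1.0.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_lexicon(entries, lang="de"):
    """Serialize (word, ipa) pairs as a PLS <lexicon> document string."""
    ET.register_namespace("", PLS_NS)  # emit PLS as the default namespace
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for word, ipa in entries:
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = word
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return ET.tostring(root, encoding="unicode")


# Illustrative entry only.
print(build_lexicon([("Vater", "ˈfaːtɐ")]))
```

Each `lexeme` pairs a written form (`grapheme`) with a pronunciation (`phoneme`), so converting a flat word/transcription list into this format is mostly mechanical.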

So I would suggest that you follow your goals (HTK, Sphinx, Julius).  And I follow mine (PLS, SSML).

Greetings, Ralf