
Google Summer of Code Ideas Page

morphologically rich pronunciation dictionaries
User: timobaumann
Date: 3/4/2008 4:42 pm
Views: 3458
Rating: 35

Current pronunciation dictionaries are simple lists of words and their pronunciations. While this works well for morphologically simple languages such as English, it is unnatural, inefficient and unmaintainable for morphologically richer languages such as the Romance languages, German and the Slavic languages, and even more so for agglutinative languages such as Turkish and the Finno-Ugric languages.

The purpose of this project is to extend the W3C Pronunciation Lexicon Specification so that it allows for different parts of speech, defines their default derivations, and also supports irregular forms where these occur for specific word types.

Solutions should contain an application to define the allowed parts of speech and their behaviour for different languages, and a second application, ideally a flexible one, that can be used to build pronunciation resources for these languages.
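
To make the idea a bit more concrete, here is a minimal sketch of what "default derivations plus irregular forms" could look like as a data structure. Everything in it is an assumption for illustration only: the names `DEFAULT_DERIVATIONS`, `Lexeme` and `forms` are hypothetical, the suffix table is toy data rather than a real description of German, the pronunciations are rough SAMPA-like strings, and nothing here is part of the W3C Pronunciation Lexicon Specification.

```python
from dataclasses import dataclass, field

# Hypothetical default derivations per part of speech. Each form name maps to
# a (written suffix, pronounced suffix) pair. Toy data, not real German grammar.
DEFAULT_DERIVATIONS = {
    "noun": {
        "singular": ("", ""),
        "plural": ("en", "@n"),
    },
}


@dataclass
class Lexeme:
    stem: str      # written stem
    phonemes: str  # pronunciation of the stem (rough SAMPA-like notation)
    pos: str       # part of speech, selects the default derivations
    # Irregular forms: form name -> (written form, pronunciation)
    irregular: dict = field(default_factory=dict)

    def forms(self):
        """Expand the entry into all forms; irregular forms override defaults."""
        result = {}
        for name, (g_suffix, p_suffix) in DEFAULT_DERIVATIONS[self.pos].items():
            result[name] = (self.stem + g_suffix, self.phonemes + p_suffix)
        result.update(self.irregular)
        return result


# A regular noun needs only its stem; an irregular one lists its exceptions.
frau = Lexeme("Frau", "fraU", "noun")
mann = Lexeme("Mann", "man", "noun", irregular={"plural": ("Männer", "mEn6")})

for entry in (frau, mann):
    print(entry.stem, entry.forms())
```

The point of the sketch is only that a regular word needs a single entry, while the per-language rules live in one place and exceptions are stored next to the word they belong to. An actual solution would express this in an extended lexicon format rather than in Python.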

Re: morphologically rich pronunciation dictionaries
User: a_tom
Date: 3/5/2008 6:41 am
Views: 222
Rating: 32

Sir, I have a sound knowledge of C/C++, Java and Python. I would like to help with this project, but I'm new to open source. Would you please help me get started? I'm Abhineshwar from India.

my mail is

Re: morphologically rich pronunciation dictionaries
User: timobaumann
Date: 3/6/2008 5:51 am
Views: 1238
Rating: 28

Hi Abhineshwar,

Your programming skills definitely sound promising, and I don't think it would be a problem that you've never worked on open source before.

At the same time, the task is not limited to programming; it is equally concerned with system design. So it would be good if you had some knowledge of XML and linguistics. I suppose your mother tongue is not English? What is your mother tongue? Does it have a complex morphology? Do you know other foreign languages? All of these are major factors in figuring out whether you'd be the right person for this particular task, or whether we can find some other task you'd be interested in.