Re: Your goals: HTK, Sphinx, Julius. My goals: PLS, SSML

Hi Ralph,

>And I am convinced that in the long-term XML related standards are the way
>to go.  Maybe not today, but in a few years.

I agree. 

>I would like to submit much more speech samples (prompts) in the English
>and in the German language employing the SSML, even if there isn't any
>demand at the moment.

Please note that SSML is only a markup language for directing what a text-to-speech engine says and how it says it.  As far as I know, it is not used as a format for describing transcribed audio submitted for the creation of acoustic models.
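To make that concrete, here is a rough sketch in Python of what a small SSML document looks like.  The function name build_ssml, the sentence text and the pause length are all made up for illustration; only the element names (speak, s, break, emphasis) come from the W3C SSML 1.0 spec.  Notice that everything in it is an instruction to a synthesizer, and nothing describes a recording:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML 1.0 recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(sentence, pause_ms=300, lang="en-US"):
    """Return a small SSML document as a string (illustrative helper only)."""
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": SSML_NS,
        "xml:lang": lang,
    })
    # <s> marks a sentence for the synthesizer to read out.
    s = ET.SubElement(speak, "s")
    s.text = sentence
    # <break> asks the synthesizer to pause; it says nothing about audio data.
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    # <emphasis> asks for the enclosed text to be stressed.
    emph = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emph.text = "Thank you."
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Please say the prompt after the tone."))
```

A prompts file for acoustic model training, by contrast, is normally just the plain transcription of each recorded audio file.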

>I started to read the HTK book.  It is really not easy.

The HTK book is a difficult read... I have only read the first few chapters, and now only use it as a reference - I don't have the math skills to understand all the formulas and how they interact.  But if you treat HTK as a "black box", and only focus on the minimum command set required to compile an acoustic model, then you can do quite a bit with trial and error - which essentially was my approach when I started out... :)

You might be interested in the W3C VoiceXML standard, which essentially merges subsets of the SSML, CCXML and SRGS (speech recognition grammar) specifications.  This doc: "Voice Browsers, Introduction" provides a good overview of how they are all meant to work together.
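As a rough illustration of how the pieces fit together, here is a small hypothetical VoiceXML document held in a Python string, with a helper that lists its elements.  The field name, the prompt wording and the word list are invented, and I have not run this against any particular voice browser; it is only meant to show an SSML-style element (break) inside a prompt and an SRGS-style grammar (rule, one-of, item) inside the same dialog:

```python
import xml.etree.ElementTree as ET

# Hypothetical VoiceXML dialog: field name and vocabulary are made up.
VXML_DOC = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_engine">
    <field name="engine">
      <prompt>
        Which recognizer do you use?
        <break time="300ms"/>
        Say H T K, Sphinx or Julius.
      </prompt>
      <grammar mode="voice" xml:lang="en-US" version="1.0" root="engine">
        <rule id="engine">
          <one-of>
            <item>h t k</item>
            <item>sphinx</item>
            <item>julius</item>
          </one-of>
        </rule>
      </grammar>
      <filled>
        <prompt>You said <value expr="engine"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>"""


def local_names(xml_text):
    """Return element names in document order, with the namespace stripped."""
    root = ET.fromstring(xml_text)
    return [el.tag.split("}")[-1] for el in root.iter()]


if __name__ == "__main__":
    for name in local_names(VXML_DOC):
        print(name)
```

The prompt part is what goes to the text-to-speech engine, and the grammar part tells the recognizer which words to listen for.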

The JVoiceXML project has implemented a working VoiceXML browser, which essentially provides a VoiceXML dialog manager front-end to Sphinx and Festival.  The project might also provide bindings to Asterisk (an IP PBX).  Note that JVoiceXML uses the JSAPI and JTAPI "standards" to accomplish this.