Re: Your goals: HTK, Sphinx, Julius. My goals: PLS, SSML

Hi Ralph,

>I will take a look into the Introduction and Overview of W3C Speech Interface Framework

Good link, I didn't see that particular one...

>I am looking for "a format for describing transcribed audio."

I am not sure there is any such format on the W3C site for this.  The LDC might have something.  With XML, one could be created fairly easily.  But in VoxForge's case, quite a few scripting changes would be required on the acoustic model creation backend to implement such a thing.
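
To give a rough idea of how little is needed on the XML side, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree.  The element and attribute names (transcription, segment, audio, speaker, start, end) and the file name are made up for illustration; they are not a W3C, LDC, or VoxForge format.

```python
import xml.etree.ElementTree as ET


def build_transcript(audio_file, speaker, segments):
    """Build a hypothetical XML description of one transcribed audio file.

    segments is a list of (start_seconds, end_seconds, text) tuples.
    All element and attribute names here are illustrative assumptions.
    """
    root = ET.Element("transcription", {"audio": audio_file, "speaker": speaker})
    for start, end, text in segments:
        # One <segment> per utterance, with times formatted to 2 decimals.
        seg = ET.SubElement(
            root, "segment", {"start": f"{start:.2f}", "end": f"{end:.2f}"}
        )
        seg.text = text
    return root


# Hypothetical example: one prompt recorded in one WAV file.
root = build_transcript("sample-0001.wav", "anonymous", [(0.0, 2.5, "hello world")])
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Writing the file is the easy part; the real work would be teaching the backend scripts to read it instead of the prompt files they use today.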

>  Perhaps it is VoiceXML, I am not sure at the moment, I will read about it.

VoiceXML is a language to describe spoken dialogs... think of spoken interactive voice response (IVR) systems in a telephone environment (which is what VoiceXML was originally designed for).  For example, when I call my ISP, I used to use keypad sequences to get routed to the help desk.  Now I call their number, and just say "Internet technical support" on my phone, and get routed to the help desk queue. 

A VoiceXML browser "abstracts" away all the differences between the different implementations of the underlying components (such as the speech recognition, text to speech, and telephony engines) and lets you describe a call flow using a standard language (VoiceXML).
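
As a rough illustration of what describing a call flow looks like, here is a small sketch: a single VoiceXML 2.0 form that asks the caller for a department, held in a Python string and parsed with the standard library to check that it is well-formed.  The form id, field name, prompt wording, and grammar file name (departments.grxml) are hypothetical; the point is only that the document never names a particular recognizer, TTS engine, or telephony card.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# A minimal call flow: prompt the caller, listen against a grammar,
# then confirm what was recognized. Names below are illustrative only.
CALL_FLOW = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="route_call">
    <field name="department">
      <prompt>Which department would you like?</prompt>
      <!-- hypothetical grammar file listing the phrases a caller may say -->
      <grammar src="departments.grxml" type="application/srgs+xml"/>
      <filled>
        <prompt>Routing you to <value expr="department"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""

# Parsing only checks well-formedness; it does not run the dialog.
root = ET.fromstring(CALL_FLOW)
fields = [f.get("name") for f in root.iter(f"{{{VXML_NS}}}field")]
print(fields)
```

Actually running this dialog still needs a VoiceXML browser with working recognition, TTS, and telephony behind it.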

There have been a few open source implementations of VoiceXML that implemented the text to speech and the telephony components.  But most attempts to implement the speech recognition portion failed, because it is very difficult to do.  JVoiceXML is amazing since they got the speech rec component working (though I have not tried it out myself).  I think using JSAPI was an excellent way to avoid having to work out the details of a particular speech rec or TTS engine, but I am not sure where Sun's JSAPI licensing currently stands.

>I solved one problem, and then the next problem occurred.  I stopped trying.  Maybe I will try it again.

Don't give up yet, if that is what you are interested in.  It takes some effort.  A bit of understanding of a scripting language is also very helpful.  

Ken