Speech Recognition Engines

Which one (Sphinx, ISIP, Julius, HTK)?
User: shengchieh
Date: 9/8/2009 1:44 pm
Views: 11775
Rating: 2

I'm thinking of trying VoxForge on this dinosaur desktop (I use #! [CrunchBang], an Ubuntu-based derivative). I'm looking at various speech engines, i.e., Sphinx, ISIP, Julius, and HTK. I'm interested in using voice recognition as a user, not a researcher. In other words, Dragon NaturallySpeaking would be ideal, except that it is Windows software. Which speech engine do you recommend (easiest to learn), or are they all research code? Are there other Linux-based voice recognition programs I should be looking at?

Thanks in advance for any replies.

Sheng-Chieh

--- (Edited on 9/8/2009 1:44 pm [GMT-0500] by shengchieh) ---

Which one (Sphinx, ISIP, Julius, HTK)? Answer: simon
User: ralfherzog
Date: 9/8/2009 7:26 pm
Views: 104
Rating: 1

Hi! I suggest that you take a look at the simon handbook (PDF). After you have done that, you can download simon. It is working on my Ubuntu machine, so it is worth a try on your Ubuntu derivative.

In short: simon is something like a GUI for HTK and Julius.

You should install HTK on your computer, too.

After you have installed HTK and simon, you can import an English dictionary.

Regards,

Ralf

Question to the Voxforge team: Has anyone of you tried sam? It is available only via svn. I think that sam could help us with the development of acoustic models.

--- (Edited on 2009-09-08 7:31 pm [GMT-0500] by ralfherzog) ---

Re: Which one (Sphinx, ISIP, Julius, HTK)? Answer: simon
User: kmaclean
Date: 9/9/2009 8:43 pm
Views: 117
Rating: 1

Hi Ralf,

>... Has anyone of you tried sam? [...] I think that sam could help us with the development of acoustic models.

Looks very interesting from an acoustic model testing perspective... thanks for letting us know about it,

Ken

--- (Edited on 9/9/2009 9:43 pm [GMT-0400] by kmaclean) ---

Re: Which one (Sphinx, ISIP, Julius, HTK)? Answer: simon
User: Mariane
Date: 9/14/2009 2:47 pm
Views: 129
Rating: 2

It is still the same problem: it says in Simon's manual "continuous, free dictation is neither supported nor reasonable with current versions of simon".


So does anyone know of a speech-to-text which currently functions on spontaneous unconstrained speech? Because this could be used to generate examples which could then be used to train our open source systems, no? The samples would still have to be listened to, the errors corrected, and the files segmented, but IMHO it would go faster...


Please tell me. For example, how does Dragon compare with the Windows 7 speech-to-text?


Mariane

--- (Edited on 9/14/2009 2:47 pm [GMT-0500] by Mariane) ---

Re: Which one (Sphinx, ISIP, Julius, HTK)? Answer: simon
User: kmaclean
Date: 9/14/2009 3:12 pm
Views: 67
Rating: 2

Hi Mariane,

>So does anyone know of a speech-to-text which currently functions on spontaneous unconstrained speech?

See Arthur Chan's article: Do we have a true open source dictation machine?

>Because this could be used to generate examples which could then be used to train our open source systems, no?

I think you are referring to Text-to-Speech (as opposed to speech recognition in your first question).  A speech recognition engine does not necessarily include a text-to-speech engine.  There are many.

This question was discussed here: Humans are great, but why not use commercial (and OSS) text to speech engines too?

>how does Dragon compare with the Windows 7 speech-to-text?

Dunno - we deal with free and open source speech recognition here :)

Ken

--- (Edited on 9/14/2009 4:12 pm [GMT-0400] by kmaclean) ---

Re: Which one (Sphinx, ISIP, Julius, HTK)? Answer: simon
User: Mariane
Date: 9/14/2009 3:20 pm
Views: 5116
Rating: 2

Me too :), I was only thinking of bootstrapping the process using unfree software.

I explained what I mean in the thread called "how to get more voice samples", summer of code section.


Mariane

--- (Edited on 9/14/2009 3:20 pm [GMT-0500] by Mariane) ---

--- (Edited on 9/14/2009 3:22 pm [GMT-0500] by Mariane) ---
