Speech Recognition Engines

what do I have to do?
User: DarkAdmiral
Date: 5/10/2008 6:37 am

I'm using the Julius 3.5.2 Quickstart. Everything seems to work. There is a line <<< please speak >>>, and I have to read the sentences aloud again and again.

How long do I have to speak? I want Julius to understand what I say, not what it wants me to say.

thank you,

Jan

--- (Edited on 5/10/2008 6:37 am [GMT-0500] by Visitor) ---

Re: what do I have to do?
User: kmaclean
Date: 5/10/2008 9:09 am

Hi Jan,

>I'm using julius 3.5.2 quickstart. Everything seems to work. There is a line
><<< please speak >>> and I have to read the sentences loudly again and
>again.

To confirm I understand what is going on: your microphone works (i.e. you can record your voice with an audio editor like Audacity), Julius is running, you are saying the words in the grammar, and Julius recognizes them and echoes them on the screen.

>How long to I have to speak? I want julius to understand what I say and not
>what he wants me to say.

I am not sure I understand what you mean here. Do you want (1) Julius to recognize words that are not in the grammar file, or (2) a speech command and control system for your computer?

(1) If the former, you need to update the grammar used by Julius.  The Grammar-notes file in the Quickstart explains:

The enclosed sample grammar files are for demonstration purposes only.  They allow the Julius speech recognition engine to recognize the following type of sentences (and many more):

* CALL STEVE YOUNG
* DIAL FIVE SEVEN EIGHT TWO
* DIAL THREE ZERO
* DIAL TWO TWO FOUR FOUR NINE ZERO SEVEN SEVEN
* PHONE JOE YOUNG JOHNSTON JOHNSTON JOHNSTON STEVE STEVE STEVE JOE
* CALL BOB JORDAN
* CALL STEVE
* CALL STEVE JOHNSTON
* PHONE STEVE
* DIAL OH OH FOUR FIVE SIX

Basically, the grammar is designed to recognize any combination of the numbers 1 through 9, ZERO and OH. You must precede numbers with the word 'DIAL' - as in "dial 1 2 3". It is also set up to recognize any combination of the following names: STEVE, YOUNG, BOB, JOHNSTON, JOHN, JORDAN, and JOE. You must precede the names with the words "PHONE" or "CALL" - as in "phone steve young" or "call johnston".
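
To make the description above concrete, here is a rough Python sketch of which sentences the sample grammar accepts. It is not the Julius grammar itself and does not use Julius at all; the function name `in_sample_grammar` is made up for illustration, the word lists are copied from the notes above, and requiring at least one word after the command is an assumption.

```python
# Word lists copied from the Grammar-notes description; everything else
# in this sketch (names, structure) is illustrative only.
DIGITS = {"ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN",
          "EIGHT", "NINE", "ZERO", "OH"}
NAMES = {"STEVE", "YOUNG", "BOB", "JOHNSTON", "JOHN", "JORDAN", "JOE"}


def in_sample_grammar(sentence: str) -> bool:
    """Return True if the sentence fits the pattern described in the notes."""
    words = sentence.upper().split()
    if len(words) < 2:
        # Assumption: a command word alone is not a complete sentence.
        return False
    command, rest = words[0], words[1:]
    if command == "DIAL":
        # DIAL must be followed only by digit words.
        return all(w in DIGITS for w in rest)
    if command in {"CALL", "PHONE"}:
        # CALL / PHONE must be followed only by names.
        return all(w in NAMES for w in rest)
    return False
```

For example, `in_sample_grammar("CALL STEVE YOUNG")` is accepted, while a sentence such as "OPEN BROWSER" is rejected because none of its words are in the sample grammar. Julius behaves similarly: it can only output sentences that the grammar allows.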

For help with Julian grammar syntax see
* the VoxForge tutorial step 1 at: http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial/data-prep/step-1; or
* the Julian grammar tutorial on the Julius web site at: http://julius.sourceforge.jp/en/grammar.html.

If you want to add more words to your grammar, you can only use the words (and phones) contained in the VoxForge Dictionary - see http://www.repository.voxforge1.org/downloads/builds to download the latest version. This was the dictionary used to train the acoustic model you are using. If you want to use words that are not included in this dictionary, you will likely need to recompile the acoustic model with audio that uses the words you want to add. The VoxForge how-to and tutorial can walk you through the steps required to do this; see our web site at: http://www.voxforge.org/home/acousticmodels.
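
Before editing the grammar, it can help to check whether the words you want are in the dictionary at all. The following Python sketch shows one way to do that. It assumes each dictionary line starts with the word, followed by whitespace and the rest of the entry; check this against the file you actually download. The function names and the two sample entries are made up for illustration and are not taken from the real VoxForge Dictionary.

```python
def load_dictionary_words(lines):
    """Collect the first whitespace-separated token of each non-blank line.

    Assumption: the word is the first field on every dictionary line.
    """
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0].upper())
    return words


def missing_words(grammar_words, dictionary_words):
    """Return the grammar words that the dictionary does not contain."""
    wanted = {w.upper() for w in grammar_words}
    return sorted(wanted - dictionary_words)


# Illustrative entries only; a real run would read the downloaded file,
# e.g. with open(path_to_dictionary) as f: load_dictionary_words(f)
sample_lines = [
    "CALL [CALL] k ao l",
    "",
    "STEVE [STEVE] s t iy v",
]
known = load_dictionary_words(sample_lines)
not_found = missing_words(["call", "steve", "browser"], known)
```

Any word that ends up in `not_found` is one the current acoustic model was probably not trained on, so it falls under the recompile case described above.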

(2) For the latter, as stated in the notes on the Quickstart download page, a "Speech Recognition Engine" is only one component of a Speech Command and Control System (where you can speak a command and the computer does something). You still need a Dialog Manager to decide what to do with the recognition results, i.e. to take the words recognized by the Speech Recognition Engine and make the computer do something useful. VoxForge has not yet created a dialog manager for use with Julius. The Simon project is working on this for Julius. PerlBox can be used with the Sphinx speech recognition engine.
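
To give a feel for what a dialog manager does, here is a very small Python sketch. It is not how Simon or PerlBox work, and it does not talk to Julius: it assumes the recognized sentence has already been obtained as a plain string. All function names are hypothetical, and the handlers only return a description of the action instead of performing it.

```python
def handle_call(args):
    # Placeholder action: a real system would start a call here.
    return "calling " + " ".join(args)


def handle_dial(args):
    # Placeholder action: a real system would dial the digits here.
    return "dialing " + " ".join(args)


# Map the first recognized word to the function that handles it.
HANDLERS = {
    "CALL": handle_call,
    "PHONE": handle_call,
    "DIAL": handle_dial,
}


def dispatch(recognized: str) -> str:
    """Turn one recognized sentence into an action description."""
    words = recognized.upper().split()
    if not words:
        return "nothing recognized"
    handler = HANDLERS.get(words[0])
    if handler is None:
        return "no action for: " + words[0]
    return handler(words[1:])
```

The recognition engine only produces the text, for instance "CALL STEVE YOUNG"; something like `dispatch` is the extra piece that decides what the computer should do with it.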

Ken 

--- (Edited on 5/10/2008 10:09 am [GMT-0400] by kmaclean) ---
