VoxForge Downloads

QuickStart download

This QuickStart download is designed to show how VoxForge Acoustic Models are used with Open Source Speech Recognition Engines. We start with a download that uses the Julius Speech Recognition Engine.

These downloads contain everything you need to get Julius working:

  • Julius Speech Recognition Engine executables;
  • VoxForge Acoustic Model files;
  • VoxForge sample Grammar files; and
  • Cygwin executables (Windows only).

Click the link for your operating system and follow the instructions in the enclosed README file. You can also try one of the Nightly Builds for the most up-to-date version of the Acoustic Models.

Notes:

1. The Acoustic Models currently included with the QuickStart Downloads are still at the alpha stage, so recognition quality is limited. Much more GPL transcribed speech audio is needed to create decent-quality Acoustic Models, so please take the time to submit some transcribed speech to VoxForge. Go to the Read page for instructions on how to do this.

2.  A "Speech Recognition Engine" (like Julius) is only one component of a Speech Command and Control System (where you can speak a command and the computer does something).  You still need a Dialog Manager to understand what to do with the recognition results from the speech recognition engine (i.e. to take the words recognized by the Speech Recognition Engine, and make the computer do something useful).  VoxForge has not yet created a dialog manager for use with Julius. 
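To make the role of a Dialog Manager concrete, here is a minimal sketch in Python. It is an illustration only, not a VoxForge or Julius component. It assumes the engine prints each final result on a line of the form "sentence1: <s> CALL STEVE </s>"; the phrases and the actions in the command table are hypothetical.

```python
import re

# Assumed format of a final recognition result line,
# e.g. "sentence1: <s> CALL STEVE </s>". Other output lines are ignored.
SENTENCE_RE = re.compile(r"^sentence\d+:\s*(.*)$")


def parse_sentence(line):
    """Return the recognized words from one output line, or None if it is not a result line."""
    match = SENTENCE_RE.match(line.strip())
    if match is None:
        return None
    # Drop the sentence-boundary markers and keep only the words.
    return [word for word in match.group(1).split() if word not in ("<s>", "</s>")]


# Hypothetical command table: recognized phrase -> action to perform.
# Each action here just returns a description; a real one would do the work.
COMMANDS = {
    ("CALL", "STEVE"): lambda: "dialing Steve",
    ("HANG", "UP"): lambda: "hanging up",
}


def dispatch(words, commands=COMMANDS):
    """Run the action for the recognized phrase; return its result, or None if unknown."""
    action = commands.get(tuple(words))
    if action is None:
        return None
    return action()


def handle_line(line, commands=COMMANDS):
    """Parse one line of engine output and act on it if it is a known command."""
    words = parse_sentence(line)
    if words is None:
        return None
    return dispatch(words, commands)
```

In use, each line of the engine's output would be passed to handle_line. An exact-match table like this is the simplest possible design; a fuller Dialog Manager would also track conversation state and handle misrecognitions.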

Comments

By oxydenz - 12/26/2013 Hi all,

By Anu - 10/18/2013 When I run the command

By Anu - 10/17/2013 I am new to sphinxtrain.... When I try to run the command

By Anu - 10/17/2013 I am new to sphinxtrain. I am trying to develop the acoustic model for isolated words

By ANJU - 10/16/2013 HI

By hstech - 8/4/2013 - 2 Replies Hello! I have a question: how exactly did you produce the files in the http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/ directory? I know that they are somehow produced from the "Original" section of the downloads, because the files have the same names as the files in the "Original" section and the content sounds exactly the same. I immediately identified that the files are downsampled, but I could not figure out how they were downsampled (it appears that some elaborate algorithm was used). I am trying to build an automatic downsampling tester: the script takes the original data, downsamples it using sox to the sample rate I specify, and then trains the model on the downsampled data. I wanted to use the data from http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/ as test data of sorts (tell the script that the target sample rate is 8/16 kHz, let it generate the data, and then compare the generated data to these files), but I realized that I am unable to reproduce the process used to produce these files. I suspected that you used "sox" to derive these files from their originals, so I installed sox 14.4.1 and tried "sox b0284.wav test.wav rate 16000", but the resulting file differs far too much from yours. I thought dithering was the cause, but after some experimenting I realized that dithering introduces only very subtle differences, which are not consistent with the differences between my file and yours. The sox documentation revealed that the sox resampling algorithm is "very configurable": there are about 2 options which accept over 30 different settings, and the algorithm responds so differently to changes in these options that there is no way to search intelligently for the "correct" values.
So I need to know whether you are using "sox", what version you are using, and what options you use when downsampling the files.

By Josberto - 3/3/2013 - 1 Replies How can I do training for the Esperanto language, please?

By Melba - 1/28/2013 Hi.... I want to use the MAP technique for speaker adaptation. I don't know what configuration file I need to use. I tried to use the HERest command with the -u p option. Please help...

By volkerbradley - 1/2/2013 - 7 Replies Just tried to use the Julian QuickStart on my 64-bit Ubuntu Linux 12.04 system.

By Asim12 - 12/25/2012 - 2 Replies Hi

By isha - 12/24/2012 - 6 Replies I am using Windows 7 and wish to run HTK 3.4.1 on my system, but while running demo

By Nasir - 9/7/2012 - 1 Replies Has anyone done any work on the server side of this? I am looking to develop a speech recognition system and am evaluating various tools.