
Step 3 - Recording the data

Quiet Environment

Before you begin, make sure that the room you are recording in is as quiet as possible.  In addition, turn off your speakers while recording, to avoid acoustic feedback in your audio files.

See this FAQ entry for more information. 

Adjust your Microphone 

Next you need to adjust your microphone for optimal pickup of your voice. 

If you have a headset microphone, this should be easy to do.  Your microphone should be a bit to the side of and below your mouth (so the microphone won't pick up your breathing), and no more than a half inch (1-2 cm) away.

See this FAQ entry for more information.

Recording Levels

Now you need to test your recording levels. 

Start Audacity. 

Make sure your microphone volume in Audacity is set to 1.0. 

Then click Record (i.e. the red circle button) and begin speaking in your normal voice for a few seconds, and then click Stop (i.e. the yellow square button). 

Look at the Waveform Display for the audio track you just created.  The Vertical Ruler to the left of the Waveform Display provides you with a guide to your audio levels.  Try to keep your recording levels between 0.5 and -0.5, averaging around 0.3 to -0.3.  It is OK to have a few spikes go outside the 0.5 to -0.5 range, but avoid having any go beyond the 1.0 to -1.0 range, as this will generate distortion.  If necessary, adjust Audacity's microphone volume to keep your audio within the proper ranges.
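The ranges above can also be checked outside Audacity if you export a test recording as a 16-bit WAV file. The following is a minimal Python sketch, not part of the VoxForge tools: the function names peak_levels and describe_levels are hypothetical, and the 0.999 clipping threshold is an assumption chosen for illustration.

```python
import array
import sys
import wave


def peak_levels(path):
    """Return (lowest, highest) sample of a 16-bit PCM WAV file, scaled to -1.0..1.0."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0, 0.0
    return min(samples) / 32768.0, max(samples) / 32768.0


def describe_levels(low, high, clip_threshold=0.999):
    """Classify a recording against the ranges described in the text.

    clip_threshold is an assumed value: samples at or near full scale
    are treated as clipped.
    """
    peak = max(abs(low), abs(high))
    if peak >= clip_threshold:
        return "clipping: lower the microphone volume"
    if peak > 0.5:
        return "spikes beyond the 0.5 to -0.5 range: fine if there are only a few"
    return "within the 0.5 to -0.5 range"
```

For example, describe_levels(*peak_levels("test.wav")) returns a one-line summary for a hypothetical file named test.wav.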

 

Configuring Audacity Preferences

VoxForge collects speech audio at the highest Sample Rate that your Sound Card can support (up to a Sampling Rate of 48kHz, at 16 Bits Per Sample).  You'll need to look at your Sound Card's manual to determine the maximum it supports.  If you don't have your manual, this FAQ entry might help you get the required information (another FAQ entry provides some caveats with respect to your sound card and recording rates).

In Audacity, you set the Project Sampling Rate and Sample Format in your Preferences (under the 'File' menu selection).  Click the 'Quality' tab:

  • set your 'Default Sample Rate' by clicking the up/down arrows to change it to 48000 Hz;
  • set your 'Default Sample Format' to 16-bit.

Next click the 'Audio I/O' tab, and then:

  • set your 'Channels' to 1 (Mono).

Then click the 'File Formats' tab, and then:

  • set your 'Uncompressed Export Format' to WAV (Microsoft 16 bit PCM) or export your audio using FLAC format.
Note: Please only submit audio files in an uncompressed format such as WAV or AIFF or lossless compressed format such as FLAC.
 
Click OK to save your settings.

Now you need to exit and re-start Audacity to make these Project Setting changes active.  Look at the Project Rate selector in the bottom left-hand corner of the Audacity window and make sure it says 48000.
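Once you have exported a WAV file, you can confirm that it matches the settings above. The following is a minimal Python sketch using the standard wave module; the function name check_format is hypothetical, and the default rate of 48000 assumes your sound card supports it (pass a lower rate if it does not).

```python
import wave


def check_format(path, rate=48000, sample_width_bytes=2, channels=1):
    """Return a list of mismatches between a WAV file and the expected settings.

    Defaults correspond to 48000 Hz, 16 bits per sample (2 bytes), mono.
    An empty list means the file matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                "sample rate is %d Hz, expected %d" % (wav.getframerate(), rate)
            )
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                "sample width is %d bits, expected %d"
                % (wav.getsampwidth() * 8, sample_width_bytes * 8)
            )
        if wav.getnchannels() != channels:
            problems.append(
                "file has %d channels, expected %d" % (wav.getnchannels(), channels)
            )
    return problems
```

Note that the wave module reads uncompressed PCM WAV only, so this sketch does not apply to FLAC or AIFF files.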

Recording your first Audio File

Prompts file

In Step 2 we created a "prompts" file that can now be used to guide you on which words you need to record for your individual speech audio files.

Each line of the prompts file corresponds to the transcribed contents of one audio file.  The first column contains the name of the audio file, and the following columns on the same line contain the text transcription of what is recorded in that audio file.  See below:

*/sample1 DIAL ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE OH ZERO
*/sample2 DIAL ONE THREE FIVE SEVEN NINE ZERO TWO FOUR SIX EIGHT OH
*/sample3 DIAL ZERO NINE SEVEN FIVE THREE ONE OH EIGHT SIX FOUR TWO
*/sample4 DIAL ONE ONE TWO TWO THREE THREE FOUR FOUR FIVE FIVE
*/sample5 DIAL SIX SIX SEVEN SEVEN EIGHT EIGHT NINE NINE OH OH ZERO ZERO
*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG
*/sample7 PHONE STEVE CALL STEVE PHONE YOUNG CALL YOUNG
*/sample8 PHONE PHONE STEVE STEVE  CALL CALL YOUNG YOUNG
*/sample9 MEASURE LEISURE AND LEISURE MEASURE
*/sample10 COMPLAIN CHAMPLAIN AIRPLANE ELAINE EXPLAIN
*/sample11 BOOKENDS KENNEL KENNETH KENYA WEEKEND
*/sample12 BELT BELOW BEND AEROBIC DASHBOARD DATABASE
*/sample13 GATEWAY GATORADE GAZEBO AFGHAN AGAINST AGATHA
*/sample14 ABALON ABDOMINALS BODY ABOLISH
*/sample15 ABOUNDING ABOUT ACCOUNT ALLENTOWN
*/sample16 ACHIEVE ACTUAL ACUPUNCTURE ADVENTURE
*/sample17 ALGORITHM ALTHOUGH ALTOGETHER ANOTHER
*/sample18 BATTLE BEATLE LITTLE METAL
*/sample19 BITTEN BLATANT BRIGHTEN BRITAIN
*/sample20 BROOKHAVEN HOOD BROUHAHA BULLHEADS
*/sample21 BUSBOYS CHOICE COILS COIN
*/sample22 COLLECTION COLORATION COMBINATION COMMERCIAL
*/sample23 MIDDLE NEEDLE POODLE SADDLE
*/sample24 ALRIGHT ARTHRITIS BRIGHT COPYRIGHT CRITERIA RIGHT
*/sample25 COUPLE CRADLE CRUMBLE
*/sample26 CUBA CUBE CUMULATIVE
*/sample27 CURING CURLING CYCLING
*/sample28 CYNTHIA DANFORTH DEPTH
*/sample29 DIGEST DIGITAL DILIGENT
*/sample30 AMNESIA ASIA AVERSION BEIGE BEIJING
*/sample31 HELP HELLO HELMET HELPLESS AHEAD HELP
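To illustrate the layout of the prompts file, here is a minimal Python sketch that splits each line into a file name and its list of words. The function name parse_prompts is hypothetical, and stripping everything up to the last "/" in the first column is an assumption about how the "*/" prefix should be handled.

```python
def parse_prompts(lines):
    """Map each audio file name to the list of words to be spoken.

    The first whitespace-separated column is the file name (with any
    leading path such as "*/" removed); the remaining columns are the
    transcription. Blank lines are skipped.
    """
    prompts = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        name = parts[0].rsplit("/", 1)[-1]
        prompts[name] = parts[1:]
    return prompts
```

Because split() with no arguments collapses repeated whitespace, a line with a double space between words (such as the sample8 line above) still yields a clean word list.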


Audacity

To begin, you should not have any tracks displayed in the Audacity window.  If you do, click the x icon at the top left of the audio track display (or hit ctrl-z as many times as is required to remove them, or restart Audacity).  If you don't, Audacity will happily record your new track and leave your old track untouched, and when you export your audio to a wav file, both tracks will be merged into it.

Make sure your volumes are set properly, as outlined in the preceding section.

Record your first file by clicking 'Record' in Audacity and saying the words in the first line of your prompts file:

DIAL ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE OH ZERO

Speak normally - not too slow or too fast - and clearly.  Pause slightly before you begin speaking and leave a short pause after you have finished (i.e. a half-second pause before and after you speak).  Remember not to breathe out until you have clicked stop - most mikes pick up breathing noises.

Click the 'Stop' icon when you are finished.

Review your waveform to ensure that the highest peaks of your recording fall between 0.5 and 1.0 and the lowest peaks fall between -0.5 and -1.0.  If they do, then listen to the file (press 'Play' in Audacity) to make sure your pronunciation is clear and that you do not hear any non-speech noises (e.g. breathing noises, lip smacking, or background noises).  If there are any problems, hit ctrl-z and re-record your file.
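The half-second pauses before and after speaking can also be measured once a recording has been exported as a mono 16-bit WAV file. The following is a minimal Python sketch; the function name edge_pauses is hypothetical, and the silence threshold of 0.02 (2% of full scale) is an assumption that you may need to adjust for your microphone and room noise.

```python
import array
import sys
import wave


def edge_pauses(path, threshold=0.02):
    """Return (leading, trailing) quiet time in seconds for a mono 16-bit WAV file.

    A sample counts as quiet when its magnitude is at or below `threshold`
    (a fraction of full scale; an assumed value, not a VoxForge setting).
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles mono 16-bit PCM audio")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    limit = threshold * 32768
    loud = [i for i, s in enumerate(samples) if abs(s) > limit]
    if not loud:
        # The whole file is quiet, so both pauses span its full length.
        total = len(samples) / rate
        return total, total
    leading = loud[0] / rate
    trailing = (len(samples) - 1 - loud[-1]) / rate
    return leading, trailing
```

If either value is well under 0.5 seconds, the recording probably started or stopped too close to the speech.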

If the file sounds OK then create a new folder in the 'voxforge' directory in your home directory and call it 'train', then create a sub-folder in your 'train' directory called 'wav'.  Next, in Audacity click 'File' on the Audacity menu, then click 'Export As Wav' and save it as the name of the file listed in your prompts file ('sample1' in this case) in the '[your home directory]/voxforge/train/wav' folder you just created.
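If you prefer to create the folders from a script rather than by hand, the following minimal Python sketch builds the same 'voxforge/train/wav' layout. The function name make_wav_dir and its optional home argument are hypothetical; by default the sketch uses your home directory, and it also creates the 'voxforge' directory if it does not exist yet.

```python
from pathlib import Path


def make_wav_dir(home=None):
    """Create [home]/voxforge/train/wav if needed and return its path."""
    base = Path(home) if home is not None else Path.home()
    wav_dir = base / "voxforge" / "train" / "wav"
    # parents=True creates any missing parent folders;
    # exist_ok=True makes the call safe to repeat.
    wav_dir.mkdir(parents=True, exist_ok=True)
    return wav_dir
```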

Recording your remaining Audio Files

Repeat the same process for each line in your prompts file.  When you are finished, you should have a series of audio files in your 'voxforge/train/wav' folder, one for each file name listed in the first column of the prompts file.
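To confirm that nothing was skipped, you can compare the prompts file against the contents of the wav folder. The following is a minimal Python sketch; the function name missing_recordings is hypothetical, and it assumes that each exported file carries a ".wav" extension after the name given in the prompts file.

```python
from pathlib import Path


def missing_recordings(prompts_path, wav_dir, extension=".wav"):
    """Return the prompt names that have no matching audio file in wav_dir.

    The ".wav" extension is an assumption about how the files were exported.
    """
    missing = []
    with open(prompts_path, encoding="utf-8") as prompts:
        for line in prompts:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            name = parts[0].rsplit("/", 1)[-1]  # drop the "*/" prefix
            if not (Path(wav_dir) / (name + extension)).is_file():
                missing.append(name)
    return missing
```

An empty list means every line of the prompts file has a recording.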

Comments


By Asterisk-User - 5/14/2013 - 1 Replies Up-sampling, i.e. recreating a recorded audio file at a higher sampling rate (e.g. file.wav recorded at 8kHz and saved again as a 16kHz file), is generally not a good idea. It could be possible when using a smoothing algorithm in order to produce a better quality audio file. Think of it as a tweening process (interpolation). Of course, this could be prone to errors, but it is possible.

By Lennart - 6/2/2012 If you recorded in a format that is of higher quality you can get away with converting down, but if you recorded in a too low Hz or with too few bits per sample then you will not be able to increase the quality.

By mmm - 8/7/2010 - 1 Replies hi