Step 5 - Coding the (Audio) Data
Create codetrain.scp
HTK calls this last step in data preparation "parameterizing the raw speech waveforms into sequences of feature vectors". All this means is that HTK does not process wav files as efficiently as it processes its internal format. Therefore, you need to convert your audio wav files to another format called MFCC (Mel Frequency Cepstral Coefficients), whose contents are more generally referred to as 'feature vectors'.
You use the HCopy tool to convert your wav files to MFCC format. You have two options: execute the HCopy command by hand for each audio file you created in Step 3, or create a file containing a list of each source audio file and the name of the MFCC file it will be converted to, and pass that file as a parameter to the HCopy command. We will use the second approach in this example. Create the following 'codetrain.scp' script file in your 'voxforge/manual' folder:
codetrain.scp
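Each line of codetrain.scp pairs one source wav file with the mfc file to be created from it, with both paths relative to the 'voxforge/manual' folder. The following is a minimal sketch for generating those lines, assuming the recordings are named sample1.wav, sample2.wav and so on and are stored in 'voxforge/train/wav'. The function name and the file count in the example are illustrative, not part of HTK.

```python
from pathlib import PurePosixPath


def scp_lines(wav_names, wav_dir="../train/wav", mfcc_dir="../train/mfcc"):
    """Return one 'source target' line per wav file name."""
    lines = []
    for name in wav_names:
        stem = PurePosixPath(name).stem  # "sample1.wav" -> "sample1"
        lines.append(f"{wav_dir}/{name} {mfcc_dir}/{stem}.mfc")
    return lines


if __name__ == "__main__":
    # Hypothetical file names; use the recordings you created in Step 3.
    names = [f"sample{i}.wav" for i in range(1, 4)]
    for line in scp_lines(names):
        print(line)
```

Redirecting the printed lines into a file named codetrain.scp gives a script file in the layout HCopy expects with the -S option.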
Config file
The HCopy command performs the conversion from wav format to MFCC. To do this, it requires a configuration file (config) that specifies all the needed conversion parameters. Create a file called wav_config in your 'voxforge/manual' folder and add the following:
SOURCEFORMAT = WAV
TARGETKIND = MFCC_0_D
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
If you would like more details on the contents of the config file, please see the HTK documentation.
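Two of these settings are easier to read once converted. HTK expresses TARGETRATE and WINDOWSIZE in units of 100 nanoseconds, and the TARGETKIND qualifiers determine how many values each feature vector holds. The sketch below works through that arithmetic for the values above. The function names are illustrative, not HTK functions, and the size calculation only handles the _0, _E, _D and _A qualifiers.

```python
HTK_UNITS_PER_MS = 10_000  # HTK time values are in units of 100 ns


def htk_units_to_ms(value):
    """Convert an HTK time value (100 ns units) to milliseconds."""
    return value / HTK_UNITS_PER_MS


def vector_size(num_ceps, target_kind):
    """Count values per frame for a target kind such as MFCC_0_D."""
    qualifiers = target_kind.split("_")[1:]
    # _0 appends the 0th cepstral coefficient, _E appends log energy.
    static = num_ceps + qualifiers.count("0") + qualifiers.count("E")
    # _D appends deltas and _A accelerations of every static value.
    blocks = 1 + ("D" in qualifiers) + ("A" in qualifiers)
    return static * blocks


if __name__ == "__main__":
    print(htk_units_to_ms(100000.0))    # TARGETRATE: frame period in ms
    print(htk_units_to_ms(250000.0))    # WINDOWSIZE: analysis window in ms
    print(vector_size(12, "MFCC_0_D"))  # values per frame
```

With this config, that is a 10 ms frame period, a 25 ms window, and 26 values per frame: 12 cepstral coefficients plus the 0th coefficient, each with its delta.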
Create a new directory called 'mfcc' in your 'voxforge/train' folder. Then execute HCopy from your
'voxforge/manual' folder as follows:
$ HCopy -A -D -T 1 -C wav_config -S codetrain.scp
The result is a series of mfc files, one for each entry in your codetrain.scp script, in the 'voxforge/train/mfcc' folder. Be sure to monitor the output of the HCopy command to ensure that all wav files are processed properly. Most problems are related to file paths or to audio files in a non-wav format.
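Since most failures come down to paths or file formats, it can help to check the script file before running HCopy. The following is a rough sketch, not an HTK tool: it assumes the paths in codetrain.scp are relative to the folder that contains it, and it uses Python's standard wave module, which reads only PCM wav files. HCopy's own error message indicates that it also supports mu-law and a-law, so this check is stricter than HCopy itself. The function name check_scp is hypothetical.

```python
import os
import wave


def check_scp(scp_path):
    """Return (entry, problem) pairs for lines HCopy is likely to reject."""
    problems = []
    base = os.path.dirname(os.path.abspath(scp_path))
    with open(scp_path) as scp:
        for line in scp:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) != 2:
                problems.append((line.strip(), "expected 'source target'"))
                continue
            source, target = fields
            wav_path = os.path.join(base, source)
            if not os.path.isfile(wav_path):
                problems.append((source, "source file not found"))
                continue
            try:
                with wave.open(wav_path, "rb"):
                    pass  # opening succeeds only for PCM wav files
            except (wave.Error, EOFError) as err:
                problems.append((source, f"not a readable PCM wav: {err}"))
            target_dir = os.path.dirname(os.path.join(base, target))
            if not os.path.isdir(target_dir):
                problems.append((target, "target folder does not exist"))
    return problems


if __name__ == "__main__":
    if os.path.exists("codetrain.scp"):
        for entry, problem in check_scp("codetrain.scp"):
            print(f"{entry}: {problem}")
```

Run from the 'voxforge/manual' folder, this prints one line per problem entry and prints nothing when every entry passes these checks.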
Comments
untitled
By Stewie - 11/16/2011 - 1 Reply
I also had trouble with this issue:
HCopy -A -D -T 1 -C ./input_files/wav_config -S ../codetrain.scp
HTK Configuration Parameters[14]
Module/Tool Parameter Value
# RAWENERGY FALSE
# TRACE 0
# ESCALE 1.000000
# ENORMALISE FALSE
# ZMEANSOURCE TRUE
# NUMCEPS 12
# NUMCHANS 24
# PREEMCOEF 0.970000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# TARGETRATE 100000.000000
# TARGETKIND MFCC_E_D_N_Z
# SOURCERATE 625
# SOURCEFORMAT WAV
ERROR [+1019] SetConfParms: incompatible TARGETKIND=MFCC_E_D_N_Z for coding
FATAL ERROR - Terminating program HCopy
But I could sort of solve it after I made all the wav files, sample1 to sample31, as described in Step 3.
Why don't you make all the sample voice wav files? I guess you made only one or a few files, the same as I did, didn't you?
unable to set MFCC_E_D_N_Z as target for coding
By ashwin - 6/22/2011 - 1 Reply
I am not able to set MFCC_E_D_N_Z as the target for coding the wav files.
I get the following error:
HCopy -A -D -T 1 -C ./input_files/wav_config -S ../codetrain.scp
HTK Configuration Parameters[14]
Module/Tool Parameter Value
# RAWENERGY FALSE
# TRACE 0
# ESCALE 1.000000
# ENORMALISE FALSE
# ZMEANSOURCE TRUE
# NUMCEPS 12
# NUMCHANS 24
# PREEMCOEF 0.970000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# TARGETRATE 100000.000000
# TARGETKIND MFCC_E_D_N_Z
# SOURCERATE 625
# SOURCEFORMAT WAV
ERROR [+1019] SetConfParms: incompatible TARGETKIND=MFCC_E_D_N_Z for coding
FATAL ERROR - Terminating program HCopy
Please help me out.
Configuration file
By gbernardi - 3/3/2011 - 1 Reply
Hi, I don't understand why it is not a problem that the configuration file we use for coding the data and the one used by HCompV do not have the same TARGETKIND...
Any suggestions?
problem with the last part!
By Hossein Khaki - 2/8/2011 - 1 Reply
Hi
I did all of the first 5 steps correctly, but when I run this command for the 5th step:
$ HCopy -A -D -T 1 -C wav_config -S codetrain.scp
I get these errors:
HCopy -A -D -T 1 -C wav_config -S codetrain.scp
HTK Configuration Parameters[11]
Module/Tool Parameter Value
# NUMCEPS 12
# CEPLIFTER 22
# NUMCHANS 26
# PREEMCOEF 0.970000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# SAVEWITHCRC TRUE
# SAVECOMPRESSED TRUE
# TARGETRATE 100000.000000
# TARGETKIND MFCC_0_D
# SOURCEFORMAT WAV
../train/wav/sample1.wav -> ../train/mfcc/sample1.mfc
ERROR [+6251] Only standard PCM, mu-law & a-law supported
ERROR [+6213] OpenWaveInput: Get[format]HeaderInfo failed
ERROR [+6313] OpenAsChannel: OpenWaveInput failed
ERROR [+6316] OpenBuffer: OpenAsChannel failed
ERROR [+1050] OpenParmFile: Config parameters invalid
FATAL ERROR - Terminating program HCopy
By the way, I think all the parameters of my wave file are correct! Here are the parameters:
codec: Uncompressed 16-bit PCM audio
channels: Mono
Sample rate: 48000 Hz
Bitrate: N/A
Please tell me where could I find the problem?
thanks
Doubt regarding the output of hcopy
By Tom George - 11/28/2010 - 2 Replies
I used the HCopy command to extract the features of a wave file. But instead of getting numerical output, I got an mfc file of illegible characters.
Can anyone help me solve this problem?
problems with wav format
By j17 - 9/27/2010 - 2 Replies
Hello everybody,
This is the first time I have used HTK for speech recognition, and I'm trying to learn how to build an HMM model. But I have one problem and I do not know how to solve it. It is related to my speech files. I have converted my source files from RAW format to WAV in order to use them with HTK (using sox under UNIX), and before running HCopy to create the mfc files, I would like to listen to my wav files using play, but an error occurs. Could anybody help me? Thank you.
play prueba.wav
ALSA lib confmisc.c:768:(parse_card) cannot find card '0'
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_card_driver returned error: No such file or directory
ALSA lib confmisc.c:392:(snd_func_concat) error evaluating strings
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1251:(snd_func_refer) error evaluating name
ALSA lib conf.c:3513:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:3985:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2184:(snd_pcm_open_noupdate) Unknown PCM default
sox: Failed writing default: cannot open audio device
Problem with copy
By bejimed - 8/13/2010 - 1 Reply
Can anyone help me?
This is the error I got.
Thank you.
Copy -A -D -T 1 -C wav_config -S codetrain.scp
ERROR [+5010] SetScriptFile: Cannot open script file codetrain.scp
ERROR [+5020] InitShell: SetScriptFile failed on file codetrain.scp
ERROR [+1000] HCopy: InitShell failed
FATAL ERROR - Terminating program HCopy
Hcopy output
By novision - 5/20/2010 - 2 Replies
Hi all,
I am trying to make a decoder for myself. I need the feature parameters from HTK.
I have used the same wav_config file as given in the tutorial for my HTK recognition software. But when I try to view the feature parameters (the .mfc file) using the HList command, it shows 26 vectors per frame instead of the 25 it should show (mfcc (12) + delta (12) + 0th (1)).
sample1.mfc
------------------------------------ Samples: 0->-1 ------------------------------------
0: -18.144 3.699 0.551 2.406 6.086 6.956 3.845 5.387 11.362 7.867 6.097 2.205 68.196 0.183 -0.022 -0.345 0.204 -1.470 -1.194 0.097 -0.471 -1.137 -0.228 -0.189 0.845 0.130
1: -18.078 2.801 -0.185 3.511 2.543 3.737 6.169 5.410 7.749 7.631 6.462 7.283 68.259 -0.076 -0.009 0.427 0.709 -1.433 -1.166 0.350 -0.488 -2.145 -0.693 -0.609 0.688 0.029
Am I wrong here conceptually?
If not, then which vectors should I take when calculating the observation probabilities for continuous densities (because we used a proto with 25 vectors in the next step)?
Plz help me understand htk..
thnx
Running HCopy
By Gothrog - 3/22/2010 - 4 Replies
This is my first time running through this example in Step 5.
I may have saved the wav files with 32 bits instead of 16 bits. Could this have caused the problem?
$ /usr/local/bin/HCopy -A -D -T 1 -C wav_config -S codetrain.scp
/usr/local/bin/HCopy -A -D -T 1 -C wav_config -S codetrain.scp
HTK Configuration Parameters[11]
Module/Tool Parameter Value
# NUMCEPS 12
# CEPLIFTER 22
# NUMCHANS 26
# PREEMCOEF 0.970000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# SAVEWITHCRC TRUE
# SAVECOMPRESSED TRUE
# TARGETRATE 100000.000000
# TARGETKIND MFCC_0_D
# SOURCEFORMAT WAV
ERROR [+6251] Only standard PCM, mu-law & a-law supported
ERROR [+6213] OpenWaveInput: Get[format]HeaderInfo failed
ERROR [+6313] OpenAsChannel: OpenWaveInput failed
ERROR [+6316] OpenBuffer: OpenAsChannel failed
ERROR [+1050] OpenParmFile: Config parameters invalid
FATAL ERROR - Terminating program /usr/local/bin/HCopy
TIMIT wav file have problems!!
By spring - 12/23/2009 - 1 Reply
Hi Ken, I am asking you for help!
Today I used HCopy to code a TIMIT wav file, but I got the error: ERROR[+6251] GetTIMITHeaderInfo:Bad Numbers in TIMIT format header. I tried changing "SOURCEFORMAT" to "WAV", "HTK" and "TIMIT", but I still can't code my TIMIT wav file. So helpless!
Has anybody used TIMIT wav files for training? I need your help!
May I have your email, please, Ken? Then I can send you some of my TIMIT wav files for analysis. Thank you very much!
spring
Re: need your help
By Amit Surana - 5/30/2009
Please check the path of the wav files in codetrain.scp.
Suppose your codetrain.scp is in voxforge/manual and the wav files are in voxforge/train/wav.
Then the path should be:
../train/wav/sample1.wav ../train/mfcc/sample1.mfc
need some help
By joshua - 3/30/2009 - 3 Replies
Hi, I need some help with this. Can anyone help me?
This is the error I got.
Thank you!
$ HCopy -A -D -T 1 -C wav_config -S codetrain.scp
C:\cygwin\HTK\htk-3.3-windows-binary\htk\HCopy.exe -A -D -T 1 -C wav_config -S codetrain.scp
HTK Configuration Parameters[11]
Module/Tool Parameter Value
# NUMCEPS 12
# CEPLIFTER 22
# NUMCHANS 26
# PREEMCOEF 0.97000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# SAVEWITHCRC TRUE
# SAVECOMPRESSED TRUE
# TARGETRATE 100000.000000
$
# SOURCEFORMAT WAV
ERROR [+6210] OpenWaveInput: Cannot open waveform file ../train/wave/sample1.wav
FATAL ERROR - Terminating program C:\cygwin\HTK\htk-3.3-windows-binary\htk\HCopy.exe