Speech Recognition Engines

HTK 3.5 DNN pretraining
User: kothemel
Date: 5/15/2016 1:31 pm

Greetings to the group,

I am stuck at the pre-training step of DNN-HMM.

First of all, the whole database (.wav, .mfc, .mlf files) is stored on an external drive. According to page 68 of the HTKBook, to proceed with discriminative pre-training you must execute:

HNTrainSGD -C config.basic -C config.pretrain -H dnn3/init/models -M dnn3  -S dnn.train.scp -N dnn.hv.scp -l LABEL -I train.mono.aligned.mlf  hmm_mono/monolist

After creating config.basic, config.pretrain, dnn.train.scp and dnn.hv.scp and running the command, I get the following error:


Epoch 1 ******************************

Processing training set...

 ERROR [+8925]  LoadOneUtt: Phone label in the label file does not match the state level definition

FATAL ERROR - Terminating program ../../bin.cpu/HNTrainSGD


Well, I guess the problem is caused by the massive label file.

The train.mono.aligned.mlf, as I defined it, contains the paths to other MLFs. I quote a small part of it:


and so on.

On the external drive, s01mic1.mlf contains the acoustic events as labels, together with the utterance that is taking place. I also quote a part of it:


Any help?

Thanks in advance!

--- (Edited on 5/15/2016 1:31 pm [GMT-0500] by kothemel) ---

Re: HTK 3.5 DNN pretraining
User: cz277
Date: 10/5/2016 6:25 pm


I just saw your question. The error occurs because HNTrainSGD expects aligned MLFs (generally produced by HVite in HTK), since the timing information is necessary for cross-entropy training. The tutorial section of the HTKBook describes the steps.

To get a faster reply, please register on the HTK website and send your questions directly to the HTK mailing list. We don't normally check other places for questions.
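
As a quick sanity check before running HNTrainSGD, something like the Python sketch below can flag label entries that carry no timing information. It is not part of HTK: the function name check_mlf_alignment is made up for illustration, and it assumes the usual MLF layout (a #!MLF!# header, a quoted pattern line per utterance, label lines of the form "start end name ...", and a single "." closing each entry).

```python
def check_mlf_alignment(lines):
    """Return (utterance, line_number, text) for every label line that
    does not start with integer start and end times.

    Hypothetical helper, not an HTK tool. Assumes the standard MLF layout.
    """
    problems = []
    current = None  # pattern of the utterance entry being read, if any
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue  # skip blank lines and the MLF header
        if line.startswith('"'):
            current = line.strip('"')  # start of a new utterance entry
            continue
        if line == ".":
            current = None  # end of the current utterance entry
            continue
        if current is None:
            continue  # text outside any entry is ignored by this sketch
        fields = line.split()
        # An aligned label line has at least: start, end, name.
        has_times = (
            len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit()
        )
        if not has_times:
            problems.append((current, number, line))
    return problems
```

To try it on a real file, read the MLF with open(path).read().splitlines() and pass the result in; an empty list means every label line had start and end times. It only checks for timing fields, so it will not tell you whether the label names match the model definitions.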

Best wishes,

--- (Edited on 10/5/2016 6:25 pm [GMT-0500] by Visitor) ---