Hi DRF,
My apologies for the delay in getting back to you on this; I am travelling...
>I would like to adapt the acoustic model with an extended
>recording of my voice rather than just a few prompts.
Both Sphinx and HTK require segmented speech for acoustic model training. You can still use a long passage of transcribed speech audio; you just need to segment it, either manually (see Automated Audio Segmentation Using Forced Alignment (Draft)) or with the Perl script I created for this purpose (though it is still only alpha code). Its documentation is POD-based.
Ken
Hi Ken. Thanks for pointing me towards that draft page. I'm running into the following error when I try to do the forced alignment with my own recording:
Error [+6510] LOpen: Unable to open label file downsampled.lab
Any idea what might be causing the error? It seems to work OK with the example files given on the audio segmentation draft page.
Thanks,
Dan
Hi DRF,
>Error [+6510] LOpen: Unable to open label file downsampled.lab
HTK is looking for the label file, which is the list of words in your text (in the order they appear in your text).
The labels are contained in your words.mlf file. Usually when creating acoustic models, you have many transcribed audio files. So rather than have many separate text label files to keep track of, you put them all in a single MLF file (MLF = Master Label File).
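As a rough illustration of what such a file looks like, here is a minimal Python sketch that builds the text of an MLF from a transcript. The function name build_mlf is made up for this example, and the word handling (upper-casing, stripping punctuation) is an assumption; the words need to match the entries in whatever pronunciation dictionary you are using.

```python
import re


def build_mlf(transcripts):
    """Return MLF text for a dict mapping audio base names to transcripts.

    build_mlf is a hypothetical helper, not part of HTK.
    """
    lines = ["#!MLF!#"]  # header line that marks the file as an MLF
    for base_name, text in transcripts.items():
        # One entry per audio file; the "*/" pattern matches any directory.
        lines.append(f'"*/{base_name}.lab"')
        # Assumption: dictionary entries are upper case and contain only
        # letters and apostrophes, so punctuation is dropped here.
        lines.extend(re.findall(r"[A-Za-z']+", text.upper()))
        lines.append(".")  # a single period terminates each entry
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_mlf({"downsampled": "Hello world, this is a test."}), end="")
```

The output has one word per line between the quoted label name and the closing period, which is the layout HTK reads in place of a separate .lab file.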
HTK assumes that the name of your label file is the same as that of your wav file, with the ".wav" suffix changed to ".lab". So make sure the base name of the label entry in your words.mlf matches that of your audio file; in your case HTK is looking for an entry named downsampled.lab.
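If you want to check this before running HTK, something along these lines might help. It is only a sketch: the function names are invented for this example, and it compares plain file names rather than reproducing HTK's own pattern matching.

```python
import re
from pathlib import PurePosixPath


def expected_label_name(audio_path):
    """Label name HTK would look for: same base name, ".lab" suffix."""
    return PurePosixPath(audio_path).with_suffix(".lab").name


def mlf_label_names(mlf_text):
    """Collect the file-name part of every quoted entry line in an MLF."""
    names = []
    for line in mlf_text.splitlines():
        match = re.fullmatch(r'"(.+)"', line.strip())
        if match:
            names.append(PurePosixPath(match.group(1)).name)
    return names


def has_label_for(audio_path, mlf_text):
    """True if the MLF has an entry whose name matches the audio file."""
    return expected_label_name(audio_path) in mlf_label_names(mlf_text)


if __name__ == "__main__":
    # Hypothetical MLF whose entry is named after a different recording.
    mlf = '#!MLF!#\n"*/sample.lab"\nHELLO\nWORLD\n.\n'
    print(expected_label_name("downsampled.wav"))
    print(has_label_for("downsampled.wav", mlf))
```

In the example the MLF entry is named sample.lab while the audio file is downsampled.wav, so the check fails. A mismatch like that is one way to end up with the "Unable to open label file" error.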
Ken