I have prepared all the models needed to run Julius. The contents of the jconf file (julius.jconf) are:
-nlr model/grammar # 2-gram
-v dict
-h hmm15/hmmdefs
-hlist tiedlist
-gprune safe # safe pruning: the top N Gaussians are reliably found (accurate)
-n 10
-output 10 # number of sentences found in the 2nd pass to output
-input rawfile # speech waveform data file (format detected automatically)
# format: WAV (16bit) or
# RAW(16bit(signed short),mono,big-endian)
# for files other than 16kHz, specify the frequency with -smpFreq
-filelist listfile.txt # list of files to recognize
-zmean # remove the DC component (has no effect with -input mfcfile)
-rejectshort 100 # reject input shorter than the given number of milliseconds
-lv 10000 # level threshold (0-32767)
-smpFreq 16000 # sampling frequency (Hz)
-smpPeriod 625 # sampling period in 100 ns units (= 10000000 / smpFreq)
-demo # same as "-progout -quiet"
When I run Julius by typing:
$ julius -C julius.jconf
I get the following errors:
STAT: include config: julius.jconf
STAT: jconf successfully finalized
STAT: *** loading AM00 _default
Stat: init_phmm: Reading in HMM definition
Stat: rdhmmdef: ascii format HMM definition
Stat: rdhmmdef: limit check passed
Stat: check_hmm_restriction: an HMM with several arcs from initial state found:
"sp"
Stat: rdhmmdef: this HMM requires multipath handling at decoding
Stat: init_phmm: defined HMMs: 17
Stat: init_phmm: logical names: 46 in HMMList
Stat: init_phmm: base phones: 17 used in logical
Stat: init_phmm: finished reading HMM definitions
STAT: making pseudo bi/mono-phone for IW-triphone
Stat: hmm_lookup: 28 pseudo phones are added to logical HMM list
STAT: *** AM00 _default loaded
STAT: *** loading LM00 _default
Error: voca_load_htkdict: line 1: triphone "uh-ah+sp" not found
Error: voca_load_htkdict: line 1: triphone "ah-sp+*" or biphone "ah-sp" not found
Error: voca_load_htkdict: the line content was: DUA [DUA]
d uh ah sp
Error: voca_load_htkdict: line 2: triphone "ah-t+sp" not found
Error: voca_load_htkdict: line 2: triphone "t-sp+*" or biphone "t-sp" not found
Error: voca_load_htkdict: the line content was: EMPAT [EMPAT]
ax m p ah t sp
Error: voca_load_htkdict: line 3: triphone "ah-m+sp" not found
Error: voca_load_htkdict: line 3: triphone "m-sp+*" or biphone "m-sp" not found
Error: voca_load_htkdict: the line content was: ENAM [ENAM]
ax n ah m sp
Error: voca_load_htkdict: line 4: triphone "oh-ng+sp" not found
Error: voca_load_htkdict: line 4: triphone "ng-sp+*" or biphone "ng-sp" not found
Error: voca_load_htkdict: the line content was: KOSONG [KOSONG]
k oh sh oh ng sp
Error: voca_load_htkdict: line 5: triphone "m-ah+sp" not found
Error: voca_load_htkdict: line 5: triphone "ah-sp+*" or biphone "ah-sp" not found
Error: voca_load_htkdict: the line content was: LIMA [LIMA]
l ih m ah sp
Error: voca_load_htkdict: line 6: triphone "t-uh+sp" not found
Error: voca_load_htkdict: line 6: triphone "uh-sp+*" or biphone "uh-sp" not found
Error: voca_load_htkdict: the line content was: SATU [SATU]
sh ah t uh sp
Error: voca_load_htkdict: line 9: triphone "g-ah+sp" not found
Error: voca_load_htkdict: line 9: triphone "ah-sp+*" or biphone "ah-sp" not found
Error: voca_load_htkdict: the line content was: TIGA [TIGA]
t ih g ah sp
Error: voca_load_htkdict: begin missing phones
Error: voca_load_htkdict: ah-m+sp
Error: voca_load_htkdict: ah-sp+* or biphone ah-sp
Error: voca_load_htkdict: ah-t+sp
Error: voca_load_htkdict: g-ah+sp
Error: voca_load_htkdict: m-ah+sp
Error: voca_load_htkdict: m-sp+* or biphone m-sp
Error: voca_load_htkdict: ng-sp+* or biphone ng-sp
Error: voca_load_htkdict: oh-ng+sp
Error: voca_load_htkdict: t-sp+* or biphone t-sp
Error: voca_load_htkdict: t-uh+sp
Error: voca_load_htkdict: uh-ah+sp
Error: voca_load_htkdict: uh-sp+* or biphone uh-sp
Error: voca_load_htkdict: end missing phones
Error: init_voca: error in reading dict: 7 words failed out of 2 words
ERROR: m_fusion: failed to read dictionary, terminated
ERROR: m_fusion: failed to initialize dictionary
ERROR: Error in loading model
I know that those missing phones do not exist in the "triphones1" file, but that file was generated by the HTK toolkit. I think the problem comes from the "dict" file, whose entries look like this: "DUA [DUA] d uh ah sp". It was also generated by HTK, and every pronunciation already ends with "sp". Can you help me solve this problem? Thank you very much.
regards,
Amalia Zahra
My first question above is about the problem I hit when running Julius with a 2-gram language model (grammar). When I ran Julian with sample.dfa and sample.dict instead, Cygwin displayed the following:
Stat: adin_file: input speechfile: wav_indonesia/1406-answer-00.wav
Warning: strip: sample 33944-33960 is invalid, stripped
Warning: strip: sample 42532-42548 is invalid, stripped
Warning: strip: sample 60120-60136 is invalid, stripped
Warning: strip: sample 123304-123320 is invalid, stripped
Warning: strip: sample 132844-132860 is invalid, stripped
Warning: strip: sample 134184-134204 is invalid, stripped
STAT: 143571 samples (8.97 sec.)
STAT: ### speech analysis (waveform -> MFCC)
pass1_best: <s> LIMA EMPAT EMPAT DUA ENAM SATU TIGA TIGA TIGA TIGA
pass1_best: <s> LIMA EMPAT EMPAT DUA ENAM SATU TIGA TIGA TIGA TIGA LIMA ENAM SATU </s>
sentence1: <s> EMPAT SATU </s>
What does the warning above mean, and how can I fix it? Also, when I ran Julian on another WAV file with the same content as a WAV file used in training, it was recognized incorrectly. Do Julius/Julian only recognize the WAV files they were trained on? Can't they recognize files that were never used in training but have the same content as the training files? I really need your help with both questions. Thanks.
regards,
Amalia Zahra
Hi amza,
>My first question above is about the problem when I executed julius by using 2
>gram language model (grammar).
Sorry, I don't have much experience running Julius in dictation mode, so I can't help you on this one.
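That said, every missing phone in your log has "sp" as the last phone of a word, so one thing that might be worth trying is a copy of the dict without the trailing "sp". Below is a minimal, untested sketch in Python. The function names and the output file name are placeholders of mine, it assumes every entry has the form "WORD [OUTPUT] phone phone ...", and whether Julius then loads the dictionary is an assumption, not something I have verified.

```python
def strip_trailing_sp(line, pause="sp"):
    """Return a dict line without a final short-pause phone.

    Assumes HTK-style entries such as 'DUA [DUA] d uh ah sp':
    word, bracketed output symbol, then phones, separated by whitespace.
    """
    fields = line.split()
    # Require word + output symbol + at least one other phone, so an entry
    # whose only phone is the pause model is left untouched.
    if len(fields) > 3 and fields[-1] == pause:
        fields = fields[:-1]
    return " ".join(fields)


def convert_dict(src_path, dst_path):
    """Write a copy of src_path with trailing pause phones removed."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                dst.write(strip_trailing_sp(line) + "\n")
```

For example, convert_dict("dict", "dict.nosp") would write the stripped copy to a new file (the name dict.nosp is hypothetical), and the -v line in julius.jconf could then point at that file. The original dict stays untouched, so it is easy to go back if this does not help.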
>What is the meaning of warning above? How can I fix it?
Is there a problem with the audio in "1406-answer-00.wav"? Can you listen to the speech audio in an audio editor? Is there lots of background noise? Was it recorded with a good quality microphone?
Have you looked at the Julius/Julian code to see what might cause the error?
You might try emailing Julius support directly. I've never seen this warning before.
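If it helps with checking the file, here is a small Python sketch that reads a 16-bit mono WAV file and lists runs of zero-valued samples. My understanding, which is an assumption and not checked against the Julius source, is that the "strip" warning refers to runs of zero samples being dropped from the input (the Julius manual lists a -nostrip option for turning this off). The function names are mine, and the default of 16 samples is only a guess based on the shortest range in your log (33944-33960).

```python
import array
import sys
import wave


def read_mono16(path):
    """Return (sample_rate, samples) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        rate = wav.getframerate()
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return rate, samples


def zero_runs(samples, min_len=16):
    """Return (start, end) pairs, end exclusive, for runs of >= min_len zeros."""
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value == 0:
            if start is None:
                start = i  # a new run of zeros begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Handle a run that reaches the end of the data.
    if start is not None and len(samples) - start >= min_len:
        runs.append((start, len(samples)))
    return runs
```

Calling read_mono16("wav_indonesia/1406-answer-00.wav") and passing the samples to zero_runs should show whether the reported positions line up with silent gaps or dropouts in the recording. It also confirms the sample rate, which should match the -smpFreq value the acoustic model was trained with.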
>Does julius/julian only recognize wav input file that has been trained before?
No.
If you have created a good triphone acoustic model (using the steps in the Tutorial), Julian should recognize short phrases containing words that were not necessarily included in the training set.
However, if you don't have enough training data, you might be limited to the words you trained with.
Ken