Comments

How to improve recognition
By kmaclean - 4/18/2008

If you have a small grammar, the following things can help improve recognition performance:

  • using a noise model,
  • using word-based HMMs with more states (rather than phone-based HMMs),
  • not using context-independent models.

 

How to discard useless results?
By manio - 5/19/2008 - 2 Replies

Julian finds the best-fitting sentence in the grammar every time I speak. EVEN when the input is only noise, it still returns a result.

How can I tell whether a result came from intentional speech or from noise?

Should I use score1? I found that the score for intentional speech and the score for noise are nearly the same. Where is the boundary?

Is there another way?

Thanks

Result for noise
--------------------------------------

pass1_best: <s> <n><num>1</num>ZYF</n>
pass1_best_wordseq: 0 2
pass1_best_phonemeseq: sil | jh ow r er n f aa
pass1_best_score: -10684.339844

length: 318 frames (1.06 sec.)
### Recognition: 2nd pass (RL heuristic best-first with DFA)
samplenum=318
sentence1: <s> <n><num>3</num>ZMY</n> </s>
wseq1: 0 2 1
phseq1: sil | jh aa ng m ae ng y iy | sil
cmscore1: 1.000 0.639 1.000
score1: -13599.010742
6 generated, 6 pushed, 4 nodes popped in 318

Result for purposeful voice
---------------------------------------------------------

pass1_best: <s> <n><num>1</num>ZYF</n>
pass1_best_wordseq: 0 2
pass1_best_phonemeseq: sil | jh ow r er n f aa
pass1_best_score: -12398.662109

length: 386 frames (1.28 sec.)
### Recognition: 2nd pass (RL heuristic best-first with DFA)
samplenum=386
sentence1: <s> <n><num>2</num>LDH</n> </s>
wseq1: 0 2 1
phseq1: sil | y ow b aa hh aa | sil
cmscore1: 1.000 1.000 1.000
score1: -14228.536133
8 generated, 8 pushed, 4 nodes popped in 386
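
One thing that does differ between the two logs above is cmscore1: the noise result has 0.639 for the middle word, while the spoken result has 1.000 for every word. The raw score1 values also cover different numbers of frames (318 vs. 386), so dividing by the frame count may make them easier to compare. Below is a minimal sketch of both checks in Python, assuming the output format shown above. The function names and the 0.8 threshold are made up for illustration, and two samples are far too few to choose a real threshold from.

```python
import re

def parse_result(text):
    """Pull frame count, score1 and cmscore1 out of one Julius result block."""
    frames = int(re.search(r"length:\s*(\d+)\s*frames", text).group(1))
    score = float(re.search(r"^score1:\s*(-?\d+(?:\.\d+)?)", text, re.M).group(1))
    cm_line = re.search(r"^cmscore1:\s*(.+)$", text, re.M).group(1)
    return frames, score, [float(x) for x in cm_line.split()]

def per_frame_score(text):
    """score1 divided by the number of frames, so inputs of different length compare."""
    frames, score, _ = parse_result(text)
    return score / frames

def accept(text, min_cm=0.8):
    """Keep a result only if every word confidence reaches min_cm.

    min_cm=0.8 is a hypothetical value, not a Julius default; tune it on your own data.
    """
    _, _, cmscores = parse_result(text)
    return min(cmscores) >= min_cm

# The relevant lines copied from the two logs above.
noise = """length: 318 frames (1.06 sec.)
cmscore1: 1.000 0.639 1.000
score1: -13599.010742"""

voice = """length: 386 frames (1.28 sec.)
cmscore1: 1.000 1.000 1.000
score1: -14228.536133"""

for name, block in (("noise", noise), ("voice", voice)):
    print(name, accept(block), round(per_frame_score(block), 2))
```

With this threshold the noise sample is rejected and the spoken sample is kept, but that only reflects these two recordings. It would be worth collecting many more noise and speech results before relying on any cut-off.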

Congratulations
By royerfa - 2/27/2008 - 8 Replies

Thanks a lot for this tutorial.

It is a really good help for getting started with SRE.

I did the tutorial, and in fact I am not really satisfied with the Julian results: it recognizes fewer than one sentence in four.

Quite a bad result, no?

I recorded the samples using Audacity at a sample rate of 98000 Hz. Maybe that is the cause of my problem; what do you think?

I did not forget to change the sampling rate in the jconf file, though.

What should I do to improve recognition?

Thanks,

Fab
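
On the sampling-rate point: the sample jconf files posted further down this page carry the comment "-smpPeriod 625 (= 10000000 / smpFreq)" next to "-smpFreq 16000". Since 1/16000 s is 62.5 microseconds, that formula gives the period in units of 100 ns. A small Python sketch of the arithmetic follows (the function name is made up for illustration). It only checks that the two settings agree with each other. It does not address whether the acoustic model was trained on audio at the same rate, which generally has to match as well.

```python
def smp_period(smp_freq_hz):
    """Sampling period from the jconf comment's formula: 10,000,000 / smpFreq."""
    return 10_000_000 / smp_freq_hz

# 16000 is the rate used in the jconf files on this page;
# 98000 is the rate mentioned in the post above.
for freq in (16000, 98000):
    period = smp_period(freq)
    note = "" if period.is_integer() else "  (not a whole number)"
    print(f"-smpFreq {freq} -> -smpPeriod {period:g}{note}")
```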

error when running Julius
By amza - 1/15/2008 - 1 Replies

First of all, thanks a lot for your earlier help. Your answers to my questions have been really useful. I have now trained HMM acoustic models on several Indonesian audio files. I then built a language model, a 2-gram language model in ARPA format, using a tool provided at "http://www.speech.cs.cmu.edu/tools/lm.html".

I use all of those models, acoustic model and language model, with Julius to test some speech. The content of jconf file (julius.jconf) I use is:

-nlr model/grammar
-v dict
-h hmm15/hmmdefs
-hlist tiedlist
-gprune safe     
-input rawfile        # speech waveform data file (format detected automatically)
            # format: WAV (16bit) or
            #    RAW (16bit (signed short), mono, big-endian)
            #    for files not sampled at 16kHz, specify the frequency with -smpFreq
-filelist listfile.txt    # list of files to recognize
-smpFreq 16000        # sampling frequency (Hz)
-smpPeriod 625    # sampling period (ns) (= 10000000 / smpFreq)
-demo            # same as "-progout -quiet"

When I run Julius by typing "julius -C julius.jconf", I get the following errors (this is the complete output):

$ julius -C julius.jconf
STAT: include config: julius.jconf
STAT: jconf successfully finalized
STAT: *** loading AM00 _default
Stat: init_phmm: Reading in HMM definition
Stat: rdhmmdef: ascii format HMM definition
Stat: rdhmmdef: limit check passed
Stat: check_hmm_restriction: an HMM with several arcs from initial state found:
"sp"
Stat: rdhmmdef: this HMM requires multipath handling at decoding
Stat: init_phmm: defined HMMs:    23
Stat: init_phmm: logical names:   117 in HMMList
Stat: init_phmm: base phones:    23 used in logical
Stat: init_phmm: finished reading HMM definitions
STAT: making pseudo bi/mono-phone for IW-triphone
Stat: hmm_lookup: 92 pseudo phones are added to logical HMM list
STAT: *** AM00 _default loaded
STAT: *** loading LM00 _default
Stat: init_voca: read 23 words
Stat: init_ngram: reading in ARPA forward n-gram from model/grammar
Stat: ngram_read_arpa: this is 2-gram file
Stat: ngram_read_arpa: reading 1-gram part...
Stat: ngram_read_arpa: read 19 1-gram entries
Stat: ngram_read_arpa: reading 2-gram part...
Stat: ngram_read_arpa: 2-gram read 0 (0%)
Stat: ngram_read_arpa: 2-gram read 32 end
Stat: ngram_compact_context: bigram bowt compaction: 32 -> 0
Error: mymalloc: failed to reallocate 0 bytes

Before that, I also got "Error: ngram_compact_context: 2-gram has no upper 3-gram, but not 0.0 back-off weight".

Could you help me to solve those two errors? Thanks a lot.

 

regards,

Amalia Zahra
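
The log alone does not show what is wrong with the model file, but both messages point at the back-off weights of the 2-gram entries: the earlier error complains about a 2-gram with a back-off weight other than 0.0, and the log above shows the 32 2-grams being compacted to 0 just before the allocation fails. A first step could be to look at what the ARPA file actually contains. Here is a minimal Python sketch, assuming the usual ARPA layout (a `\data\` header, `\N-grams:` sections with lines of the form "logprob word(s) [backoff]", and `\end\`). The function name and the sample data are made up for illustration, and this is a diagnostic, not a fix.

```python
def check_arpa(lines):
    """Count n-grams per order and list highest-order entries
    that carry a non-zero back-off weight."""
    counts, entries, order = {}, [], 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:line.index("-")])   # "\2-grams:" -> 2
            counts[order] = 0
            continue
        if line.startswith("\\"):                  # "\data\" or "\end\"
            order = 0
            continue
        if order == 0:                             # header lines such as "ngram 1=19"
            continue
        fields = line.split()
        counts[order] += 1
        entries.append((order, fields))
    top = max(counts)
    # An order-n entry has n+1 fields, or n+2 when a back-off weight is present.
    flagged = [f for o, f in entries
               if o == top and len(f) == o + 2 and float(f[-1]) != 0.0]
    return counts, flagged

# Made-up sample data, not taken from the model in the post.
sample = r"""
\data\
ngram 1=2
ngram 2=1

\1-grams:
-0.3010 SATU -0.2000
-0.3010 DUA -0.2000

\2-grams:
-0.1000 SATU DUA -0.5000

\end\
""".splitlines()

counts, flagged = check_arpa(sample)
print(counts, flagged)
```

To run it on a real file, pass `open("model/grammar")` instead of the sample lines. If the highest-order entries turn out to carry back-off weights, that would at least be consistent with the earlier error message.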

error in loading model when executing julius
By amza - 1/10/2008 - 2 Replies

I have prepared all the models needed to run Julius. The contents of the jconf file (julius.jconf) are:

-nlr model/grammar        # 2-gram
-v dict
-h hmm15/hmmdefs
-hlist tiedlist
-gprune safe        # safe pruning: the top N are reliably found; accurate
-n 10
-output 10        # number of sentences found in the 2nd pass to output
-input rawfile        # speech waveform data file (format detected automatically)
            # format: WAV (16bit) or
            #    RAW (16bit (signed short), mono, big-endian)
            #    for files not sampled at 16kHz, specify the frequency with -smpFreq
-filelist listfile.txt    # list of files to recognize
-zmean            # remove the DC component (has no effect with -input mfcfile)
-rejectshort 100    # reject input no longer than the given number of milliseconds
-lv 10000        # level threshold (0-32767)
-smpFreq 16000        # sampling frequency (Hz)
-smpPeriod 625    # sampling period (ns) (= 10000000 / smpFreq)
-demo            # same as "-progout -quiet"

When I run julius by typing:

$ julius -C julius.jconf

There were errors as follows:

STAT: include config: julius.jconf
STAT: jconf successfully finalized
STAT: *** loading AM00 _default
Stat: init_phmm: Reading in HMM definition
Stat: rdhmmdef: ascii format HMM definition
Stat: rdhmmdef: limit check passed
Stat: check_hmm_restriction: an HMM with several arcs from initial state found:
"sp"
Stat: rdhmmdef: this HMM requires multipath handling at decoding
Stat: init_phmm: defined HMMs:    17
Stat: init_phmm: logical names:    46 in HMMList
Stat: init_phmm: base phones:    17 used in logical
Stat: init_phmm: finished reading HMM definitions
STAT: making pseudo bi/mono-phone for IW-triphone
Stat: hmm_lookup: 28 pseudo phones are added to logical HMM list
STAT: *** AM00 _default loaded
STAT: *** loading LM00 _default
Error: voca_load_htkdict: line 1: triphone "uh-ah+sp" not found
Error: voca_load_htkdict: line 1: triphone "ah-sp+*" or biphone "ah-sp" not found
Error: voca_load_htkdict: the line content was: DUA             [DUA]           d uh ah sp
Error: voca_load_htkdict: line 2: triphone "ah-t+sp" not found
Error: voca_load_htkdict: line 2: triphone "t-sp+*" or biphone "t-sp" not found
Error: voca_load_htkdict: the line content was: EMPAT           [EMPAT]         ax m p ah t sp
Error: voca_load_htkdict: line 3: triphone "ah-m+sp" not found
Error: voca_load_htkdict: line 3: triphone "m-sp+*" or biphone "m-sp" not found
Error: voca_load_htkdict: the line content was: ENAM            [ENAM]          ax n ah m sp
Error: voca_load_htkdict: line 4: triphone "oh-ng+sp" not found
Error: voca_load_htkdict: line 4: triphone "ng-sp+*" or biphone "ng-sp" not found
Error: voca_load_htkdict: the line content was: KOSONG          [KOSONG]        k oh sh oh ng sp
Error: voca_load_htkdict: line 5: triphone "m-ah+sp" not found
Error: voca_load_htkdict: line 5: triphone "ah-sp+*" or biphone "ah-sp" not found
Error: voca_load_htkdict: the line content was: LIMA            [LIMA]          l ih m ah sp
Error: voca_load_htkdict: line 6: triphone "t-uh+sp" not found
Error: voca_load_htkdict: line 6: triphone "uh-sp+*" or biphone "uh-sp" not found
Error: voca_load_htkdict: the line content was: SATU            [SATU]          sh ah t uh sp
Error: voca_load_htkdict: line 9: triphone "g-ah+sp" not found
Error: voca_load_htkdict: line 9: triphone "ah-sp+*" or biphone "ah-sp" not found
Error: voca_load_htkdict: the line content was: TIGA            [TIGA]          t ih g ah sp
Error: voca_load_htkdict: begin missing phones
Error: voca_load_htkdict: ah-m+sp
Error: voca_load_htkdict: ah-sp+* or biphone ah-sp
Error: voca_load_htkdict: ah-t+sp
Error: voca_load_htkdict: g-ah+sp
Error: voca_load_htkdict: m-ah+sp
Error: voca_load_htkdict: m-sp+* or biphone m-sp
Error: voca_load_htkdict: ng-sp+* or biphone ng-sp
Error: voca_load_htkdict: oh-ng+sp
Error: voca_load_htkdict: t-sp+* or biphone t-sp
Error: voca_load_htkdict: t-uh+sp
Error: voca_load_htkdict: uh-ah+sp
Error: voca_load_htkdict: uh-sp+* or biphone uh-sp
Error: voca_load_htkdict: end missing phones
Error: init_voca: error in reading dict: 7 words failed out of 2 words
ERROR: m_fusion: failed to read dictionary, terminated
ERROR: m_fusion: failed to initialize dictionary
ERROR: Error in loading model

I know that those missing phones do not exist in the "triphones1" file, but that file was generated by the HTK toolkit. I think the problem comes from the "dict" file, which contains entries like "DUA             [DUA]           d uh ah sp": it was also generated with HTK, and each pronunciation already ends in "sp". Can you help me solve this problem? Thank you very much.

 

regards,

Amalia Zahra
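
The error messages do seem to fit the explanation suggested in the post: Julius is trying to build triphones that span the final "sp" of each entry (for example "ah-sp+*"), and those do not exist in the model. One way to test that idea is to make a copy of the dictionary with the trailing "sp" removed and point "-v" at the copy. Below is a minimal Python sketch, assuming each line has the form "WORD [OUTPUT] phone phone ... sp" as quoted above. The function name is made up for illustration, and whether this is the right fix for this particular model is an assumption.

```python
import re

def strip_trailing_sp(line):
    """Remove a final 'sp' phone from one dictionary line.

    Lines with fewer than four fields (word, output symbol, and at least
    one phone before 'sp') are returned unchanged apart from the newline.
    """
    line = line.rstrip("\n")
    if len(line.split()) >= 4:
        # Only a whole trailing token "sp" is removed; column spacing is kept.
        return re.sub(r"\s+sp\s*$", "", line)
    return line

# Entries copied from the error output above.
for entry in ("DUA             [DUA]           d uh ah sp",
              "KOSONG          [KOSONG]        k oh sh oh ng sp"):
    print(strip_trailing_sp(entry))
```

Writing the result to a new file rather than overwriting "dict" keeps the HTK-generated dictionary available for further training.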