Acoustic Model Discussions

Re: Filler words in transcripts
User: nsh
Date: 8/16/2009 9:39 pm
Views: 200
Rating: 1

If your language model has both "komsu" and "+larini", the spoken word "komsularini" will be recognized successfully as "komsu +larini".
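
As a rough illustration of the convention above, here is a minimal Python sketch that joins subword output back into whole words. It assumes that a leading "+" marks a unit that attaches to the preceding one; the function name `join_subwords` is hypothetical and not part of any toolkit.

```python
def join_subwords(tokens):
    """Merge every unit that starts with '+' into the unit before it."""
    words = []
    for token in tokens:
        if token.startswith("+") and words:
            # Suffix unit: glue it onto the previous word without the marker.
            words[-1] += token[1:]
        else:
            # Ordinary word (or a stray suffix with nothing before it).
            words.append(token.lstrip("+"))
    return words


print(" ".join(join_subwords("komsu +larini".split())))  # komsularini
```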

--- (Edited on 8/16/2009 9:39 pm [GMT-0500] by nsh) ---

Re: Filler words in transcripts
User: ercani
Date: 8/29/2009 7:50 am
Views: 81
Rating: 1

Thanks for your explanation.

It raised one question for me.

The transcript file I use to train the acoustic model contains the word "komsularini". Do I need to split it with a space in the transcript file, like this:

<s> komsu larini </s>

so that I can use it in my language model as "komsu +larini"?

I mean, if I split the word in my language model, should I also split it in the transcript file for the acoustic model?

--- (Edited on 8/29/2009 7:50 am [GMT-0500] by ercani) ---

Re: Filler words in transcripts
User: nsh
Date: 8/29/2009 10:22 am
Views: 106
Rating: 1

Yes, it's better to split the transcript words as well. For example, it will be useful for getting the word error rate.

--- (Edited on 8/29/2009 10:22 am [GMT-0500] by nsh) ---

Re: Filler words in transcripts
User: ercani
Date: 8/29/2009 10:39 am
Views: 100
Rating: 1

What do you mean when you say it will be useful for getting the word error rate?

If I train the acoustic model with

<s> komsu larini </s>

(with a space, although the word is normally written without one), will it be more efficient to recognize the split word

"komsu +larini"

in the language model? Is that what you mean?

--- (Edited on 8/29/2009 10:39 am [GMT-0500] by ercani) ---

Re: Filler words in transcripts
User: nsh
Date: 8/29/2009 11:41 am
Views: 77
Rating: 1

With a subword language model, the output of the engine will also be in subword units:

<s> komsu +larini </s>

To use word error rate tools, you need to compare it with reference prompts in the same units:

<s> komsu +larini </s>

This will not work if the word is joined in the reference. The same is true for the dictionary, which should contain the subwords so that the trainer can find them. The subword dictionary can then be used for both decoding and training.
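
To make the point about matching units concrete, here is a minimal Python sketch of a word error rate computed as token-level edit distance divided by the reference length. It is not the scoring tool shipped with any toolkit; the function name is hypothetical, and dropping the <s> and </s> markers before scoring is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Token-level Levenshtein distance divided by the reference length."""
    markers = {"<s>", "</s>"}
    ref = [t for t in reference.split() if t not in markers]
    hyp = [t for t in hypothesis.split() if t not in markers]
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_token in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_token in enumerate(hyp, start=1):
            cost = 0 if ref_token == hyp_token else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Reference split the same way as the decoder output: no errors.
print(word_error_rate("<s> komsu +larini </s>", "<s> komsu +larini </s>"))  # 0.0
# Reference with the joined word: one substitution plus one insertion.
print(word_error_rate("<s> komsularini </s>", "<s> komsu +larini </s>"))    # 2.0
```

With the joined reference, a perfectly recognized utterance is scored as 200% wrong, which is why the reference prompts should use the same subword units as the decoder output.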

--- (Edited on 8/29/2009 11:41 am [GMT-0500] by nsh) ---

Re: Filler words in transcripts
User: ercani
Date: 8/29/2009 11:54 am
Views: 129
Rating: 1

Thanks for your reply.

If I split the words to train the acoustic model, I am afraid I will get spurious words when recognizing sentences.

I can keep the subword units in the vocabulary together with the other words, but might that have a bad effect on recognizing whole words?

--- (Edited on 8/29/2009 11:54 am [GMT-0500] by ercani) ---

Re: Filler words in transcripts
User: ercani
Date: 9/4/2009 4:02 pm
Views: 72
Rating: 1

 

Hi,

I got an error while building the feature file. It is related to the phones, as follows:

Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
WARNING: This phone () occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (SIL) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (a) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (aa) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (b) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (c) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (ch) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (d) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (e) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (ea) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (f) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (g) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (gh) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (h) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (i) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (ii) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (iy) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (j) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (k) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (l) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (m) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (n) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (o) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (oe) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (p) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (r) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (rh) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (s) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (sh) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (t) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (u) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (ue) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (ug) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (uu) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (v) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (y) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (z) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)
WARNING: This phone (zh) occurs in the phonelist (/home/kapil/Work/ercan/myam/etc/myam.phone), but not in any word in the transcription (/home/kapil/Work/ercan/myam/etc/myam_train.transcription)

-------------

My phone list is as follows:

a
aa
b
c
ch
d
e
ea
f
g
gh
h
i
ii
iy
j
k
l
m
n
o
oe
p
r
rh
s
sh
t
u
ue
ug
uu
v
y
z
zh
SIL
----------

My transcriptions are as follows:

<s> üç iki dört altı yedi sekiz dokuz bir </s> (1)


<s> müzik çal ömer danış gök yüzünde kuş olsam seni görür inerdim </s> (119)


<s> müzik çal ramazan garip ses yanlızlığım </s> (124)

As you can see, there are some letters from the Turkish alphabet: ç - ü - ğ - ş - ı - ö

But those letters are not available in my phone list because they are not ASCII. Could this cause those errors? If so, what do you recommend regarding the use of "ç - ü - ğ - ş - ı - ö"?

Meanwhile, my dictionary file for the above transcript is as follows:

müzik    m ue z i k

çal    ch a l

ramazan    r a m a z a n

ömer    oe m e r

danış    d a n iy sh

görür   g oe r ue r

I have uploaded those files to:

http://rapidshare.com/files/275702188/etc.rar.html

Please let me know your comments about it.

--- (Edited on 9/4/2009 4:02 pm [GMT-0500] by ercani) ---

--- (Edited on 9/4/2009 4:16 pm [GMT-0500] by ercani) ---

Re: Filler words in transcripts
User: nsh
Date: 9/4/2009 4:15 pm
Views: 81
Rating: 1

Since all phones from your phoneset are reported as missing, it is most likely not related to the UTF-8 characters. I suspect your transcription file has an incorrect format; for example, there may be a space after the closing parenthesis at the end of a line.
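
A quick way to look for that kind of formatting problem is sketched below in Python. The expected line shape, `<s> words </s> (id)`, is taken from the transcriptions posted above; the regular expression, the function name, and the choice to ignore empty lines are assumptions, not the trainer's own check.

```python
import re

# Expected shape, inferred from the posted examples: <s> words </s> (id)
LINE_PATTERN = re.compile(r"^<s> .+ </s> \(\S+\)$")


def check_transcription_lines(lines):
    """Return (line_number, problem) pairs for lines that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text:
            continue  # assumption: empty lines are not reported here
        if text != text.rstrip():
            # Catches spaces, tabs, or a stray CR after the closing parenthesis.
            problems.append((number, "trailing whitespace"))
        elif not LINE_PATTERN.match(text):
            problems.append((number, "unexpected format"))
    return problems


sample = [
    "<s> müzik çal ramazan garip ses yanlızlığım </s> (124)\n",
    "<s> müzik çal ramazan garip ses yanlızlığım </s> (124) \n",
]
print(check_transcription_lines(sample))  # [(2, 'trailing whitespace')]
```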

 

--- (Edited on 9/4/2009 4:15 pm [GMT-0500] by nsh) ---

Re: Filler words in transcripts
User: ercani
Date: 9/4/2009 4:17 pm
Views: 66
Rating: 1

I uploaded my files here:

http://rapidshare.com/files/275702188/etc.rar.html

Could you check them?

--- (Edited on 9/4/2009 4:17 pm [GMT-0500] by ercani) ---

Re: Filler words in transcripts
User: nsh
Date: 9/4/2009 4:47 pm
Views: 66
Rating: 2

All your files are UTF-16 with CR+LF, CR terminators. They must be UTF-8 with only LF.

Training on Windows is like shooting yourself in the foot. Moreover, many interesting things aren't available on Windows. I strongly recommend that you install Linux.
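
The re-encoding mentioned above (UTF-16 with CR+LF or CR terminators to UTF-8 with only LF) can be sketched in Python as follows. This is a minimal sketch that assumes each file starts with a byte order mark, so that the "utf-16" codec can detect the byte order; the function names are hypothetical.

```python
def to_utf8_lf(raw_bytes):
    """Decode UTF-16 bytes, normalize line endings to LF, encode as UTF-8."""
    text = raw_bytes.decode("utf-16")  # the BOM, if present, selects the byte order
    # Replace CR+LF first so that it does not turn into two LF characters.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


def convert_file(source_path, target_path):
    """Read one file in binary mode and write the converted copy."""
    with open(source_path, "rb") as source:
        converted = to_utf8_lf(source.read())
    with open(target_path, "wb") as target:
        target.write(converted)
```

Writing to a separate target path keeps the original files untouched until the converted copies have been checked.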

--- (Edited on 9/4/2009 4:47 pm [GMT-0500] by nsh) ---
