Warning: Too long input(>320000 samples)

Comments

User: Sudipto
Date: 10/1/2016 2:00 pm

Views: 3694
Rating: 1

When the <<please speak>> prompt appears, I try to record "phone steve". I don't know what is happening: it just shows the warning "Too long input(>320000 samples)" in the output, and then the <<please speak>> message is shown again.

STAT: jconf successfully finalized

STAT: *** loading AM00 _default

Stat: init_phmm: Reading in HMM definition

Stat: rdhmmdef: ascii format HMM definition

Stat: rdhmmdef: limit check passed

Stat: check_hmm_restriction: an HMM with several arcs from initial state found: "sp"

Stat: rdhmmdef: this HMM requires multipath handling at decoding

Stat: rdhmmdef: no <SID> embedded

Stat: rdhmmdef: assign SID by the order of appearance

Stat: init_phmm: defined HMMs: 810

Stat: init_phmm: loading ascii hmmlist

Stat: init_phmm: logical names: 24402 in HMMList

Stat: init_phmm: base phones: 41 used in logical

Stat: init_phmm: finished reading HMM definitions

STAT: m_fusion: force multipath HMM handling by user request

STAT: making pseudo bi/mono-phone for IW-triphone

Stat: hmm_lookup: 799 pseudo phones are added to logical HMM list

STAT: *** AM00 _default loaded

STAT: *** loading LM00 _default

STAT: reading [sample.dfa] and [sample.dict]...

Stat: init_voca: read 18 words

STAT: done

STAT: Gram #0 sample registered

STAT: Gram #0 sample: new grammar loaded, now mash it up for recognition

STAT: Gram #0 sample: extracting category-pair constraint for the 1st pass

STAT: Gram #0 sample: installed

STAT: Gram #0 sample: turn on active

STAT: grammar update completed

STAT: *** LM00 _default loaded

STAT: ------

STAT: All models are ready, go for final fusion

STAT: [1] create MFCC extraction instance(s)

STAT: *** create MFCC calculation modules from AM

STAT: AM 0 _default: create a new module MFCC01

STAT: 1 MFCC modules created

STAT: [2] create recognition processing instance(s) with AM and LM

STAT: composing recognizer instance SR00 _default (AM00 _default, LM00 _default)

STAT: Building HMM lexicon tree

STAT: lexicon size: 207 nodes

STAT: coordination check passed

STAT: multi-gram: beam width set to 200 (guess) by lexicon change

STAT: wchmm (re)build completed

STAT: SR00 _default composed

STAT: [3] initialize for acoustic HMM calculation

Stat: outprob_init: state-level mixture PDFs, use calc_mix()

Stat: addlog: generating addlog table (size = 1953 kB)

Stat: addlog: addlog table generated

STAT: [4] prepare MFCC storage(s)

STAT: [5] prepare for real-time decoding

STAT: All init successfully done

STAT: ###### initialize input device

----------------------- System Information begin ---------------------

JuliusLib rev.4.3.1 (fast)

Engine specification:

- Base setup : fast

- Supported LM : DFA, N-gram, Word

- Extension :

- Compiled by : gcc -g -O2

------------------------------------------------------------

Configuration of Modules

Number of defined modules: AM=1, LM=1, SR=1

Acoustic Model (with input parameter spec.):

- AM00 "_default"

hmmfilename=hmm15/hmmdefs

hmmmapfilename=tiedlist

Language Model:

- LM00 "_default"

grammar #1:

dfa = sample.dfa

dict = sample.dict

Recognizer:

- SR00 "_default" (AM00, LM00)

------------------------------------------------------------

Speech Analysis Module(s)

[MFCC01] for [AM00 _default]

Acoustic analysis condition:

parameter = MFCC_0_D_N_Z (25 dim. from 12 cepstrum + c0, abs energy supressed with CMN)

sample frequency = 16000 Hz

sample period = 625 (1 = 100ns)

window size = 400 samples (25.0 ms)

frame shift = 160 samples (10.0 ms)

pre-emphasis = 0.97

# filterbank = 24

cepst. lifter = 22

raw energy = False

energy normalize = False

delta window = 2 frames (20.0 ms) around

hi freq cut = OFF

lo freq cut = OFF

zero mean frame = OFF

use power = OFF

CVN = OFF

VTLN = OFF

spectral subtraction = off

cep. mean normalization = yes, real-time MAP-CMN, updating mean with last 0.0 sec. input

initial mean from file = N/A

beginning data weight = 100.00

cep. var. normalization = no

base setup from = Julius defaults

------------------------------------------------------------

Acoustic Model(s)

[AM00 "_default"]

HMM Info:

810 models, 126 states, 126 mpdfs, 126 Gaussians are defined

model type = context dependency handling ON

training parameter = MFCC_N_D_Z_0

vector length = 25

number of stream = 1

stream info = [0-24]

cov. matrix type = DIAGC

duration type = NULLD

max mixture size = 1 Gaussians

max length of model = 5 states

logical base phones = 41

model skip trans. = exist, require multi-path handling

skippable models = sp (1 model(s))

AM Parameters:

Gaussian pruning = safe (-gprune)

top N mixtures to calc = 2 / 0 (-tmix)

short pause HMM name = "sp" specified, "sp" applied (physical) (-sp)

cross-word CD on pass1 = handle by approx. (use max. prob. of same LC)

sp transition penalty = -70.0

------------------------------------------------------------

Language Model(s)

[LM00 "_default"] type=grammar

DFA grammar info:

6 nodes, 6 arcs, 6 terminal(category) symbols

category-pair matrix: 32 bytes (712 bytes allocated)

Vocabulary Info:

vocabulary size = 18 words, 51 models

average word len = 2.8 models, 8.5 states

maximum state num = 15 nodes per word

transparent words = not exist

words under class = not exist

Parameters:

found sp category IDs =

------------------------------------------------------------

Recognizer(s)

[SR00 "_default"] AM00 "_default" + LM00 "_default"

Lexicon tree:

total node num = 207

root node num = 18

leaf node num = 18

(-penalty1) IW penalty1 = +5.0

(-penalty2) IW penalty2 = +20.0

(-cmalpha)CM alpha coef = 0.050000

inter-word short pause = on (append "sp" for each word tail)

sp transition penalty = -70.0

Search parameters:

multi-path handling = yes, multi-path mode enabled

(-b) trellis beam width = 200 (-1 or not specified - guessed)

(-bs)score pruning thres= disabled

(-n)search candidate num= 1

(-s) search stack size = 500

(-m) search overflow = after 2000 hypothesis poped

2nd pass method = searching sentence, generating N-best

(-b2) pass2 beam width = 200

(-lookuprange)lookup range= 5 (tm-5 <= t <tm+5)

(-sb)2nd scan beamthres = 200.0 (in logscore)

(-n) search till = 1 candidates found

(-output) and output = 1 candidates out of above

IWCD handling:

1st pass: approximation (use max. prob. of same LC)

2nd pass: loose (apply when hypo. is popped and scanned)

all possible words will be expanded in 2nd pass

build_wchmm2() used

lcdset limited by word-pair constraint

short pause segmentation = off

fall back on search fail = off, returns search failure

------------------------------------------------------------

Decoding algorithm:

1st pass input processing = real time, on-the-fly

1st pass method = 1-best approx. generating indexed trellis

output word confidence measure based on search-time scores

------------------------------------------------------------

FrontEnd:

Input stream:

input type = waveform

input source = microphone

device API = default

sampling freq. = 16000 Hz

threaded A/D-in = supported, on

zero frames stripping = on

silence cutting = on

level thres = 4000 / 32767

zerocross thres = 60 / sec.

head margin = 300 msec.

tail margin = 400 msec.

chunk size = 1000 samples

long-term DC removal = off

level scaling factor = 1.00 (disabled)

reject short input = off

reject long input = off

----------------------- System Information end -----------------------

Notice for feature extraction (01),

*************************************************************

* Cepstral mean normalization for real-time decoding: *

* NOTICE: The first input may not be recognized, since *

* no initial mean is available on startup. *

*************************************************************

Stat: adin_oss: device name = /dev/dsp1 (from AUDIODEV)

Stat: adin_oss: sampling rate = 16000Hz

Stat: adin_oss: going to set latency to 50 msec

Stat: adin_oss: audio I/O Latency = 32 msec (fragment size = 512 samples)

STAT: AD-in thread created

WARNING: adin_thread_process: too long input (> 320000 samples), segmented now

Warning: input buffer overflow: some input may be dropped, so disgard the input
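
For reference, the sample counts in the log above can be converted to time using the reported sampling rate of 16000 Hz. The short sketch below does only that arithmetic; the function name `samples_to_seconds` is made up for illustration and is not part of Julius.

```python
# Values copied from the log above.
SAMPLE_RATE_HZ = 16000       # "sampling freq. = 16000 Hz"
MAX_INPUT_SAMPLES = 320000   # limit quoted in the "too long input" warning
CHUNK_SAMPLES = 1000         # "chunk size = 1000 samples"


def samples_to_seconds(num_samples, sample_rate_hz):
    """Convert a sample count to a duration in seconds."""
    return num_samples / sample_rate_hz


# Length of one input segment before the warning is raised.
print(samples_to_seconds(MAX_INPUT_SAMPLES, SAMPLE_RATE_HZ))  # 20.0

# Duration of one chunk, in milliseconds.
print(samples_to_seconds(CHUNK_SAMPLES, SAMPLE_RATE_HZ) * 1000)  # 62.5
```

In other words, the warning is reported after 20 seconds of continuous input in which no end of utterance was detected.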

Re: Warning: Too long input(>320000 samples)

User: colbec
Date: 10/2/2016 6:47 am

Views: 3
Rating: 0

See Tony Robinson's answer to this question at http://www.voxforge.org/home/forums/message-boards/speech-recognition-engines/help-julius-doesnt-recognize-anything-and-give-warning/1 ; it might give you some ideas.
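
Independently of the linked answer, one thing that may be worth checking is whether the microphone signal ever drops below the level threshold shown in the log ("level thres = 4000 / 32767"). If background noise or input gain keeps the signal above that level all the time, silence cutting may never find the end of the utterance, which would be consistent with the 20-second segmentation warning. The sketch below is only a rough proxy and not Julius's actual detection logic: it assumes a 16-bit mono WAV recording of "silence" from the same microphone, and the helper names and the file name `silence.wav` are hypothetical.

```python
import array
import sys
import wave

LEVEL_THRESHOLD = 4000  # "level thres = 4000 / 32767" in the log


def fraction_above_level(samples, threshold=LEVEL_THRESHOLD):
    """Return the fraction of samples whose absolute amplitude exceeds threshold."""
    if not samples:
        return 0.0
    loud = sum(1 for s in samples if abs(s) > threshold)
    return loud / len(samples)


def read_mono_16bit_wav(path):
    """Read a 16-bit mono WAV file into an array of signed integers."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono audio")
        data = array.array("h")
        data.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # WAV sample data is little-endian
    return data


# Hypothetical usage with a recording of background noise only:
# samples = read_mono_16bit_wav("silence.wav")
# print(fraction_above_level(samples))
```

If a recording that should be silent still has a large fraction of samples above the threshold, lowering the input gain or raising the level threshold would be the first things to try.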
