How to deal with Out-Of-Vocabulary (OOV) Words

User: Nick
Date: 4/15/2010 9:25 am

Hi all,

Does anyone have experience dealing with out-of-vocabulary (OOV) words? As far as I know, there are two possible ways to handle the problem:

1) Use the confidence score as a rejection criterion. This can be done with the likelihood ratio (see this paper: http://www.icis.ntu.edu.sg/scs-ijit/117/117_15.pdf "The Confidence Measure for Isolated Word Recognition System"), but where in the Julius output can I find the appropriate likelihoods?

2) Create "garbage" models from utterances that do not belong to the vocabulary of the ASR system. For this I would need to record many other utterances and roughly group them by length and so on. The disadvantage is that overly sensitive garbage models can cause false alarms.
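
To make the idea concrete, the two approaches can be combined into a likelihood-ratio test: compare the log-likelihood of the best in-vocabulary word against that of a garbage model, normalized by the number of frames, and reject the utterance when the margin is too small. The Python sketch below is only an illustration. The function name `is_oov`, its arguments, and the default threshold are hypothetical, none of them come from Julius, and the threshold would have to be tuned on held-out data.

```python
def is_oov(word_loglik, garbage_loglik, num_frames, threshold=0.5):
    """Return True if the utterance should be rejected as out-of-vocabulary.

    word_loglik    -- log-likelihood of the best in-vocabulary hypothesis
    garbage_loglik -- log-likelihood of the same audio under a garbage model
    num_frames     -- number of acoustic frames in the utterance
    threshold      -- hypothetical rejection threshold (must be tuned)
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Per-frame log-likelihood ratio: large values mean the word model
    # explains the audio much better than the garbage model does.
    llr = (word_loglik - garbage_loglik) / num_frames
    return llr < threshold
```

Raising the threshold rejects more OOV input but also rejects more valid words, which is the false-alarm trade-off mentioned in 2).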

Does anybody have good advice on how to solve the OOV problem with Julius?
I mainly need this for isolated word recognition.

Regards, Niko

--- (Edited on 4/15/2010 9:25 am [GMT-0500] by Visitor) ---

Re: How to deal with Out-Of-Vocabulary (OOV) Words

User: kmaclean
Date: 4/19/2010 12:47 pm

>where in the Julius output can I find the appropriate likelihoods?

See this thread: For Noisy Input

Simon now uses Julius confidence scoring.
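
As a rough illustration of using those confidence scores for rejection, the Python sketch below pulls per-word confidence values out of captured Julius text output and accepts a result only if every word clears a minimum. It assumes the output contains lines of the form `cmscore1: 1.000 0.432 1.000`. The helper names `parse_cmscores` and `accept` and the 0.6 threshold are hypothetical and should be checked against your own Julius version and data.

```python
import re

# Assumed line format: "cmscore1: 1.000 0.432 1.000"
CMSCORE_RE = re.compile(r"^cmscore\d+:\s*(.*)$")


def parse_cmscores(julius_output):
    """Return one list of per-word confidence scores per cmscore line."""
    results = []
    for line in julius_output.splitlines():
        match = CMSCORE_RE.match(line.strip())
        if match:
            results.append([float(x) for x in match.group(1).split()])
    return results


def accept(scores, min_conf=0.6):
    """Accept a hypothesis only if every word reaches min_conf (hypothetical threshold)."""
    return bool(scores) and min(scores) >= min_conf
```

A hypothesis that fails `accept` would then be treated as OOV or noise rather than passed on as a recognized word.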

Ken

--- (Edited on 4/19/2010 1:47 pm [GMT-0400] by kmaclean) ---

--- (Edited on 4/19/2010 2:23 pm [GMT-0400] by kmaclean) ---
