Does anyone have experience with handling out-of-vocabulary (OOV) words? As far as I know, there are two possible ways to manage the problem:
1) Use a confidence score as the rejection criterion. This can be computed as a likelihood ratio (see this paper: http://www.icis.ntu.edu.sg/scs-ijit/117/117_15.pdf "The Confidence Measure for Isolated Word Recognition System"), but where in the Julius output can I find the appropriate likelihoods?
2) Create "garbage" models from utterances that do not belong to the vocabulary of the ASR system. For this I would need to record many other utterances and roughly group them by length and so on. The disadvantage is that overly sensitive garbage models can cause false alarms.
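To make option 1 concrete, here is a minimal sketch of the likelihood-ratio test described in the paper. The scores and the threshold value are hypothetical: `ll_word` stands for the log-likelihood of the best in-vocabulary hypothesis and `ll_garbage` for that of a filler/garbage (or best competing) model, both of which would come from the recognizer's output.

```python
def accept_by_llr(ll_word, ll_garbage, threshold=2.0):
    """Likelihood-ratio test for OOV rejection.

    Accept the utterance only if
        log P(O | word model) - log P(O | garbage model) >= threshold
    The threshold is task-dependent and must be tuned on held-out data.
    """
    return (ll_word - ll_garbage) >= threshold

# Hypothetical log-likelihoods (in the log domain, so more negative = worse):
print(accept_by_llr(-100.0, -110.0))  # in-vocabulary word clearly wins
print(accept_by_llr(-100.0, -99.0))   # garbage model wins -> reject
```

Because the scores are log-likelihoods, the ratio becomes a simple difference; the whole decision reduces to one subtraction and a threshold comparison.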
Does anybody have good advice on how to solve the OOV problem with Julius?
I mainly need this for isolated word recognition.
--- (Edited on 4/15/2010 9:25 am [GMT-0500] by Visitor) ---
>where in the Julius output can I find the appropriate likelihoods?
See this thread: For Noisy Input
Simon now uses Julius confidence scoring.
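For reference, when Julius is run with confidence scoring enabled, its per-word confidence measures typically appear on `cmscore1:` lines aligned with the words on the matching `sentence1:` line. Below is a hedged sketch of post-processing that output to reject likely-OOV results; the exact line format and the 0.7 threshold are assumptions to be checked against your Julius version and tuned for your task.

```python
def parse_julius_result(output):
    """Extract (word, confidence) pairs from Julius stdout, assuming
    'sentence1:' / 'cmscore1:' lines produced with confidence scoring on."""
    words, scores = [], []
    for line in output.splitlines():
        if line.startswith("sentence1:"):
            words = line.split(":", 1)[1].split()
        elif line.startswith("cmscore1:"):
            scores = [float(s) for s in line.split(":", 1)[1].split()]
    return list(zip(words, scores))

def accept(pairs, threshold=0.7):
    """Reject the utterance if any content word scores below the threshold.
    Sentence-boundary symbols are ignored."""
    content = [(w, c) for w, c in pairs if w not in ("<s>", "</s>")]
    return bool(content) and all(c >= threshold for _, c in content)

# Hypothetical Julius output for an isolated-word result:
example = "sentence1: <s> HELLO </s>\ncmscore1: 0.500 0.912 0.500"
pairs = parse_julius_result(example)
print(accept(pairs))  # HELLO scores 0.912 >= 0.7, so the word is accepted
```

For isolated word recognition the decision reduces to thresholding a single word's confidence, which is exactly the rejection criterion the paper above proposes.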
--- (Edited on 4/19/2010 1:47 pm [GMT-0400] by kmaclean) ---
--- (Edited on 4/19/2010 2:23 pm [GMT-0400] by kmaclean) ---