
General Discussion

Log Phenome?
User: Tom J.
Date: 11/18/2020 11:17 am
Views: 490
Rating: 0

I'm currently working with Julius. An initial thought I had was to log the phonemes as heard by Julius so I could spot problems with my dialect vs. the model.

It just dawned on me that my AI really doesn't need to be trained on human-readable responses. If there were a way to use Julius, or another speech recognition tool for that matter, to log the phonemes, then the API I've written could return them as a string.

My AI could have a limitless vocabulary, and without the step of passing through the dictionary the string should come back very quickly.

Now I'm wondering if simply writing a .voca file with just the individual phonemes as matches, rather than words as matches to phoneme groups, would achieve this.
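As a rough illustration of that idea, here is a minimal C++ sketch that builds the text of a .voca category in which every phoneme is its own "word". It assumes the usual Julius grammar-kit layout (a `% CATEGORY` line followed by one `output<TAB>pronunciation` entry per line). The function name `makePhonemeVoca`, the category name, and the phoneme list are all hypothetical, and a matching .grammar file would still be needed.

```cpp
#include <string>
#include <vector>

// Build the text of one .voca category where each phoneme is its own entry.
// Assumed layout: "% CATEGORY" header, then "<output string>\t<pronunciation>".
// The category name and phoneme symbols are supplied by the caller; nothing
// here is taken from a real acoustic model.
std::string makePhonemeVoca(const std::string& category,
                            const std::vector<std::string>& phonemes) {
    std::string out = "% " + category + "\n";
    for (const std::string& p : phonemes) {
        // The returned string and the pronunciation are the same symbol.
        out += p + "\t" + p + "\n";
    }
    return out;
}
```

Calling `makePhonemeVoca("PHONE", {"f", "ow", "n"})` would produce a `% PHONE` header followed by three one-phoneme entries, which could then be written to a file and compiled with the rest of the grammar.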

Anybody ever explore this particular rabbit hole?

--- (Edited on 11/18/2020 11:17 am [GMT-0600] by Tom J.) ---

Re: Log Phenome?
User: kmaclean
Date: 11/18/2020 11:36 am
Views: 13
Rating: 1

take a look at wav2letter - which started out as a way of predicting letters directly from the raw waveform

 

--- (Edited on 11/18/2020 12:36 pm [GMT-0500] by kmaclean) ---

Re: Log Phenome?
User: Tom J.
Date: 11/20/2020 10:57 am
Views: 29
Rating: 0

Thank you kmaclean,

I made a direct phoneme-to-phoneme dictionary and it works, but Julius never appears to hear the same thing twice without triphones to match, and there are no distinct spaces between words since every phoneme is considered a word.

In all I'm glad I conducted the experiment.

I'm going to take the wav2letter suggestion next and play with it, but I can't help wondering whether Julius has options in the jconf to set the spacing of words. If I could do an additional space between words it would be a matter of iterating the string in C++ and eliminating single spaces only.
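The post-processing step described here can be sketched in a few lines of C++. This assumes the recognizer output already separates phonemes with a single space and words with two or more spaces, which is the hypothetical spacing option being asked about, not something Julius is known to provide. The name `joinPhonemes` is made up for the example.

```cpp
#include <cstddef>
#include <string>

// Drop single spaces (phoneme separators) and collapse any run of two or
// more spaces (assumed word separators) into exactly one space.
std::string joinPhonemes(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] != ' ') {
            out += in[i++];
            continue;
        }
        // Count the length of this run of spaces.
        std::size_t run = 0;
        while (i < in.size() && in[i] == ' ') {
            ++run;
            ++i;
        }
        if (run >= 2) {
            out += ' ';  // word boundary survives as one space
        }
    }
    return out;
}
```

For example, `joinPhonemes("f ow n  s t iy v")` yields `"fown stiyv"`. Note that gluing multi-letter phoneme symbols together loses the boundaries between them, so a different joiner (or none) may be preferable if the phonemes need to be recovered later.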

Then perhaps write a dictionary that's more or less matching chunks of words.

--- (Edited on 11/20/2020 10:57 am [GMT-0600] by Tom J.) ---

--- (Edited on 11/20/2020 10:59 am [GMT-0600] by Tom J.) ---

Re: Log Phenome?
User: kmaclean
Date: 12/1/2020 9:51 am
Views: 2
Rating: 0

> If I could do an additional space between words it would be a matter of iterating the string in c++ and eliminating single spaces only.

>Then perhaps write a dictionary that's more or less matching chunks of words.

I'm still not clear on what you are trying to do... but Grammar files can be designed to return as long a string as you want.

--- (Edited on 12/1/2020 10:51 am [GMT-0500] by kmaclean) ---

Re: Log Phenome?
User: Tom J.
Date: 12/2/2020 10:51 am
Views: 58
Rating: 0

>I'm still not clear on what you are trying to do...

It's a little robot with an AI, and I have several more planned. I've got Julius running on an SBC and now I'm working through the challenges of fine tuning - getting a thorough dictionary that's as lean as I can make it, building an API, and making a language model with HTK. The tiny computer is unable to run HTK as it's an ARM processor, so I'm building the LM on a big computer.

Right now I'm experimenting with different ways to return strings with Julius and sorting through the big VoxForge phoneme file to locate words for my dictionary.

I really wish there was a C++ header (.h) file to pause/start Julius and return strings, but I haven't found one so I'm writing one, and it's very crude as I'm relatively new to writing code.

I've found Julius server and tested it, so maybe there's something in there I can use. It's obvious I'm not breaking new ground here, because the sample voca is literally for using Julius in this way: "phone steve" and such...
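For what it's worth, the server (module) mode may already cover much of the pause/start/return-a-string wish. As far as I understand it, Julius started with `-module` listens on a TCP port (10500 by default), accepts text commands such as `PAUSE` and `RESUME`, and sends results as XML-like messages containing `<WHYPO WORD="..."/>` elements. The sketch below only covers the string-extraction half and assumes that message shape. The function name `extractWords` is hypothetical, and the socket handling is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Pull the recognized words out of one module-mode result message.
// Assumes each word appears as an attribute of the form  WORD="..."
// (the exact message layout is an assumption, check it against real output).
std::vector<std::string> extractWords(const std::string& message) {
    const std::string key = " WORD=\"";
    std::vector<std::string> words;
    std::size_t pos = 0;
    while ((pos = message.find(key, pos)) != std::string::npos) {
        const std::size_t start = pos + key.size();
        const std::size_t end = message.find('"', start);
        if (end == std::string::npos) {
            break;  // unterminated attribute: stop rather than guess
        }
        words.push_back(message.substr(start, end - start));
        pos = end + 1;
    }
    return words;
}
```

A client would read from the socket until a message is complete, pass the text to `extractWords`, and join the result with spaces to get a string like "phone steve".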

I did have a couple of short verbal conversations with it already, but I've still got a long road to travel.

--- (Edited on 12/2/2020 10:51 am [GMT-0600] by Tom J.) ---

Re: Log Phenome?
User: Tom J.
Date: 1/3/2021 2:22 pm
Views: 78
Rating: 0

Almost done assembling my .voca file and I had a quick question about order.

If Julius is looking for a match and finds it, will it move on or keep looking at the rest of the words?

I'm asking because I got a list of the 100 most frequently spoken words in English and moved those words to the top of the .voca file. If it's beneficial I can get the top 1000, but it's a good deal of work to do if it's to no avail.

 

--- (Edited on 1/3/2021 2:22 pm [GMT-0600] by Tom J.) ---
