I would like to get your input on the following.
As far as I understand, the GPL doesn't allow non-free derivative works of a GPL-licensed work.
However, I believe that it is impossible to know whether an acoustic model has been built from VoxForge's audio corpora. Basically, the creation of an acoustic model requires:
pre-processing -> feature vector extraction -> classification
For example, a hidden Markov model is composed of state transition probabilities and of pairs of means and variances for the observation probability distributions...
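As a rough illustration of that pipeline, here is a minimal Python sketch. It is a toy under stated assumptions, not how any real trainer works: the feature is a single log-energy value per frame, one Gaussian stands in for one HMM state, the audio is random noise, and the names `frame_log_energy` and `train_state` are made up for this example. The point it shows is only that the trained parameters are aggregate statistics, with none of the original samples kept.

```python
import math
import random
from statistics import fmean, pvariance


def frame_log_energy(samples, frame_len=160):
    """Toy feature extraction: one log-energy value per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + 1e-10))  # small offset avoids log(0)
    return feats


def train_state(features):
    """Toy 'training': summarize the frames of one state by a mean and a variance."""
    return {"mean": fmean(features), "var": pvariance(features)}


# Stand-in for recorded audio: random noise (hypothetical data, not a real corpus).
rng = random.Random(0)
audio = [rng.gauss(0.0, 0.1) for _ in range(16000)]

features = frame_log_energy(audio)
model = train_state(features)

# 16000 samples are reduced to two numbers; the audio itself is not in the model.
print(len(audio), "samples ->", len(features), "frames ->", model)
```

The resulting `model` holds just two floats, which is why inspecting the model alone says little about which recordings produced it.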
There's no way to be sure that an acoustic model comes from VoxForge's audio corpora, and the commercial product in question will never have to ship the audio corpora, only the acoustic models. Given this, how do you check that a commercial product doesn't use your corpora?
Thanks for your input
Mathieu
--- (Edited on 5/28/2008 9:25 am [GMT-0500] by Visitor) ---
Hi Mathieu,
>commercial product in question will never have to ship the audio corpora, only
>the acoustic models... In this respect, how do you control that a commercial
>product doesn't use your corpora?
We will likely have to rely on evidence that is not contained in the acoustic model itself, like disgruntled employees...
I would welcome any input on possible technical solutions to this problem.
Thanks,
Ken
--- (Edited on 5/28/2008 1:45 pm [GMT-0400] by kmaclean) ---
--- (Edited on 5/28/2008 10:43 pm [GMT-0500] by Visitor) ---
A related question:
I'm currently building models that contain both VoxForge and non-VoxForge audio. Is this allowed at all?
I'll happily release the models; I have just been too lazy to do so yet (also because they're not really usable for anything).
Cheers, Timo
--- (Edited on 2008-06-11 13:59 [GMT+0200] by timobaumann) ---
Hi Timo,
The general rule is that you can *use* a GPL'ed work (like VoxForge speech audio & transcription texts) any way you like. However, the moment you *distribute* a GPL'ed work, or derivative works thereof, then the GPL license requires that the *entire* work be distributed under the GPL.
Therefore, in your particular case, the GPL does not prevent you from creating a 'binary' acoustic model from VoxForge and non-VoxForge 'source' audio, and *using* it for your own purposes.
However, if you decide to *distribute* this acoustic model, it is covered by the GPL, and you must make available all the 'source' (VoxForge and non-VoxForge audio, and texts, ...) which was used to create the 'binary' acoustic model.
The FSF FAQ has some information that is helpful with respect to how you might distribute a large corpus of 'source' audio:
>Can I put the binaries on my Internet server and put the source on a different Internet site?
>The GPL says you must offer access to copy the source code "from the same place"; that is, next to the binaries. However, if you make arrangements with another site to keep the necessary source code available, and put a link or cross-reference to the source code next to the binaries, we think that qualifies as "from the same place".
...
Ken
p.s. I am not a lawyer, and this is not a legal opinion
--- (Edited on 6/11/2008 2:38 pm [GMT-0400] by kmaclean) ---
Hi Ken,
Thanks for that advice. I would probably have done the wrong thing by releasing the models, then. (It's a shame, because the combined model works far better than any of the sub-corpus models, but if that's what the license says...) Well, let's hope that I will be able to reach the same performance with VoxForge-only models someday.
Cheers, Timo
--- (Edited on 2008-06-12 08:29 [GMT+0200] by timobaumann) ---