Speech Recognition in the News

Facebook AI - Multilingual LibriSpeech (MLS)
User: kmaclean
Date: 7/16/2021 11:02 am
From the Facebook AI website:

Facebook AI is releasing Multilingual LibriSpeech (MLS), a large-scale, open source data set designed to help advance research in automatic speech recognition (ASR).

MLS provides more than 50,000 hours of audio across eight languages: English, German, Dutch, French, Spanish, Italian, Portuguese, and Polish. It also provides language-model training data and pretrained language models along with baselines to help researchers compare different ASR systems. Because it leverages public domain audiobooks from the LibriVox project, MLS offers a large data set with a broad range of different speakers, and it can be released with a nonrestrictive license.

MLS is available on OpenSLR: