Bangla cleaned speech corpus, specially developed for Bangla Text to Speech
A cleaned Bangla speech corpus, developed in 2009 specifically for Bangla text-to-speech (TTS). It was originally hosted on SourceForge.
This dataset consists of three different corpora, each developed for a different purpose.
Other characteristics include:
Because the corpora are large (4.4 GB), we uploaded the data to Mendeley and also kept a copy on SourceForge.
Option 1: the Mendeley page.
Option 2: SourceForge.
Firoj Alam, S. M. Murtoza Habib, Dil Afroza Sultana and Mumit Khan, Development of Annotated Bangla Speech Corpora, Spoken Languages Technologies for Under-Resourced Languages (SLTU’10), vol. 1, pp. 35-41, Penang, Malaysia, May 3-5, 2010.
@inproceedings{alam2010development,
title={Development of annotated Bangla speech corpora},
author={Alam, Firoj and Habib, SM Murtoza and Sultana, Dil Afroza and Khan, Mumit},
booktitle={Spoken Languages Technologies for Under-Resourced Languages},
year={2010}
}