DNN-Based Speech Recognition for Globalphone Languages
by M. Y. Tachbelie, A. Abulimiti, S. T. Abate, T. Schultz
Abstract:
This paper describes new reference benchmark results based on hybrid Hidden Markov Model and Deep Neural Networks (HMM-DNN) for the GlobalPhone (GP) multilingual text and speech database. GP is a multilingual database of high-quality read speech with corresponding transcriptions and pronunciation dictionaries in more than 20 languages. Moreover, we provide new results for five additional languages, namely, Amharic, Oromo, Tigrigna, Wolaytta, and Uyghur. Across the 22 languages considered, the hybrid HMM-DNN models outperform the HMM-GMM based models regardless of the size of the training speech used. Overall, we achieved relative improvements that range from 7.14% to 59.43%.
Reference:
DNN-Based Speech Recognition for Globalphone Languages (M. Y. Tachbelie, A. Abulimiti, S. T. Abate, T. Schultz), In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8269-8273, 2020.
Bibtex Entry:
@INPROCEEDINGS{9053144,
  author={M. Y. {Tachbelie} and A. {Abulimiti} and S. T. {Abate} and T. {Schultz}},
  booktitle={ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title={DNN-Based Speech Recognition for Globalphone Languages},
  year={2020},
  pages={8269--8273},
  abstract={This paper describes new reference benchmark results based on hybrid Hidden Markov Model and Deep Neural Networks (HMM-DNN) for the GlobalPhone (GP) multilingual text and speech database. GP is a multilingual database of high-quality read speech with corresponding transcriptions and pronunciation dictionaries in more than 20 languages. Moreover, we provide new results for five additional languages, namely, Amharic, Oromo, Tigrigna, Wolaytta, and Uyghur. Across the 22 languages considered, the hybrid HMM-DNN models outperform the HMM-GMM based models regardless of the size of the training speech used. Overall, we achieved relative improvements that range from 7.14\% to 59.43\%.},
  keywords={GlobalPhone;DNN;Ethiopian Languages},
  doi={10.1109/ICASSP40776.2020.9053144},
  ISSN={2379-190X},
  month={May},
  url = "https://www.csl.uni-bremen.de/cms/images/documents/publications/martha_ICASSP2020.pdf",
}