Paper

Peer-reviewed
November 2019

Multi-lingual transformer training for Khmer automatic speech recognition

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
  • Kak Soky
  • Sheng Li
  • Tatsuya Kawahara
  • Sopheap Seng

Start page
1893
End page
1896
Language
Publication type
Research paper (international conference proceedings)
DOI
10.1109/APSIPAASC47483.2019.9023137

© 2019 IEEE. Currently, there are three challenges in constructing reliable ASR systems for the Khmer language: (1) the lack of language resources (text and speech corpora) in digital form, (2) a writing system without explicit word boundaries, and (3) a pronunciation model that is not well studied. In this paper, to avoid the extensive work of selecting proper acoustic units (e.g., phones, syllables) and preparing frame-level labels in the traditional DNN-HMM framework, we directly use words or characters as labels with a state-of-the-art transformer-based end-to-end model. Moreover, we use a multi-lingual training framework to tackle the low-resource data problem. All experiments are performed on the Basic Expressions Travel Corpus (BTEC) datasets. The experiments show that the proposed multi-lingual transformer-based end-to-end model achieves a significant improvement over the DNN-HMM baseline model.

Footnote 1: The work was performed while Mr. Kak Soky was at NIPTICT. He is currently with the Ministry of Education, Youth, and Sports (MoEYS), Cambodia.

Links
DOI
https://doi.org/10.1109/APSIPAASC47483.2019.9023137
Scopus
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85082391236&origin=inward
Scopus Citedby
https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85082391236&origin=inward
ID information
  • DOI : 10.1109/APSIPAASC47483.2019.9023137
  • SCOPUS ID : 85082391236
