November 2019
Multi-lingual transformer training for Khmer automatic speech recognition
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
- First page
- 1893
- Last page
- 1896
- Language
- Publication type
- Research paper (international conference proceedings)
- DOI
- 10.1109/APSIPAASC47483.2019.9023137
© 2019 IEEE. Currently, there are three challenges in constructing reliable ASR systems for the Khmer language: (1) the lack of language resources (text and speech corpora) in digital form, (2) a writing system without explicit word boundaries, and (3) a pronunciation model that is not well studied. In this paper, to avoid the extensive work of selecting proper acoustic units (e.g., phones, syllables) and preparing frame-level labels in the traditional DNN-HMM framework, we directly use words or characters as labels with a state-of-the-art transformer-based end-to-end model. Moreover, we use a multi-lingual training framework to tackle the low-resource data problem. All experiments are performed on the Basic Expressions Travel Corpus (BTEC) datasets. The experiments show that the proposed multi-lingual transformer-based end-to-end model can achieve a significant improvement over the DNN-HMM baseline model.
Note: The work was performed while Mr. Kak Soky was at NIPTICT. He is currently with the Ministry of Education, Youth, and Sports (MoEYS), Cambodia.
- Link information
- ID information
- DOI : 10.1109/APSIPAASC47483.2019.9023137
- SCOPUS ID : 85082391236