Multi-lingual transformer training for Khmer automatic speech recognition

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019

Kak Soky
Sheng Li
Tatsuya Kawahara
Sopheap Seng

First page: 1893
Last page: 1896
Language
Publication type: Research paper (international conference proceedings)
DOI: 10.1109/APSIPAASC47483.2019.9023137

© 2019 IEEE. Currently, three challenges stand in the way of constructing reliable ASR systems for the Khmer language: (1) the lack of language resources (text and speech corpora) in digital form, (2) a writing system without explicit word boundaries, and (3) a pronunciation model that has not been well studied. In this paper, to avoid the extensive work of selecting proper acoustic units (e.g., phones, syllables) and preparing the frame-level labels required by the traditional DNN-HMM framework, we directly use words or characters as labels with a state-of-the-art transformer-based end-to-end model. Moreover, we use a multi-lingual training framework to tackle the low-resource data problem. All experiments are performed on the Basic Expressions Travel Corpus (BTEC) datasets. The experiments show that the proposed multi-lingual transformer-based end-to-end model achieves a significant improvement over the DNN-HMM baseline model.

Note: This work was performed while Mr. Kak Soky was at NIPTICT. He is currently with the Ministry of Education, Youth, and Sports (MoEYS), Cambodia.

Links

DOI: https://doi.org/10.1109/APSIPAASC47483.2019.9023137
Scopus: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85082391236&origin=inward
Scopus Citedby: https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85082391236&origin=inward

Identifiers

DOI: 10.1109/APSIPAASC47483.2019.9023137
Scopus ID: 85082391236
