2018
GENERATING SOUND WORDS FROM AUDIO SIGNALS OF ACOUSTIC EVENTS WITH SEQUENCE-TO-SEQUENCE MODEL
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
- First page
- 346
- Last page
- 350
- Language
- English
- Publication type
- Research paper (international conference proceedings)
- DOI
- 10.1109/ICASSP.2018.8462034
- Publisher
- IEEE
Representing sounds in language through sound words (onomatopoeias) is not only useful as an auxiliary means for automatic speech recognition but also essential in emerging fields such as natural human-machine communication, searching audio archives for acoustic events, and sound-based abnormality detection. This paper proposes a novel method for generating sound words from audio signals. The method uses an end-to-end, sequence-to-sequence framework to address two problems: audio segmentation, that is, finding the segment of an audio signal along the time axis that corresponds to a sequence of phonemes; and ambiguity, where multiple words may correspond to the same sound depending on the situation or listener. In our tests, the method worked efficiently and achieved a 2.8% mean phoneme error rate (MPER) and a 7.2% word error rate (WER) on a sound word generation task.
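The abstract reports a mean phoneme error rate without defining it here. As a rough illustration only, the sketch below assumes the common definition: the Levenshtein (edit) distance between the reference and generated phoneme sequences, divided by the reference length, then averaged over samples. The function names and the example phonemes are hypothetical, and the paper's actual evaluation procedure may differ.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hyp
                curr[j - 1] + 1,     # insertion: extra token in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def mean_phoneme_error_rate(
    pairs: Iterable[Tuple[Sequence[str], Sequence[str]]]
) -> float:
    """Average of per-sample phoneme error rates (assumed definition of MPER)."""
    rates = [phoneme_error_rate(ref, hyp) for ref, hyp in pairs]
    if not rates:
        raise ValueError("at least one (reference, hypothesis) pair is required")
    return sum(rates) / len(rates)


# Hypothetical example: a reference sound word and two generated candidates.
reference = ["p", "i", "p", "i"]
samples = [
    (reference, ["p", "i", "p", "i"]),  # exact match
    (reference, ["p", "i", "p", "o"]),  # one substituted phoneme
]
mper = mean_phoneme_error_rate(samples)
```

Applying the same `edit_distance` to word tokens instead of phonemes would give a word-level rate, although whether the paper computes its WER that way is not stated in this record.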
- Links
- DOI
- https://doi.org/10.1109/ICASSP.2018.8462034
- DBLP
- https://dblp.uni-trier.de/rec/conf/icassp/IkawaK18
- Web of Science
- https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000446384600069&DestApp=WOS_CPL
- Dblp Cross Ref
- https://dblp.uni-trier.de/conf/icassp/2018
- Dblp Url
- https://dblp.uni-trier.de/db/conf/icassp/icassp2018.html#IkawaK18
- Identifiers
- DOI : 10.1109/ICASSP.2018.8462034
- DBLP ID : conf/icassp/IkawaK18
- Web of Science ID : WOS:000446384600069