October 2019
Emotional Voice Conversion Using Dual Supervised Adversarial Networks With Continuous Wavelet Transform F0 Features
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
- Volume
- 27
- Issue
- 10
- First page
- 1535
- Last page
- 1548
- Language
- English
- Publication type
- Research paper (academic journal)
- DOI
- 10.1109/TASLP.2019.2923951
- Publisher
- IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
In emotional voice conversion (VC), fundamental frequency (F0) is the most important feature for representing emotion, yet a simple representation of F0 is difficult to work with. To address this issue, we propose the adaptive scales continuous wavelet transform (ADS-CWT) method, which systematically captures F0 features at different temporal levels; these levels represent different prosodic aspects, ranging from micro-prosody to the sentence level. Moreover, in an emotional VC task, each dataset pairs labeled emotional voice with neutral voice, a setting that can be regarded as a dual task. Two observations motivate our training framework. First, dual supervised learning can improve training performance by leveraging the probabilistic connection between the dual tasks to enhance learning from labeled data. Second, generative adversarial networks (GANs) can mitigate the over-smoothing problem that arises in the low-level data space when converting acoustic features. We therefore present a novel training framework for emotional VC that combines GANs with dual supervised learning, named dual supervised adversarial networks. In emotional VC experiments, we confirmed that our method achieves high similarity when only limited labeled data are available. Our method achieves good and consistent performance in both objective and subjective evaluations.
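To make concrete the idea of capturing F0 features at several temporal levels, the following is a minimal sketch of a multi-scale continuous wavelet transform applied to a log-F0 contour. It is not the ADS-CWT method from the paper: the scales here are fixed by hand, not chosen adaptively, and the Mexican hat wavelet, the 5 ms frame shift, the function names `mexican_hat` and `cwt_f0`, and the synthetic contour are all assumptions made for illustration.

```python
import numpy as np


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet evaluated at dimensionless time t."""
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)


def cwt_f0(log_f0, scales, frame_shift=0.005):
    """Decompose a continuous log-F0 contour into one component per scale.

    log_f0      : 1-D array, log-F0 per frame (unvoiced gaps already interpolated)
    scales      : wavelet scales in seconds (hand-picked here, NOT adaptive)
    frame_shift : frame period in seconds (assumed 5 ms)
    Returns an array of shape (len(scales), len(log_f0)).
    """
    x = np.asarray(log_f0, dtype=float)
    # Zero-mean, unit-variance normalization before the transform.
    x = (x - x.mean()) / (x.std() + 1e-8)
    out = np.empty((len(scales), len(x)))
    for i, s in enumerate(scales):
        s_frames = s / frame_shift             # scale expressed in frames
        half = int(np.ceil(5 * s_frames))      # kernel support: +/- 5 scales
        t = np.arange(-half, half + 1) / s_frames
        kernel = mexican_hat(t) / np.sqrt(s_frames)
        # Pad with edge values so every output row has the input length.
        padded = np.pad(x, half, mode="edge")
        out[i] = np.convolve(padded, kernel, mode="valid")
    return out


# Synthetic example: a 2 s contour with a slow 1 Hz movement around log(120 Hz).
frame_shift = 0.005
time = np.arange(0.0, 2.0, frame_shift)
log_f0 = np.log(120.0) + 0.1 * np.sin(2.0 * np.pi * 1.0 * time)

scales = [0.01, 0.04, 0.16, 0.64]              # seconds, short to long
components = cwt_f0(log_f0, scales, frame_shift)
print(components.shape)
```

In this sketch the short scales respond to fast, local F0 movement and the long scales to slow, phrase-like movement, so the slow synthetic contour produces much more energy in the 0.16 s row than in the 0.01 s row.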
- Link information
- DOI
- https://doi.org/10.1109/TASLP.2019.2923951
- DBLP
- https://dblp.uni-trier.de/rec/journals/taslp/LuoCTA19
- Web of Science
- https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000473621000004&DestApp=WOS_CPL
- DBLP URL
- https://dblp.uni-trier.de/db/journals/taslp/taslp27.html#LuoCTA19
- URL
- https://publons.com/wos-op/publon/36111242/
- ID information
- DOI : 10.1109/TASLP.2019.2923951
- ISSN : 2329-9290
- eISSN : 2329-9304
- DBLP ID : journals/taslp/LuoCTA19
- ORCID Put Code : 135826040
- Web of Science ID : WOS:000473621000004