Paper

Peer-reviewed, international journal
November 2020

A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction.

Findings of the Association for Computational Linguistics: EMNLP 2020
  • Masato Mita
  • Shun Kiyono
  • Masahiro Kaneko
  • Jun Suzuki
  • Kentaro Inui

Start page
267
End page
280
Language
English
Publication type
DOI
10.18653/v1/2020.findings-emnlp.26
Publisher
Association for Computational Linguistics

Existing approaches for grammatical error correction (GEC) largely rely on
supervised learning with manually created GEC datasets. However, there has been
little focus on verifying and ensuring the quality of the datasets, and on how
lower-quality data might affect GEC performance. We indeed found that there is
a non-negligible amount of "noise" where errors were inappropriately edited or
left uncorrected. To address this, we designed a self-refinement method whose
key idea is to denoise these datasets by leveraging the prediction
consistency of existing models; it outperformed strong denoising baseline
methods. We further applied task-specific techniques and achieved
state-of-the-art performance on the CoNLL-2014, JFLEG, and BEA-2019 benchmarks.
We then analyzed the effect of the proposed denoising method and found that
our approach improves the coverage of corrections and facilitates fluency
edits, which is reflected in higher recall and overall performance.
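
As a rough illustration of the denoising idea in the abstract, the sketch below rewrites a noisy reference whenever an existing model's prediction looks better. It is a minimal sketch under stated assumptions, not the authors' implementation: the function `refine`, the callables `correct` (standing in for a trained GEC model) and `score` (standing in for some quality measure of a sentence), and the toy data are all hypothetical names introduced here.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, reference correction)


def refine(
    pairs: List[Pair],
    correct: Callable[[str], str],   # hypothetical: an existing GEC model
    score: Callable[[str], float],   # hypothetical: higher means better text
) -> List[Pair]:
    """Return a copy of `pairs` whose targets are possibly replaced."""
    refined = []
    for source, target in pairs:
        prediction = correct(source)
        # Replace the reference only if the prediction scores strictly
        # higher; on a tie the original reference is kept.
        best = prediction if score(prediction) > score(target) else target
        refined.append((source, best))
    return refined


# Toy stand-ins (assumptions for illustration only).
KNOWN_ERRORS = {"go", "apple"}
TOY_MODEL = {"He go to school .": "He goes to school ."}


def toy_correct(sentence: str) -> str:
    # Look up a canned correction; otherwise echo the input unchanged.
    return TOY_MODEL.get(sentence, sentence)


def toy_score(sentence: str) -> float:
    # Penalize each token that appears in the toy error list.
    return -float(sum(tok in KNOWN_ERRORS for tok in sentence.split()))


toy_pairs = [
    ("He go to school .", "He go to school ."),  # error left uncorrected
    ("I like apple .", "I like apples ."),       # reference already fine
]
refined = refine(toy_pairs, toy_correct, toy_score)
```

In this toy run the first pair, whose reference left the error uncorrected, receives the model's correction, while the second pair keeps its original reference because the model's output does not score higher.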

Links
DOI
https://doi.org/10.18653/v1/2020.findings-emnlp.26
DBLP
https://dblp.uni-trier.de/rec/conf/emnlp/MitaKKSI20
arXiv
http://arxiv.org/abs/arXiv:2010.03155
URL
https://www.aclweb.org/anthology/2020.findings-emnlp.26/
URL
https://dblp.uni-trier.de/conf/emnlp/2020f
URL
https://dblp.uni-trier.de/db/conf/emnlp/emnlp2020f.html#MitaKKSI20
IDs
  • DOI : 10.18653/v1/2020.findings-emnlp.26
  • DBLP ID : conf/emnlp/MitaKKSI20
  • arXiv ID : arXiv:2010.03155
