June 2012
Co-occurrence-based indicators for authorship analysis
LITERARY AND LINGUISTIC COMPUTING
- Volume : 27
- Issue : 2
- First page : 197
- Last page : 214
- Language : English
- Publication type : Research paper (scientific journal)
- DOI : 10.1093/llc/fqs011
- Publisher : OXFORD UNIV PRESS
Along with its methodological development, authorship analysis has expanded in scope to new application areas like authorship profiling and computational sociolinguistics as well as conventional ones like authorship attribution. For these new applications, providing a new interpretation of text through the textual characteristics is as important as improving the classification performance between the authors, which was the aim in conventional applications. Lexical indicators were among the most frequently used characteristics in conventional applications, as they were effective at discriminating between authors, but most of the previously used indicators were based on the frequencies of morphemes and reflected only limited aspects of the authors' writing styles. In order to use these types of characteristics for new applications, we need to develop indicators reflecting various other aspects of the authors' writing styles that are useful for interpretation as well as classification. As such, we propose the use of two types of co-occurrence-based indicators in this field, namely network indicators and a co-occurrence-based concentration indicator. Experimental results using the Aozora Bunko corpora, along with qualitative analyses, showed that our indicators were very effective at capturing aspects of the styles of the authors as well as at improving the classification performance. We concluded that our indicators successfully supplement previously used indicators and are useful for various new applications in authorship analysis.
- Link information
- ID information
- DOI : 10.1093/llc/fqs011
- ISSN : 0268-1145
- Web of Science ID : WOS:000304199900007