Misc.

Jan 19, 2009

Extraction of Noun Synonyms and Other Related Words Using Dense-Subclusters

IEICE technical report
  • KANEMOTO Masaya
  • TAKEUCHI Koichi

Volume : 108
Number : 408
First page : 31
Last page : 35
Language : Japanese
Publishing type :
Publisher : The Institute of Electronics, Information and Communication Engineers

In this paper we propose a noun clustering approach based on CBC (Clustering By Committee), proposed by Pantel. CBC is a clustering approach that carefully extracts clusters by first finding sub-clusters, regarded as committees, whose members share the same meaning, and then tries to extract unknown clusters from the remaining elements. In preliminary experiments on Japanese noun clustering, however, we found that CBC does not work well in two respects: the basic similarity measure between words represented as context vectors, and the scoring method that decides whether to merge sub-clusters. To address these problems, we propose using the Jensen-Shannon divergence as the similarity measure, together with a new scoring method. In experiments constructing sub-clusters of Japanese nouns from newspaper articles, we show that the proposed approaches outperform those of CBC in clustering accuracy.
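As a rough illustration of the similarity measure named in the abstract, the sketch below computes the Jensen-Shannon divergence between two context-count vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the example counts, and the choice of base-2 logarithms are all hypothetical, and the paper's new scoring method is not reproduced here.

```python
import math

def normalize(counts):
    """Turn raw context counts into a probability distribution.
    Assumes at least one count is non-zero."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.
    Terms with p_i == 0 contribute nothing by convention."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence of p and q
    from their midpoint m. Symmetric, and bounded in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical co-occurrence counts of three nouns over four context features.
noun_a = normalize([8, 2, 0, 1])
noun_b = normalize([7, 3, 1, 0])
noun_c = normalize([0, 1, 9, 6])

# A smaller divergence means more similar context distributions.
print(js_divergence(noun_a, noun_b))
print(js_divergence(noun_a, noun_c))
```

Unlike the plain Kullback-Leibler divergence, this measure stays finite when one word never occurs in a context where the other does, which is common with sparse context vectors.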

Link information
CiNii Articles : http://ci.nii.ac.jp/naid/110007138259
CiNii Books : http://ci.nii.ac.jp/ncid/AN10091225
URL : http://id.ndl.go.jp/bib/9794021
ID information
  • ISSN : 0913-5685
  • CiNii Articles ID : 110007138259
  • CiNii Books ID : AN10091225
