Paper

Peer-reviewed
April 4, 2016

A bootstrapping method for extracting attribute names with keys from the web

Proceedings of the ACM Symposium on Applied Computing
  • Yoshinori Hijikata
  • Shintaro Nomura
  • Fumitaka Nakane
  • Shogo Nishida

04-08-
First page
368
Last page
371
Language
English
Publication type
Research paper (international conference proceedings)
DOI
10.1145/2851613.2851992
Publisher
Association for Computing Machinery

A large number of semi-structured documents (HTML documents) exist on the Web. To improve the accessibility of information related to an object, such as a product, person, or place, its attribute names and attribute values need to be extracted. Recently, many studies have aimed to extract attribute values automatically, whereas only a small number of studies have attempted to extract attribute names. Thus, we propose a method for extracting attribute names automatically. For the first time, we apply a bootstrapping algorithm to attribute name extraction in the area of information extraction. To solve a problem caused by applying a pure bootstrapping algorithm to attribute name extraction, we use keys, which are also extracted by a bootstrapping algorithm. We found that using the extracted keys improves the accuracy of attribute name extraction.

Links
DOI
https://doi.org/10.1145/2851613.2851992
Identifiers
  • DOI : 10.1145/2851613.2851992
  • SCOPUS ID : 84975865370
