MISC

2008

Automatic Extraction of Academic Research Information from Higher Education Institution Websites Using Anchor Texts and Link Structures

Educational Technology Research
  • Yoshiaki Hada
  • Shinsaku Chikura

Volume
31
Issue
1
First page
143
Last page
151
Language
English
Publication type
DOI
10.15077/etr.KJ00005101204
Publisher
Japan Society for Educational Technology (日本教育工学会)

This study is part of a broader effort to develop a system that classifies and searches information on the Web to support university faculty and students in teaching, research, and learning. Using the link structures of Web pages, academic research information, including pages useful for education such as research descriptions and lecture notes, was extracted automatically from university Web pages. The extraction applies a new technique in two steps: first, the pages linked by anchor text from HTML pages containing a distinctive word are collected; then, the groups of pages linked from those collected pages are collected as well. Specifically, laboratory Websites were extracted automatically from the Websites of the University of Tsukuba with exceptionally high recall and precision. The method proved effective in two situations: where pages contain no frequently occurring terms, which makes automatic extraction through natural language processing difficult, and where the page structure lacks regularity.
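The two-step collection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory `PAGES` corpus, the keyword `"laboratories"`, the function names, and the rule that a group is limited to pages under the same directory as its entry page are all assumptions made for illustration. The abstract specifies neither the distinctive word nor how group boundaries are decided.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorParser(HTMLParser):
    """Collect the page text and every (href, anchor text) pair."""

    def __init__(self):
        super().__init__()
        self.text = []    # all text fragments on the page
        self.links = []   # (href, anchor_text) for links that carry text
        self._href = None
        self._anchor = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._anchor = []

    def handle_data(self, data):
        self.text.append(data)
        if self._href is not None:
            self._anchor.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            anchor = "".join(self._anchor).strip()
            if anchor:  # ignore links without anchor text
                self.links.append((self._href, anchor))
            self._href = None


def parse(url, pages):
    """Return (page text, [(absolute target URL, anchor text), ...])."""
    parser = AnchorParser()
    parser.feed(pages[url])
    parser.close()
    links = [(urljoin(url, href), text) for href, text in parser.links]
    return "".join(parser.text), links


def collect_group(root, pages):
    """Step 2: follow links from an entry page, staying in its directory.

    The same-directory restriction is an assumption of this sketch.
    """
    prefix = root.rsplit("/", 1)[0] + "/"
    seen, stack = {root}, [root]
    while stack:
        _text, links = parse(stack.pop(), pages)
        for target, _anchor in links:
            if target in pages and target.startswith(prefix) and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def extract_sites(pages, keyword):
    """Step 1: find pages containing the keyword and take their link targets."""
    sites = {}
    for url in pages:
        text, links = parse(url, pages)
        if keyword.lower() not in text.lower():
            continue
        for target, _anchor in links:
            if target in pages and target not in sites:
                sites[target] = collect_group(target, pages)
    return sites


# Hypothetical toy corpus standing in for crawled university pages.
PAGES = {
    "http://example.edu/labs.html":
        "<h1>Laboratories</h1><a href='a/index.html'>Lab A</a>"
        "<a href='b/index.html'>Lab B</a>",
    "http://example.edu/a/index.html":
        "<p>Lab A</p><a href='research.html'>Research</a>"
        "<a href='../labs.html'>Back</a>",
    "http://example.edu/a/research.html":
        "<p>Research description</p><a href='notes.html'>Lecture notes</a>",
    "http://example.edu/a/notes.html": "<p>Lecture notes</p>",
    "http://example.edu/b/index.html": "<p>Lab B</p>",
    "http://example.edu/news.html": "<p>Campus news</p>",
}

sites = extract_sites(PAGES, "laboratories")
for root in sorted(sites):
    print(root, sorted(sites[root]))
```

The sketch never inspects the text of the laboratory pages themselves. Only the listing page has to contain the keyword, which is why an approach of this kind can still work when the target pages share no frequent terms or regular structure.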

Links
DOI
https://doi.org/10.15077/etr.KJ00005101204
CiNii Articles
http://ci.nii.ac.jp/naid/110007004879
CiNii Books
http://ci.nii.ac.jp/ncid/AA00174437
Identifiers
  • DOI : 10.15077/etr.KJ00005101204
  • ISSN : 0387-7434
  • CiNii Articles ID : 110007004879
  • CiNii Books ID : AA00174437
