2008

Automatic Extraction of Academic Research Information from Higher Education Institution Websites Using Anchor Texts and Link Structures

Educational Technology Research

Yoshiaki Hada
Shinsaku Chikura

Volume: 31
Issue: 1
First page: 143
Last page: 151
Language: English
Publication type
DOI: 10.15077/etr.KJ00005101204
Publisher: Japan Society for Educational Technology (日本教育工学会)

This study is part of a broader effort to develop a system that classifies and searches information on the Web to support university faculty and students in teaching, research, and learning. Using the link structures of Web pages, academic research information, including pages useful for education such as research descriptions and lecture notes, was extracted automatically from university Web pages. The extraction technique works in two steps: it first collects the pages that are linked via anchor text from HTML pages containing a distinctive word, and then collects the groups of pages linked from those collected pages. Specifically, laboratory Websites were extracted automatically from the Websites of the University of Tsukuba with exceptionally high recall and relevance ratio (precision). The method proved effective in cases where extraction through natural language processing is difficult because the pages contain no high-frequency terms, and in cases where the page structure lacks regularity.
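The two-step collection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the in-memory `SAMPLE_PAGES` site under the reserved `univ.example` domain, the keyword `"Laboratory"`, and the rule that a group is limited to pages under the seed page's directory are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorParser(HTMLParser):
    """Collects the page text and every (href, anchor text) pair."""

    def __init__(self):
        super().__init__()
        self.text = []       # all text fragments found on the page
        self.links = []      # (href, anchor_text) pairs
        self._href = None    # href of the <a> element currently open
        self._anchor = []    # text fragments inside that <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._anchor = []

    def handle_data(self, data):
        self.text.append(data)
        if self._href is not None:
            self._anchor.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            anchor = "".join(self._anchor).strip()
            if anchor:  # keep only links that actually carry anchor text
                self.links.append((self._href, anchor))
            self._href = None


def parse_page(url, html):
    """Return (page text, [(absolute target URL, anchor text), ...])."""
    parser = AnchorParser()
    parser.feed(html)
    parser.close()
    links = [(urljoin(url, href), anchor) for href, anchor in parser.links]
    return "".join(parser.text), links


def extract_site_groups(pages, keyword):
    """pages maps URL -> HTML. Returns {seed URL: set of URLs in its group}."""
    parsed = {url: parse_page(url, html) for url, html in pages.items()}

    # Step 1: pages linked via anchor text from pages containing the keyword.
    seeds = []
    for url, (text, links) in parsed.items():
        if keyword in text:
            for target, _anchor in links:
                if target in parsed and target not in seeds:
                    seeds.append(target)

    # Step 2: follow links from each seed to collect its group of pages.
    # Assumption: a group stays inside the seed page's directory.
    groups = {}
    for seed in seeds:
        prefix = seed.rsplit("/", 1)[0] + "/"
        group, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            for target, _anchor in parsed[current][1]:
                if (target in parsed and target.startswith(prefix)
                        and target not in group):
                    group.add(target)
                    frontier.append(target)
        groups[seed] = group
    return groups


# Hypothetical miniature site used only to exercise the sketch.
BASE = "http://univ.example"
SAMPLE_PAGES = {
    BASE + "/labs.html": '<h1>Laboratory list</h1>'
                         '<a href="/lab-a/index.html">Lab A</a> '
                         '<a href="/lab-b/index.html">Lab B</a>',
    BASE + "/lab-a/index.html": '<a href="research.html">Research</a> '
                                '<a href="notes.html">Lecture notes</a>',
    BASE + "/lab-a/research.html": "<p>Research description</p>",
    BASE + "/lab-a/notes.html": "<p>Lecture notes</p>",
    BASE + "/lab-b/index.html": '<a href="members.html">Members</a> '
                                '<a href="/news.html">News</a>',
    BASE + "/lab-b/members.html": "<p>Members</p>",
    BASE + "/news.html": "<p>Campus news</p>",
}

groups = extract_site_groups(SAMPLE_PAGES, "Laboratory")
```

In this toy site only `labs.html` contains the keyword, so the two pages it links to become seeds, and each seed gathers the pages reachable within its own directory. The campus news page is linked from Lab B but lies outside that directory, so the directory rule keeps it out of the group. Because the sketch selects pages by where links point rather than by the words on the target pages, it works even when the lab pages share no characteristic vocabulary, which is the situation the abstract identifies as hard for natural language processing.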

Links

DOI: https://doi.org/10.15077/etr.KJ00005101204
CiNii Articles: http://ci.nii.ac.jp/naid/110007004879
CiNii Books: http://ci.nii.ac.jp/ncid/AA00174437

Identifiers

DOI : 10.15077/etr.KJ00005101204
ISSN : 0387-7434
CiNii Articles ID : 110007004879
CiNii Books ID : AA00174437


篠原正典
