Efficient Probabilistic Latent Semantic Analysis through Parallelization

INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS

Raymond Wan
Vo Ngoc Anh
Hiroshi Mamitsuka

Volume: 5839
Issue:
Start page: 432
End page: +
Language: English
Publication type: Research paper (international conference proceedings)
DOI: 10.1007/978-3-642-04769-5_38
Publisher: SPRINGER-VERLAG BERLIN

Probabilistic latent semantic analysis (PLSA) is considered an effective technique for information retrieval, but it has one notable drawback: its heavy consumption of computing resources, in terms of both execution time and internal memory. This drawback limits the practical application of the technique to document collections of modest size.
In this paper, we look into the practice of implementing PLSA with the aim of improving its efficiency without changing its output. Recently, Hong et al. [2008] have shown how the execution time of PLSA can be improved by employing OpenMP for shared memory parallelization. We extend their work by also studying the effects of using it in combination with the Message Passing Interface (MPI) for distributed memory parallelization. By applying our method to several text collections commonly used in the literature, we show how a more careful implementation of PLSA reduces execution time and memory costs.

Links

DOI: https://doi.org/10.1007/978-3-642-04769-5_38
Web of Science: https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000273954100038&DestApp=WOS_CPL

Identifiers

DOI : 10.1007/978-3-642-04769-5_38
ISSN : 0302-9743
Web of Science ID : WOS:000273954100038
