Paper

1999

Mining generalized association rule using parallel RDB engine on PC cluster

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
  • Iko Pramudiono
  • Takahiko Shintani
  • Takayuki Tamura
  • Masaru Kitsuregawa

Volume
1676
First page
281
Last page
292
Language
English
Publication type
Research paper (international conference proceedings)
DOI
10.1007/3-540-48298-9_30
Publisher
Springer Verlag

Data mining has been widely recognized as a powerful tool for extracting added value from large-scale databases. One data mining technique, generalized association rule mining with taxonomy, has the potential to discover more useful knowledge than ordinary flat association mining by taking application-specific information into account. We proposed SQL queries, named TTR-SQL and TH-SQL, to perform this kind of mining and evaluated them on a PC cluster. These queries can be more than 30% faster than the previously reported Apriori-based SQL query. Although an RDBMS offers powerful query processing through SQL, most data mining systems use specialized implementations to achieve better performance, so there is a tradeoff between performance and portability. The performance of the SQL approach is not necessarily high enough, but seamless integration with an existing RDBMS would be a considerable advantage. Since RDBs are already very popular, the feasibility of generalized association rule mining can be explored with the proposed SQL queries instead of purchasing expensive mining software. In addition, parallel RDBs are now widely accepted. We showed that parallelizing the SQL execution on 10 to 15 nodes can offer the same performance as native programs. Since most organizations have many PCs that are not fully utilized, such resources can be exploited to improve performance significantly.
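
As a rough illustration of the idea in the abstract, the sketch below counts generalized 2-itemsets with plain SQL, run through Python's built-in sqlite3 module. It is not the paper's TTR-SQL or TH-SQL, which this record does not reproduce. The table names (sales, ancestors, extended), the function frequent_pairs, and the toy taxonomy are all assumptions made for the example. Each transaction is extended with the ancestors of its items, pairs are counted with a self-join, and pairs made of an item and its own ancestor are dropped because they are trivially correlated.

```python
import sqlite3


def frequent_pairs(transactions, parent_of, min_support):
    """Return {(item_a, item_b): support} for generalized 2-itemsets.

    transactions: dict mapping transaction id -> iterable of leaf items
    parent_of:    dict mapping item -> its parent in the taxonomy
    min_support:  minimum number of transactions containing the pair
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    cur.execute("CREATE TABLE ancestors (item TEXT, ancestor TEXT)")
    cur.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(tid, item) for tid, items in transactions.items() for item in items],
    )

    # Transitive closure of the taxonomy: one row per (item, ancestor).
    closure = []
    for item in parent_of:
        node = item
        while node in parent_of:
            node = parent_of[node]
            closure.append((item, node))
    cur.executemany("INSERT INTO ancestors VALUES (?, ?)", closure)

    # Extended transactions: every item plus all of its ancestors.
    # UNION removes duplicates, so an ancestor counts once per transaction.
    cur.execute(
        """
        CREATE TABLE extended AS
        SELECT tid, item FROM sales
        UNION
        SELECT s.tid, a.ancestor
        FROM sales s JOIN ancestors a ON s.item = a.item
        """
    )

    # Self-join to count pairs, skipping item/ancestor pairs.
    cur.execute(
        """
        SELECT x.item, y.item, COUNT(*) AS support
        FROM extended x
        JOIN extended y ON x.tid = y.tid AND x.item < y.item
        WHERE NOT EXISTS (
            SELECT 1 FROM ancestors a
            WHERE (a.item = x.item AND a.ancestor = y.item)
               OR (a.item = y.item AND a.ancestor = x.item)
        )
        GROUP BY x.item, y.item
        HAVING COUNT(*) >= ?
        """,
        (min_support,),
    )
    result = {(a, b): support for a, b, support in cur.fetchall()}
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    taxonomy = {
        "jacket": "outerwear",
        "ski pants": "outerwear",
        "outerwear": "clothes",
        "shirt": "clothes",
        "shoes": "footwear",
        "hiking boots": "footwear",
    }
    baskets = {
        1: ["shirt"],
        2: ["jacket", "hiking boots"],
        3: ["ski pants", "hiking boots"],
        4: ["shoes"],
        5: ["shoes"],
        6: ["jacket"],
    }
    for pair, support in sorted(frequent_pairs(baskets, taxonomy, 2).items()):
        print(pair, support)
```

On this toy data no pair of leaf items reaches a support of 2, but pairs involving a generalized item such as outerwear do, which is the kind of extra knowledge a taxonomy can expose. Because the counting is ordinary SQL, the same statements could in principle be handed to a parallel RDB engine, although this sketch makes no claim about how the paper partitions the work.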

Link information
DOI
https://doi.org/10.1007/3-540-48298-9_30
ID information
  • DOI : 10.1007/3-540-48298-9_30
  • ISSN : 1611-3349
  • ISSN : 0302-9743
  • SCOPUS ID : 84876370440
