1999
Mining generalized association rule using parallel RDB engine on PC cluster
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- Volume
- 1676
- Issue
- First page
- 281
- Last page
- 292
- Language
- English
- Publication type
- Research paper (international conference proceedings)
- DOI
- 10.1007/3-540-48298-9_30
- Publisher
- Springer Verlag
Data mining has been widely recognized as a powerful tool for extracting added value from large-scale databases. One data mining technique, generalized association rule mining with taxonomy, has the potential to discover more useful knowledge than ordinary flat association mining by taking application-specific information into account. We proposed SQL queries, named TTR-SQL and TH-SQL, to perform this kind of mining and evaluated them on a PC cluster. These queries can be more than 30% faster than the previously reported Apriori-based SQL query. Although an RDBMS has powerful query processing ability through SQL, most data mining systems use specialized implementations to achieve better performance. There is a tradeoff between performance and portability: the performance of the SQL approach is not necessarily high, but seamless integration with an existing RDBMS would be considerably advantageous. Since RDBs are already very popular, the feasibility of generalized association rule mining can be explored using the proposed SQL queries instead of purchasing expensive mining software. In addition, parallel RDBs are now also widely accepted. We showed that parallelizing the SQL execution can offer performance comparable to that of native programs with 10 to 15 nodes. Since most organizations have many PCs that are not fully utilized, such resources can be exploited to improve performance significantly.
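To make the idea of SQL-based generalized association rule mining concrete, the following is a minimal sketch in Python using the standard `sqlite3` module. It is not the TTR-SQL or TH-SQL query from the paper, whose text is not given in this record. It is a plain Apriori-style support count for 2-itemsets over transactions that have been extended with taxonomy ancestors. The table names (`sales`, `taxonomy`, `extended`), the function `frequent_pairs`, and the sample data are all hypothetical, and the `taxonomy` table is assumed to already hold every (item, ancestor) pair, that is, the transitive closure of the hierarchy.

```python
import sqlite3


def frequent_pairs(transactions, taxonomy, min_support):
    """Return generalized 2-itemsets whose support count >= min_support.

    Illustrative only: a generic Apriori-style SQL formulation, not the
    TTR-SQL / TH-SQL queries proposed in the paper.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    # Assumption: one row per (item, ancestor), including indirect ancestors.
    cur.execute("CREATE TABLE taxonomy (item TEXT, ancestor TEXT)")
    cur.executemany("INSERT INTO sales VALUES (?, ?)", transactions)
    cur.executemany("INSERT INTO taxonomy VALUES (?, ?)", taxonomy)

    # Extend every transaction with the ancestors of its items.
    # UNION (rather than UNION ALL) removes duplicate (tid, item) rows.
    cur.execute("""
        CREATE TABLE extended AS
        SELECT tid, item FROM sales
        UNION
        SELECT s.tid, t.ancestor
        FROM sales s JOIN taxonomy t ON s.item = t.item
    """)

    # Self-join to enumerate item pairs per transaction, skip pairs made of
    # an item and its own ancestor, then keep pairs meeting the threshold.
    cur.execute("""
        SELECT a.item, b.item, COUNT(*) AS support
        FROM extended a JOIN extended b
          ON a.tid = b.tid AND a.item < b.item
        WHERE NOT EXISTS (
            SELECT 1 FROM taxonomy t
            WHERE (t.item = a.item AND t.ancestor = b.item)
               OR (t.item = b.item AND t.ancestor = a.item))
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
    """, (min_support,))
    rows = cur.fetchall()
    conn.close()
    return rows


# Hypothetical sample data: (transaction id, item) rows and a one-level taxonomy.
transactions = [
    (1, "jacket"), (1, "boots"),
    (2, "ski pants"), (2, "boots"),
    (3, "jacket"), (3, "shoes"),
]
taxonomy = [
    ("jacket", "outerwear"), ("ski pants", "outerwear"),
    ("boots", "footwear"), ("shoes", "footwear"),
]

result = frequent_pairs(transactions, taxonomy, min_support=2)
print(result)
```

In this sample, the pair (boots, jacket) occurs in only one transaction, while the generalized pair (boots, outerwear) occurs in two, which illustrates how a taxonomy can surface patterns that flat association mining would discard at the same support threshold.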
- ID information
- DOI : 10.1007/3-540-48298-9_30
- ISSN : 1611-3349
- ISSN : 0302-9743
- SCOPUS ID : 84876370440