查詢結果分析
來源資料
頁籤選單縮合
題 名 | 幾個快速挖掘關聯規則的資料探勘方法=Several Improved Data Mining Algorithms for Finding Association Rules |
---|---|
作 者 | 陳彥良; 趙書榮; 陳禹辰; | 書刊名 | 電子商務學報 |
卷 期 | 5:2 2003.09[民92.09] |
頁 次 | 頁1-10 |
分類號 | 312.13 |
關鍵詞 | 資料挖掘; 關聯規則; 交易資料庫; Data mining; Association rule; Transaction database; |
語 文 | 中文(Chinese) |
中文摘要 | 關聯規則的挖掘,是目前最重要的資料挖掘問題之一,它的目的是要從銷售的交易資料庫中,發現商品項目間的關聯。在過去已經有相當多挖掘關聯規則演算法被提出來,當中FP-tree演算法可說是最主要的演算法之一,並以高執行效率著稱。它的主要概念是不產生candidate itemsets,而將資料庫壓縮在FP-tree的結構中以避免多次的高成本資料庫掃瞄。在本文中,我們針對原本的FP tree演算法,更進一步改進其所用的資料結構以提高挖掘效率。 在本文中共建立了三種資料結構: (一) FP-tree_tail演算法,也就是在head table中增加一個tail欄位,(二) FP-treel hash演算法,乃是以hash function計算出每個node所在位置方式建立FP-tree, (三) FP-treel hash+tail演算法,為結合 (一)、(二) 之優點,所完成之演算法。作者將以上三個演算法與傳統FP_tree演算法一起比較以找出各演算法之優缺點。經由各種實驗數據發現,傳統FP_tree演算法所需花費之時間,為三個改良FP-tree演算法的數十倍。 |
英文摘要 | Mining association rules is one of the most important problems in data mining. Its aim is to discover the associations between items in a large database of sales transactions. In the past, a large number of algorithms for mining association rules have been proposed, and the FP-tree algorithm is one of the most famous ones, known for its efficiency. Unlike the traditional approach that requires many phases of candidate itemsets generation and database scan, the FP-tree algorithm compresses and stores the entire database into a sophisticated tree structure, called FP-tree, by which all the associations can be found by two database scans. In this paper, we attempt to further improve the standard structure of the FP-tree such that the mining performance can be improved. To this end, three variants of the improved FP-tree algorithm are proposed. The first variant is called FP-tree+tail, which adds a tail pointer into the head table of the original FP-tree structure. The second is named as FP-tree hash, which adds a hash table into every node of the FP-tree. Finally, we call the last FP-treel1ash+tail, which is a combination of the first two improvements. Finally, a performance evaluation is done to compare their performances. The result indicates that the three proposed algorithms are about 20-50 times faster than the original FP-tree algorithm. |
本系統中英文摘要資訊取自各篇刊載內容。