Abstract
To address two performance bottlenecks of the Apriori algorithm, namely heavy I/O load and the large number of intermediate results generated during pruning, an improved Apriori algorithm based on a transaction matrix and an itemset matrix is proposed. The basic idea is as follows. The database is scanned only once to build the transaction matrix. Frequent itemsets are then obtained through operations between the transaction matrix and the itemset matrix, which replace the repeated database scans of Apriori, reduce the I/O load, and speed up the verification of candidate itemsets. Operations on the frequent itemset matrix reduce the number of candidate itemsets generated and avoid the decomposition and checking of candidate itemsets required by the pruning step of Apriori. Simulation experiments verify the effectiveness of the improved algorithm.
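The matrix-based support counting described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`build_transaction_matrix`, `support_counts`, `frequent_itemsets`) and the 0/1 list-of-lists representation are assumptions made for illustration. Candidates are generated by a plain join of frequent (k-1)-itemsets, so the paper's frequent-itemset-matrix step for reducing candidates is not reproduced here.

```python
from itertools import combinations


def build_transaction_matrix(transactions):
    """Scan the transactions once and return (items, 0/1 matrix)."""
    items = sorted({item for t in transactions for item in t})
    index = {item: j for j, item in enumerate(items)}
    matrix = []
    for t in transactions:
        row = [0] * len(items)
        for item in t:
            row[index[item]] = 1
        matrix.append(row)
    return items, matrix


def support_counts(transaction_matrix, itemset_matrix):
    """Count, for each itemset row, the transactions that contain it.

    A transaction row t contains an itemset row c exactly when the
    dot product of t and c equals the number of items in c.
    """
    counts = []
    for c in itemset_matrix:
        size = sum(c)
        counts.append(sum(
            1 for t in transaction_matrix
            if sum(a * b for a, b in zip(t, c)) == size
        ))
    return counts


def frequent_itemsets(transactions, min_count):
    """Return {frozenset(itemset): support count} using matrix operations only."""
    items, tm = build_transaction_matrix(transactions)
    n = len(items)
    # Level 1 candidates: one row per single item (identity matrix).
    candidates = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    result = {}
    k = 1
    while candidates:
        counts = support_counts(tm, candidates)
        frequent = []
        for c, s in zip(candidates, counts):
            if s >= min_count:
                frequent.append(c)
                result[frozenset(items[j] for j in range(n) if c[j])] = s
        # Join step (assumed, standard): OR two frequent rows and keep
        # the result only if it has exactly k + 1 items.
        k += 1
        seen = set()
        candidates = []
        for a, b in combinations(frequent, 2):
            union = tuple(x | y for x, y in zip(a, b))
            if sum(union) == k and union not in seen:
                seen.add(union)
                candidates.append(list(union))
    return result
```

After the single pass in `build_transaction_matrix`, every later level works on the in-memory matrix rather than the original data, which is the source of the I/O saving the abstract claims.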
Source
Computer Simulation (《计算机仿真》)
CSCD
PKU Core Journal (北大核心)
2013, No. 8, pp. 245-249 (5 pages)
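Funding
Natural Science Foundation of Shaanxi Province, Youth Project (2012JQ1019)
Research Innovation Fund of the Aeronautics and Astronautics Engineering College, Air Force Engineering University (XS1101021)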
Keywords
Association rule
Matrix
Frequent itemset