Correlated High Fuzzy Utility Mining Algorithm Based on Fuzzy List
Traditional two-phase high fuzzy utility mining algorithms generate a large number of candidate itemsets, ignore the relationships between itemsets, and require repeated database scans. To address these problems, this paper proposes a one-phase correlated high fuzzy utility mining algorithm based on a fuzzy list (CoHFUIM). The algorithm introduces a new fuzzy list structure (FHUI-list) so that the mining process needs to scan the database only once, which improves running efficiency. It also adds a correlation constraint and a Cos-prune pruning strategy, which reduce the number of candidate itemsets and ensure that the mined itemsets are both high-utility and highly correlated. To make the algorithm applicable to dynamic databases, an improved algorithm, CoHFUIM+, is proposed. Simulation experiments on three real datasets (Chess, Connect and Mushroom) show that the improved algorithm outperforms the classical algorithm TPFU in running time, memory usage and scalability.
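The abstract does not define the correlation measure behind the correlation constraint and Cos-prune. As a rough illustration only, the sketch below assumes the name refers to a cosine-style measure, namely the support of an itemset divided by the geometric mean of the supports of its items, and shows how a minimum-correlation threshold could filter itemsets. The function names, the `min_cor` parameter and the toy transactions are hypothetical and are not taken from the paper; fuzzy utilities and the FHUI-list are not modelled here.

```python
from math import prod


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def cosine_correlation(itemset, transactions):
    """Assumed measure: sup(X) divided by the geometric mean of the
    single-item supports. This is an illustration, not the paper's definition."""
    singles = [support([i], transactions) for i in itemset]
    if not singles or min(singles) == 0:
        return 0.0
    return support(itemset, transactions) / prod(singles) ** (1.0 / len(singles))


def passes_correlation(itemset, transactions, min_cor):
    """Keep an itemset only if its assumed correlation reaches `min_cor`
    (a hypothetical threshold name)."""
    return cosine_correlation(itemset, transactions) >= min_cor


if __name__ == "__main__":
    # Toy transaction database (hypothetical data).
    db = [["a", "b"], ["a", "b", "c"], ["a"], ["b", "c"]]
    for candidate in (["a", "b"], ["a", "c"]):
        print(candidate, passes_correlation(candidate, db, min_cor=0.5))
```

In a full miner, a check of this kind would be combined with the fuzzy utility threshold, so that only itemsets satisfying both constraints are reported.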