An Efficient Mining Algorithm for Multi-Class Data (一种高效的多类型数据挖掘算法). Cited by: 10

Published English title: An Efficiency Mining Algorithm for Multiple Class Data
Abstract: Most existing mining algorithms discover contrast patterns between two classes of data in order to extract the required information, but discovering contrast patterns in multi-class data remains a challenge. Association rule mining has the drawback of generating a very large number of rules, many of which are redundant. Non-redundant rule mining removes the redundant rules, yet some of the remaining rules are still of too little interest for data in a specific application domain. This paper therefore presents an efficient mining algorithm for multi-class data. The algorithm uses statistical methods to define pathogenic patterns and protective patterns, and it discovers both kinds of pattern in multi-class medical data. Simulation experiments yield an intuitive causal graph for the multi-class medical data, and a classifier built from the rules the algorithm generates confirms its efficiency and practicality. The generated rules provide accurate and useful information and can be applied in practice in fields such as medical research.
Authors: ZHANG Xin-ying (张新英), FU Chuan-nan (付川南) (College of Information and Business, Zhongyuan University of Technology, Zhengzhou 451191, China)
Source: Journal of China Academy of Electronics and Information Technology (《中国电子科学研究院学报》), PKU Core Journal, 2017, No. 4, pp. 359-364 (6 pages)
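Funding: Key Science and Technology Project of Henan Province (152102210155); Key Scientific Research Project of Higher Education Institutions of Henan Province (17A413014); College-level Research Project of the College of Information and Business, Zhongyuan University of Technology (ky1615)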
Keywords: data mining; multi-class data; optimized rules; interest degree; odds ratio
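
The abstract says the two pattern types are defined statistically, and the keywords name the odds ratio, but this record does not give the actual definitions. The sketch below is therefore only one plausible reading, not the authors' algorithm: it assumes a one-vs-rest odds ratio per class, and labels a pattern "pathogenic" when the 95% confidence interval lies entirely above 1 and "protective" when it lies entirely below 1. The function names, the 0.5 continuity correction, and the confidence-interval rule are all assumptions introduced for illustration.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96, correction=0.5):
    """Odds ratio and confidence interval for a 2x2 table.

    a: target-class records containing the pattern
    b: target-class records without the pattern
    c: other-class records containing the pattern
    d: other-class records without the pattern
    """
    # Assumed continuity correction so that empty cells do not divide by zero.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio).
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = exp(log(ratio) - z * se)
    high = exp(log(ratio) + z * se)
    return ratio, low, high


def label_pattern(a, b, c, d):
    """Assumed rule: the interval must exclude 1 for a pattern to be labeled."""
    _, low, high = odds_ratio_ci(a, b, c, d)
    if low > 1:
        return "pathogenic"
    if high < 1:
        return "protective"
    return "inconclusive"


def one_vs_rest_labels(records, pattern):
    """Label one pattern against every class in turn (one class vs. the rest).

    records: iterable of (items, class_label) pairs, items being a set.
    pattern: set of items that must all be present in a record.
    """
    records = list(records)
    pattern = set(pattern)
    labels = {}
    for cls in sorted({label for _, label in records}):
        a = b = c = d = 0
        for items, label in records:
            has_pattern = pattern.issubset(items)
            if label == cls:
                if has_pattern:
                    a += 1
                else:
                    b += 1
            else:
                if has_pattern:
                    c += 1
                else:
                    d += 1
        labels[cls] = label_pattern(a, b, c, d)
    return labels
```

Calling `one_vs_rest_labels(records, {"smoker"})` on a list of `(item_set, class_label)` pairs returns one label per class, which is the kind of per-class contrast a multi-class setting needs. The item name `"smoker"` is a made-up example, not taken from the paper's data.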