Distributed Greedy EM Algorithm Based on MapReduce (cited by: 1)

Greedy EM algorithm based on MapReduce framework
Abstract: To address the sharp slowdown in convergence of an existing greedy EM algorithm on large-scale data sets, this paper proposes a greedy EM algorithm based on MapReduce. The algorithm first uses Map (mapping) to distribute the data, processing each node and generating the corresponding key-value pairs. It then uses Reduce (reduction) to merge the generated key-value pairs. Finally, it obtains the optimal Gaussian mixture model and, from that model, the number of model components. Compared with the traditional EM algorithm and the greedy EM algorithm, the experimental results show that the proposed algorithm markedly improves convergence speed while still accurately determining the number of components of the Gaussian mixture model.
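The Map and Reduce roles described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: it assumes one-dimensional data, a fixed number of components, and a Map step that emits per-component sufficient statistics keyed by component index. The function names (`map_e_step`, `reduce_m_step`, `em_mapreduce`) are hypothetical, and the greedy component-insertion step is omitted.

```python
import math
from collections import defaultdict


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def map_e_step(chunk, params):
    """Map (assumed form): for one data split, emit (component, partial stats).

    params is a list of (weight, mean, variance) tuples. Each emitted value
    holds the responsibility r and the weighted moments r*x and r*x^2.
    """
    pairs = []
    for x in chunk:
        weighted = [w * gaussian_pdf(x, m, v) for (w, m, v) in params]
        total = sum(weighted)
        for k, p in enumerate(weighted):
            r = p / total
            pairs.append((k, (r, r * x, r * x * x)))
    return pairs


def reduce_m_step(pairs, n_points):
    """Reduce (assumed form): sum the stats per key and re-estimate parameters."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for k, (r, rx, rxx) in pairs:
        sums[k][0] += r
        sums[k][1] += rx
        sums[k][2] += rxx
    params = []
    for k in sorted(sums):
        r, rx, rxx = sums[k]
        mean = rx / r
        var = max(rxx / r - mean * mean, 1e-6)  # floor avoids a collapsed variance
        params.append((r / n_points, mean, var))
    return params


def em_mapreduce(chunks, params, iterations=50):
    """Run a fixed number of EM iterations, one map/reduce round per iteration."""
    n_points = sum(len(c) for c in chunks)
    for _ in range(iterations):
        pairs = [kv for chunk in chunks for kv in map_e_step(chunk, params)]
        params = reduce_m_step(pairs, n_points)
    return params
```

In a real MapReduce job each chunk would be processed on a separate node, and a combiner would typically pre-sum the statistics per split so that only one key-value pair per component leaves each mapper.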
Authors: Cao Jiaqing (曹家庆); Wu Guanmao (吴观茂) (School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China)
Source: Information Technology and Network Security (《信息技术与网络安全》), 2018, No. 5, pp. 84-87, 92 (5 pages in total)
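Funding: National Natural Science Foundation of China (61471004); Graduate Innovation Fund Project of Anhui University of Science and Technology (2017CX2045)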
Keywords: greedy EM algorithm; machine learning; data mining; MapReduce framework