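Abstract
To address the problem of clustering mixed-attribute data, this paper proposes a multi-objective multi-learning bacterial foraging optimization algorithm. First, a multi-objective optimization framework is built on an improved bacterial foraging optimization algorithm. Then, multi-learning strategies are proposed to improve the algorithm's performance. Specifically, bacterial individuals adopt a ring-topology learning strategy, in which each bacterium can learn only from the best individual in its neighbourhood; bacterial individuals can also learn from non-dominated individuals in the external archive. This learning strategy both maintains population diversity and accelerates convergence. For non-dominated individuals in the external archive, their trend of change is recorded; when that change stagnates, an elite learning strategy applies a small perturbation to them to increase the diversity of the non-dominated solutions. Finally, to solve the mixed-attribute data clustering problem, a mixed-attribute conversion strategy with attribute weights is designed. To verify its performance, the proposed algorithm is compared with two multi-objective evolutionary algorithms and three classical clustering algorithms on six standard data sets. The experimental results show that the proposed algorithm has significant advantages in clustering numeric, categorical, and mixed-attribute data. In addition, a case study on credit card applicant data from the financial domain further confirms the feasibility of the proposed algorithm and indicates that it has some application potential in fields that involve mixed-attribute data sets, such as medicine, management, and engineering.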
As data become easier to generate and acquire in medical, management, financial, and other fields, a large amount of data with mixed attributes is produced. How to mine valuable information from such data has attracted the attention of researchers. Clustering is a well-known data mining method that can be used to extract information from mixed-attribute data sets. Various mixed-type data clustering methods have been designed, and they can be divided into general clustering algorithms and evolutionary computation-based clustering algorithms. The latter mainly comprise single-objective and multi-objective optimization algorithms. These algorithms perform well in specific contexts. However, for automatic clustering, high-dimensional clustering, and multi-objective clustering problems, the algorithms in the first category cannot produce satisfactory clustering results, whereas the algorithms in the second category show great potential. Researchers have therefore studied the second category in depth. When evolutionary computation-based clustering algorithms are used, two issues need further consideration. On the one hand, these algorithms are built on K-prototype. It is well recognized that K-prototype uses the Hamming distance to compute the similarity of categorical attributes, so it cannot capture the true relations between data samples. On the other hand, these algorithms mainly rely on the genetic algorithm; other evolutionary computation-based algorithms, such as the bacterial foraging optimization algorithm, are worth studying for mixed-type data clustering problems.
In this paper, to solve the mixed data clustering problem and improve the performance of evolutionary computation-based clustering algorithms, a multi-objective multi-learning bacterial foraging optimization algorithm (MMBFOC) is proposed. Firstly, a multi-objective framework is constructed on the basis of an improved bacterial foraging optimization algorithm. Secondly, multi-learning strategies are designed to enhance the performance of the proposed algorithm. Specifically, a ring topology learning strategy is adopted among bacteria, so that each bacterium learns from the best individual in its neighbourhood; bacterial individuals can also learn from non-dominated individuals in the external archive. This learning strategy not only maintains the diversity of the population but also accelerates the convergence of the algorithm. For non-dominated individuals, their trend of change is recorded. When the change of the non-dominated individuals stagnates, an elite learning strategy is used to perturb them in order to increase the diversity of the non-dominated solutions. Finally, an attribute conversion strategy with attribute weights is designed to solve the mixed data clustering problem. To verify its performance, the proposed algorithm is compared with two multi-objective evolutionary algorithms (MOPSOC and NSGA2C) and three classical clustering algorithms (K-means, K-modes, and K-prototype) on six standard data sets selected from UCI. Experimental results show that the proposed algorithm has significant advantages over its competitors in solving numeric, categorical, and mixed-attribute data clustering problems. Meanwhile, taking customer data from credit card applications in the financial field as an example, the analysis shows that MMBFOC classifies the customers better than the other algorithms. This also indicates that the proposed algorithm has broad application prospects in medical, management, and engineering fields.
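The limitation of the Hamming distance noted in the abstract can be illustrated with a minimal sketch of the standard K-prototype dissimilarity (squared Euclidean distance on numeric attributes plus a weight times the Hamming distance on categorical attributes). This is not the paper's attribute conversion strategy; the function name, the weight `gamma`, and the applicant records are hypothetical and chosen only for illustration.

```python
def k_prototype_dissimilarity(x_num, y_num, x_cat, y_cat, gamma=1.0):
    """Standard K-prototype dissimilarity between two mixed-attribute samples.

    Numeric part: squared Euclidean distance.
    Categorical part: Hamming distance (number of mismatching attributes),
    scaled by the weight gamma (an assumed illustrative value here).
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, y_num))
    categorical_part = sum(1 for a, b in zip(x_cat, y_cat) if a != b)
    return numeric_part + gamma * categorical_part


# Hypothetical applicant records: numeric attributes, then (occupation, housing).
a = ([3.0, 5.0], ["engineer", "owner"])
b = ([3.0, 5.0], ["technician", "owner"])
c = ([3.0, 5.0], ["farmer", "owner"])

d_ab = k_prototype_dissimilarity(a[0], b[0], a[1], b[1])
d_ac = k_prototype_dissimilarity(a[0], c[0], a[1], c[1])
print(d_ab, d_ac)
```

Both distances come out equal, because the Hamming distance counts every categorical mismatch as 1 regardless of how closely related the two category values are. This is the kind of information loss that motivates converting categorical attributes before clustering.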
Authors
牛奔
郭晨
唐恒
NIU Ben; GUO Chen; TANG Heng (College of Management, Shenzhen University, Shenzhen 518060, China; Institute of Big Data Intelligent Management and Decision, Shenzhen University, Shenzhen 518060, China; Faculty of Business Administration, University of Macao, Macao 999078, China)
Source
《中国管理科学》
CSSCI
CSCD
Peking University Core Journal
2022, Issue 12, pp. 131-140 (10 pages)
Chinese Journal of Management Science
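Funding
National Natural Science Foundation of China (71971143)
Major Research Plan of the National Natural Science Foundation of China (91846301)
Major Program of the National Natural Science Foundation of China (71790615)
University of Macao (MYRG2018-00051-FBA)
Natural Science Foundation of Guangdong Province (2020A1515010749)
Key Research Foundation of Higher Education of the Guangdong Provincial Education Bureau (2019KZDXM030)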
Keywords
mixed attribute data clustering
bacterial foraging optimization algorithm
multi-objective optimization
multi-learning strategy