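Abstract
A model selection algorithm for robust Bayesian mixture distributions based on the deviance information criterion (DIC) is proposed. Under the variational approximation framework, a formula for computing the DIC of the robust Bayesian mixture model is given. The designed algorithm performs parameter inference and model selection simultaneously, which avoids selecting the best model from a large candidate set according to a model selection criterion. A method for setting the initial parameter values is given. Experiments on simulated data containing many outliers and on the Old Faithful Geyser data show good performance: robust estimates of the mixture component parameters and a fairly accurate number of mixture components.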
Bayesian approaches to robust mixture modelling based on Student-t distributions are less sensitive to outliers and thereby avoid over-estimating the number of mixing components. However, previous methods for model selection under the variational Bayesian framework face two intractable problems: (1) The variational approach converges to a local maximum of the lower bound on the log-evidence, and that maximum depends on the initial parameter values. How can the variational approach guarantee that the initial settings for different models are consistent? (2) The lower bound is sensitive to the factorized approximation form used in the inference process. How can the variational approach guarantee that the approximation errors for different models are equivalent? In this paper, we present a model selection algorithm for robust Bayesian mixture distributions based on the deviance information criterion (DIC) proposed by Spiegelhalter et al. in 2002. Unlike the Bayesian Information Criterion (BIC), the DIC is straightforward to calculate and has been adopted in many modern applications. Inspired by the work of McGrory et al., who used DIC values for model selection in finite Gaussian mixture distributions and hidden Markov models, we derive the calculation of the DIC for the robust Bayesian mixture model. The proposed algorithm learns model parameters and performs model selection simultaneously, which avoids choosing an optimal model from a large set of candidate models. A method for initializing the parameters of the algorithm is provided. Experimental results on simulated data and Old Faithful Geyser data containing a large number of outliers show that the algorithm learns the parameters of the mixture components robustly and estimates the number of components accurately.
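For orientation, the generic DIC of Spiegelhalter et al. (2002) can be sketched as below. This is not the variational DIC derived in the paper: it assumes a toy Gaussian model with known variance, uses a list of posterior draws of the mean in place of a variational posterior, and the function names `deviance` and `dic` are hypothetical. The deviance is D(theta) = -2 log p(y | theta), the effective number of parameters is p_D = mean(D) - D(posterior mean), and DIC = D(posterior mean) + 2 p_D.

```python
import math


def deviance(y, mu, sigma=1.0):
    """D(theta) = -2 * log p(y | theta) for i.i.d. Gaussian data, known sigma."""
    n = len(y)
    sq_err = sum((yi - mu) ** 2 for yi in y)
    log_lik = -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * sq_err / sigma ** 2
    return -2.0 * log_lik


def dic(y, mu_samples, sigma=1.0):
    """Return (DIC, p_D) from posterior draws of the mean (illustrative only)."""
    # Posterior mean deviance, averaged over the draws.
    mean_dev = sum(deviance(y, m, sigma) for m in mu_samples) / len(mu_samples)
    # Deviance evaluated at the posterior mean of the parameter.
    mu_bar = sum(mu_samples) / len(mu_samples)
    dev_at_mean = deviance(y, mu_bar, sigma)
    # Effective number of parameters.
    p_d = mean_dev - dev_at_mean
    return dev_at_mean + 2.0 * p_d, p_d


if __name__ == "__main__":
    y = [0.0, 1.0, 2.0]          # toy observations (assumed)
    mu_samples = [0.5, 1.5]      # toy posterior draws of the mean (assumed)
    dic_value, p_d = dic(y, mu_samples)
    print(dic_value, p_d)
```

Under this definition, a lower DIC indicates a better trade-off between fit and complexity, so candidate models with different numbers of components would be compared by their DIC values.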
Source
Journal of Nanjing University (Natural Science)
Indexed in: CAS, CSCD, PKU Core
2009, Issue 5, pp. 689-698 (10 pages)
Funding
National Natural Science Foundation of China (60674089)
Shanghai Leading Academic Discipline Project (B504)
Keywords
mixture model, variational learning, deviance information criterion, model selection, robust