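Abstract: The density peaks clustering algorithm (Density Peaks Clustering Algorithm, DPC) has two weaknesses: its conventional distance metric does not reflect the data distribution well, and the cutoff distance parameter is chosen by hand, which is highly subjective. To address these problems, an improved density peaks clustering algorithm based on the sparrow search algorithm (Improved Density Peak Clustering Algorithm Based on Sparrow Search Algorithm, SSA-DPC) is designed. The algorithm is improved in two respects: the distance metric between data points is changed, replacing the Euclidean distance of the original algorithm with the standardized Euclidean distance; and the strong global optimization capability of the sparrow search algorithm (Sparrow Search Algorithm, SSA) is used to search for the best cutoff distance value. Simulation tests on 7 data sets show that SSA-DPC outperforms the other clustering algorithms on all 3 evaluation metrics, improving clustering performance and demonstrating the effectiveness of the algorithm.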
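To make the two ingredients named in this abstract concrete, the following is a minimal sketch of density peaks clustering using a standardized Euclidean distance. It is an illustration under stated assumptions, not the authors' SSA-DPC implementation: the function names, the cutoff-kernel density, and the choice of centers by the largest rho * delta product are assumptions, and the cutoff distance `d_c` is simply passed in as a parameter rather than searched with SSA, because the abstract does not state the fitness function used for that search.

```python
import math
from statistics import pstdev


def standardized_distances(points):
    """Pairwise standardized Euclidean distances (each feature scaled by its std)."""
    dims = range(len(points[0]))
    # Guard against constant features, whose standard deviation is zero.
    scale = [pstdev([p[k] for p in points]) or 1.0 for k in dims]
    n = len(points)
    return [
        [
            math.sqrt(sum(((points[i][k] - points[j][k]) / scale[k]) ** 2 for k in dims))
            for j in range(n)
        ]
        for i in range(n)
    ]


def dpc(points, d_c, n_clusters):
    """Hypothetical helper: cluster `points` with a density-peaks procedure."""
    dist = standardized_distances(points)
    n = len(points)
    # Local density: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    # Visit points in order of decreasing density (ties broken by index).
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point gets the largest distance to any other point.
            delta[i] = max(dist[i])
        else:
            # Distance to the nearest point that precedes i in the density order.
            j = min(order[:rank], key=lambda h: dist[i][h])
            delta[i], parent[i] = dist[i][j], j
    # Assumption: take the points with the largest rho * delta as cluster centers.
    centers = sorted(range(n), key=lambda i: -rho[i] * delta[i])[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every remaining point inherits the label of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Toy data: two well-separated 3x3 grids of points.
points = [(0.1 * (i % 3), 0.1 * (i // 3)) for i in range(9)]
points += [(10 + 0.1 * (i % 3), 10 + 0.1 * (i // 3)) for i in range(9)]
labels = dpc(points, d_c=0.5, n_clusters=2)
print(labels)
```

Because the distances are divided by each feature's standard deviation, `d_c` is expressed in standardized units, so features measured on very different scales contribute comparably to the density estimate.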
Funding: Supported by the National Natural Science Foundation of China (No. 31800615 and No. 21933010).
Abstract: The coarse-grained (CG) model carries out molecular dynamics (MD) simulation by simplifying the properties of atoms and the interactions between them. Although some detailed information is lost, the CG model is still the first choice for studying large molecules over long time scales with limited computing resources. Deep learning models loosely mimic the human learning process, treating the network input as an image to achieve good classification and regression results. In this work, TorchMD, an MD framework that combines the CG model with a deep learning model, is applied to study the protein folding process. In a 3D collective variable (CV) space, a modified find-density-peaks algorithm is applied to cluster the conformations from the TorchMD CG simulation. The center conformation of each state is identified, and the boundary conformations between clusters are assigned. The string algorithm is applied to study the path between two states, which are compared with the end conformations from all-atom simulations. The results show that the main features of protein folding with the TorchMD CG model are the same as in the all-atom simulations, but on a shorter simulation time scale. The workflow in this work provides another option for studying protein folding and other related processes with a deep learning CG model.
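As a rough illustration of the boundary-assignment step mentioned in this abstract, the sketch below flags conformations that lie close to a conformation from a different cluster in CV space. The abstract does not describe how the modified algorithm defines boundaries, so this uses the border criterion of the original density peaks method as an assumption: a point is a boundary point if a point from another cluster lies within the cutoff distance `d_c`. The function name, the toy coordinates, and the labels are hypothetical.

```python
import math


def boundary_points(coords, labels, d_c):
    """Indices of points with a neighbor from another cluster within d_c."""
    border = []
    for i, (p, label_i) in enumerate(zip(coords, labels)):
        # A point is on the boundary if any differently labeled point is near it.
        if any(
            label_j != label_i and math.dist(p, q) < d_c
            for q, label_j in zip(coords, labels)
        ):
            border.append(i)
    return border


# Toy 3D CV coordinates for four conformations in two clusters (hypothetical).
cv_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
cluster_labels = [0, 0, 1, 1]
border = boundary_points(cv_coords, cluster_labels, d_c=1.5)
print(border)
```

In this toy case only the two middle conformations have a neighbor from the other cluster within `d_c`, so they are the ones reported as boundary conformations.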