Abstract
We combine the two traditional memory-based collaborative filtering methods and propose a data-driven, personalized hybrid recommendation method for GitHub projects. The method computes similar users dynamically, which keeps the recommendations personalized, and it reaches recommendation quality close to that of the item-based method while using only a small set of similar users. In addition, by building an inverted index and applying K-means clustering, the method alleviates, to some extent, the data sparsity and cold-start problems that the original methods face on GitHub data, where the numbers of users and projects are large but the overlap between them is low. Comparative experiments against the traditional methods verify the effectiveness and superiority of the proposed method.
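The abstract does not give implementation details, so the following is only a minimal sketch of one idea it names: using an inverted index (project to users) so that similarity is computed only against users who share at least one project with the target. The function names, the toy `stars` data, and the choice of cosine similarity over binary star sets are assumptions made for illustration, and the K-means step and the item-based component are omitted.

```python
from collections import defaultdict
from math import sqrt


def build_inverted_index(stars):
    """Map each project to the set of users who starred it."""
    index = defaultdict(set)
    for user, projects in stars.items():
        for project in projects:
            index[project].add(user)
    return index


def similar_users(target, stars, index, k=2):
    """Return the top-k (user, similarity) pairs for `target`.

    Only users that share at least one project with `target` are visited,
    which is what the inverted index buys on sparse data.
    Similarity is cosine similarity over binary star sets (an assumption).
    """
    overlap = defaultdict(int)
    for project in stars[target]:
        for other in index[project]:
            if other != target:
                overlap[other] += 1
    sims = {
        other: count / sqrt(len(stars[target]) * len(stars[other]))
        for other, count in overlap.items()
    }
    # Sort by descending similarity, then by name for a stable order.
    return sorted(sims.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def recommend(target, stars, index, k=2, n=3):
    """Score unseen projects by the summed similarity of neighbours who starred them."""
    scores = defaultdict(float)
    for other, sim in similar_users(target, stars, index, k):
        for project in stars[other] - stars[target]:
            scores[project] += sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [project for project, _ in ranked[:n]]


# Hypothetical toy data: user -> set of starred projects.
stars = {
    "alice": {"numpy", "pandas"},
    "bob": {"numpy", "pandas", "scipy"},
    "carol": {"numpy", "flask"},
    "dave": {"django"},
}
index = build_inverted_index(stars)
```

In this toy data, `dave` shares no project with anyone, so `recommend("dave", stars, index)` returns an empty list. That is the cold-start situation the abstract says the clustering step is meant to ease.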
Authors
何锴琦
马宇骁
张炎
刘华虓
HE Kaiqi; MA Yuxiao; ZHANG Yan; LIU Huaxiao (School of Graduate, Jilin University, Changchun 130012, China; College of Engineering, Northeastern University, Boston 02115, USA; College of Computer Science and Technology, Jilin University, Changchun 130012, China)
Source
《吉林大学学报(理学版)》
CAS
Peking University Core Journals (北大核心)
2020, Issue 6, pp. 1399-1406 (8 pages)
Journal of Jilin University:Science Edition
Funding
Natural Science Foundation of Jilin Province (Grant No. 20190201193JC).
Keywords
data analysis
recommendation system
collaborative filtering
cold start