Computer Technology and Development, 2024, Vol. 34, Issue 5: 170-174. DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0056

Collaborative Filtering Algorithm Based on Reducing Data Sparsity

Xu Wentao¹, Wang Cheng¹

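Author information

  • 1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China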

Abstract

Collaborative filtering is a common approach in recommender systems. Its core idea is to mine user preferences from historical data and to recommend by finding the nearest neighbors of a target object by similarity. Real-world data, however, are usually severely sparse: users or items share too few co-rated items, so traditional similarity measures are computed inaccurately and recommendation accuracy suffers. The traditional Slope One algorithm is not very accurate, but it is simple to implement and runs efficiently, so it can be used to pre-fill sparse data and thereby improve the accuracy of similarity computation. This paper therefore proposes a collaborative filtering algorithm based on reducing data sparsity that incorporates Slope One. First, hierarchical clustering is applied to the user rating data. The Weighted Slope One algorithm then predicts and fills in part of the missing entries of the highly sparse dataset, which substantially reduces data sparsity and improves the accuracy of the Pearson similarity computation. Finally, an object attribute preference similarity is introduced and fused with the Pearson similarity. Validation on the MovieLens 100K dataset shows a reduction in Mean Absolute Error (MAE), indicating that the proposed algorithm improves recommendation accuracy to some extent.
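The pre-filling idea in the abstract can be illustrated with a minimal sketch. It assumes the textbook form of Weighted Slope One and a Pearson correlation computed over co-rated items only; the paper's exact formulas, its clustering step, and the attribute preference fusion are not given in the abstract and are left out here. The function names (`weighted_slope_one`, `prefill`, `pearson`, `mae`) and the toy ratings are hypothetical, chosen only for illustration.

```python
from math import sqrt


def weighted_slope_one(ratings, user, target):
    """Predict `user`'s rating of `target` (textbook Weighted Slope One).

    ratings: dict mapping user -> {item: rating}.
    Returns None when no co-rated item pair supports a prediction.
    """
    num, den = 0.0, 0
    for j, r_uj in ratings[user].items():
        if j == target:
            continue
        # Rating differences (target - j) over users who rated both items.
        diffs = [r[target] - r[j] for r in ratings.values()
                 if target in r and j in r]
        if diffs:
            dev = sum(diffs) / len(diffs)      # average deviation
            num += (dev + r_uj) * len(diffs)   # weight by support count
            den += len(diffs)
    return num / den if den else None


def prefill(ratings, items):
    """Return a copy of `ratings` with missing entries predicted.

    Predictions use only the original ratings, never earlier fills.
    """
    filled = {u: dict(r) for u, r in ratings.items()}
    for u in ratings:
        for i in items:
            if i not in ratings[u]:
                p = weighted_slope_one(ratings, u, i)
                if p is not None:
                    filled[u][i] = p
    return filled


def pearson(ratings, u, v):
    """Pearson correlation over items rated by both users (0.0 if undefined)."""
    common = ratings[u].keys() & ratings[v].keys()
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings[u][i] for i in common) / len(common)
    mu_v = sum(ratings[v][i] for i in common) / len(common)
    cov = sum((ratings[u][i] - mu_u) * (ratings[v][i] - mu_v) for i in common)
    norm = (sqrt(sum((ratings[u][i] - mu_u) ** 2 for i in common))
            * sqrt(sum((ratings[v][i] - mu_v) ** 2 for i in common)))
    return cov / norm if norm else 0.0


def mae(predicted, actual):
    """Mean Absolute Error between two equal-length rating sequences."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy data (hypothetical): bob and carol share only one rated item.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 2},
    "bob": {"a": 3, "b": 4},
    "carol": {"b": 2, "c": 5},
}
filled = prefill(ratings, ["a", "b", "c"])
print(pearson(ratings, "bob", "carol"))  # too few co-rated items before filling
print(pearson(filled, "bob", "carol"))   # computable after filling
```

In this toy case bob and carol have a single co-rated item, so the correlation is undefined before filling (the sketch returns 0.0); after pre-filling, all three items are shared and a correlation can be computed, which is the effect the abstract attributes to reducing sparsity.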

Key words

collaborative filtering/data sparsity/Weighted Slope One/Pearson similarity/object properties

Funding

National Natural Science Foundation of China (61801240)

Publication year

2024
Computer Technology and Development
Shaanxi Computer Society

CSTPCD
Impact factor: 0.621
ISSN: 1673-629X
Number of references: 18