Digital Printing (数字印刷), 2024, Issue 6: 110-116. DOI: 10.19370/j.cnki.cn10-1886/ts.2024.06.014

Research on User Profile Construction Method Based on Improved TF-IDF Algorithm

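Shao Zeming (邵泽明) 1, Li Yu'ang (李宇昂) 1, Yang Ke (杨可) 1, Wang Guopeng (王国鹏) 2, Liu Xingguo (刘兴国) 2, Chen Hanning (陈瀚宁) 1, Si Zhanjun (司占军) 1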

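Author information

  • 1. College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457
  • 2. The Open University of China, Beijing 100039; Engineering Research Center of Integration and Application of Digital Learning Technology, Ministry of Education, Beijing 100039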

Abstract

In the data-driven era of the internet and business, constructing accurate user profiles is essential for understanding and classifying individual users. The traditional TF-IDF algorithm has limitations when evaluating the impact of words on classification results. This study therefore introduces an improved TF-IDF-K algorithm, which incorporates an equalization factor, to construct user profiles by processing and analyzing user search records. A Support Vector Machine (SVM) is then trained on the resulting features to predict users' demographic attributes. Experimental results show that the TF-IDF-K algorithm significantly improves classification accuracy and reliability.
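For orientation, the classical TF-IDF baseline that the study sets out to improve can be sketched as follows. This is a minimal illustration only: it uses the common textbook weighting tf × log(N/df), the function name `tf_idf` and the sample search records are hypothetical, and the equalization factor of TF-IDF-K is not implemented because the abstract does not define it.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (classical TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical tokenized search records, one list per user (not from the paper).
records = [
    ["laptop", "review", "gaming", "laptop"],
    ["lipstick", "review", "skincare"],
    ["gaming", "mouse", "review"],
]
vectors = tf_idf(records)
```

A term that occurs in every record, such as "review" here, receives a weight of zero, while terms concentrated in one user's records score highest. Weight vectors of this kind are what a classifier such as an SVM would take as input.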

Keywords

TF-IDF-K algorithm / User profiling / Equalization factor / SVM

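Publication year

2024
Digital Printing (数字印刷)
China Research Institute of Printing Science and Technology (中国印刷科学技术研究所)

Peking University Core Journal (北大核心)
ISSN: 2095-9540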