
Analysis of Several Influencing Factors on Word2Vec Text Classification Effect

The Word2Vec vector model has numerous parameters, and its classification performance varies across scenarios, so it is necessary to analyze the factors that influence it. Starting from the basic principles of the Word2Vec model, this paper analyzes and discusses how three major factors affect the model's classification performance: the pre-training corpus, the word-vector pre-training parameters, and the classification model parameters. The results indicate that a domain-specific corpus performs better than a general-domain corpus. Among the pre-training parameters, a larger vector dimension gives better results, the window size has an optimal value, and the classification algorithm has little impact. Among the classification model parameters, the learning rate, activation function, and batch size have a relatively large impact on classification performance, while the number of training epochs has a relatively small impact.

Word2Vec; text classification; model effect; influencing factors

Xie Qingheng (谢庆恒)


National Library of China, Beijing 100081


2024

Modern Information Technology (现代信息科技)
Guangdong Electronics Society (广东省电子学会)


ISSN:2096-4706
Year, Volume (Issue): 2024, 8(1)
  • 1
  • 5