Integrated Classification and Detection Method of Webpage Hidden Hyperlink Based on Differential Evolution

A hidden hyperlink, also known as a black link, is a link in a website that is not easily detected by search engines. By covertly embedding high-weight external links, it distorts search engine rankings and damages the web environment. Hidden hyperlinks resemble friendly links: they can raise a site's PR value quickly and effectively, but they also expose the site to a certain level of risk. To address the feature redundancy and curse of dimensionality in current webpage hidden hyperlink detection methods, this paper proposes a machine learning detection method based on an ensemble classifier combined with the differential evolution algorithm. Filter-based feature selection is first applied to the extracted initial feature set. Principal component analysis then performs a second round of feature extraction. Finally, four classifiers (decision tree, random forest, AdaBoost, and support vector machine) are combined into a voting ensemble using differential evolution. Experimental results show that the method achieves high accuracy and reliability, with a correct recognition rate of 99.8442368%, and can provide strong practical support for search engines in detecting hidden hyperlinks.
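
The last step of the method, a voting ensemble whose combination is tuned by differential evolution, can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not state the DE variant, the fitness function, or the voting rule, so the sketch assumes DE/rand/1/bin, accuracy on a validation set as the fitness, and weighted soft voting. The probability matrix is synthetic. In a real pipeline each column would hold the output of a trained decision tree, random forest, AdaBoost, or SVM model applied to the PCA-reduced features. All function names and parameter values are hypothetical.

```python
import random


def weighted_vote(weights, probs):
    """Soft vote: weighted mean of positive-class probabilities, cut at 0.5."""
    total = sum(weights) or 1.0  # avoid division by zero if all weights are 0
    return [
        1 if sum(w * p for w, p in zip(weights, row)) / total >= 0.5 else 0
        for row in probs
    ]


def accuracy(weights, probs, labels):
    """Fraction of samples whose weighted vote matches the true label."""
    preds = weighted_vote(weights, probs)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def differential_evolution(fitness, start, pop_size=20, generations=40,
                           f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin maximising `fitness` over the box [0, 1]^dim."""
    rng = random.Random(seed)
    dim = len(start)
    # The first individual is the supplied starting point; the rest are random.
    pop = [list(start)] + [[rng.random() for _ in range(dim)]
                           for _ in range(pop_size - 1)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j in range(dim):
                if j == forced or rng.random() < cr:
                    mutant = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, mutant)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][j])
            trial_score = fitness(trial)
            if trial_score >= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=lambda k: scores[k])
    return pop[best], scores[best]


def make_toy_validation_set(n=300, seed=1):
    """Synthetic stand-in for four classifiers' outputs (assumed noise levels)."""
    rng = random.Random(seed)
    noise = [0.3, 0.4, 0.5, 0.7]  # hypothetical: a larger value is a weaker model
    labels, probs = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        labels.append(y)
        probs.append([min(1.0, max(0.0, y + rng.gauss(0.0, s))) for s in noise])
    return probs, labels


probs, labels = make_toy_validation_set()
equal_weights = [1.0, 1.0, 1.0, 1.0]
baseline_acc = accuracy(equal_weights, probs, labels)
best_weights, best_acc = differential_evolution(
    lambda w: accuracy(w, probs, labels), start=equal_weights)

print(f"equal-weight accuracy: {baseline_acc:.3f}")
print(f"DE-tuned accuracy:     {best_acc:.3f}")
print("weights:", [round(w, 3) for w in best_weights])
```

Because the population is seeded with the equal-weight vector and selection never accepts a worse trial, the tuned ensemble cannot score below plain majority-style voting on the data used for tuning. Whether the gain carries over to unseen pages would have to be checked on a separate test set.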

Keywords: Hidden hyperlink; Feature selection; Machine learning; Random forest; SVM; Differential evolution

Zhang Ziyan (张紫妍), Han Bin (韩斌), Jiang Yuanhao (姜元昊), Chen Ziwei (陈紫薇)

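School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003

Institute of Machine Learning and New Software Technology, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003

School of Information Engineering, Jingdezhen Ceramic University, Jingdezhen, Jiangxi 333403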

Funding: National Natural Science Foundation of China (62176107); Jiangsu Province Graduate Innovation Training Program (KYCX21_3487)

2024

Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Year, volume (issue): 2024, 41(4)