Investigators from National Institute of Technology Raipur Zero in on Robotics (Exploiting Web Content Semantic Features To Detect Web Robots From Weblogs)
2024 OCT 03 (NewsRx) - By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News - Research findings on Robotics are discussed in a new report. According to news reporting from Raipur, India, by NewsRx journalists, research stated, "Nowadays, web robots are predominantly used for auto-accessing web content, sharing almost one-third of the total web traffic and often posing threats to various web applications' security, privacy, and performance. Detecting these robots is essential, and both online and offline methods are employed."

The news correspondents obtained a quote from the research from the National Institute of Technology Raipur, "One popular offline method is the use of weblog feature-based automated learning. However, this method alone cannot accurately identify web robots that continuously evolve and camouflage. Web content features combined with weblog features are used to detect such robots based on the assumption that human users exhibit specific interests while robots randomly navigate web pages. State-of-the-art web content-based feature methods lack the ability to generate coherent topics, which can confound the performance of classification models. Therefore, we propose a new content semantic feature extraction method that uses the LDA2Vec topic model, combining the strengths of LDA and the Word2Vec model to produce more semantically coherent topics by exploiting website content for a web session. To effectively detect web robots, web resource content semantic features are combined with log-based features in the proposed web robot detection approach. The proposed approach is evaluated in an ecommerce website access logs and content data. The F-score, balanced accuracy, G-mean, and Jaccard similarity are used for performance measures, and the coherence score metric is used to determine the number of topics for a session."
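The four performance measures named in the quote can be illustrated with a minimal Python sketch. It assumes the standard binary-classification definitions (label 1 for a robot session, 0 for a human session); the function name `robot_detection_metrics` and the sample labels are hypothetical and are not taken from the study, which may compute these measures differently.

```python
from math import sqrt


def robot_detection_metrics(y_true, y_pred):
    """Compute F-score, balanced accuracy, G-mean, and Jaccard similarity.

    Assumes binary labels: 1 = robot session, 0 = human session.
    These are the textbook definitions, not necessarily the study's exact setup.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # robots caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # humans passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # humans flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # robots missed

    # Guard every ratio against an empty denominator.
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    f_score = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0

    return {
        "f_score": f_score,
        "balanced_accuracy": (recall + specificity) / 2,
        "g_mean": sqrt(recall * specificity),
        "jaccard": jaccard,
    }


# Hypothetical labels for eight sessions (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = robot_detection_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Balanced accuracy and G-mean both weigh the robot and human classes equally, which is presumably why they are reported alongside the F-score when robot sessions make up a minority of the traffic.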
Raipur, India, Asia, Emerging Technologies, Machine Learning, Nano-robot, Robot, Robotics, National Institute of Technology Raipur