Large scale MicroBlog location data capture method based on dynamic web page parsing
Because Weibo location data is very large in scale, the captured data tends to show a large deviation coefficient and capture efficiency is low. To address this, a large-scale Weibo location data capture method based on dynamic web page parsing is proposed. First, starting from the sources of Weibo location data, an artificial neuron model and random functions are introduced to calculate the weights of the feature data. Next, a feature vector table and a classifier model are generated, and the feature text is filtered using the established classification model. Finally, by matching the feature data of Weibo location data between dynamic script sites and web pages, a dynamic script parsing framework for Weibo location data on web pages is constructed, and dynamic web page parsing is used to capture the data. The experimental results show that the proposed method has a data capture deviation error of only 0.1% and a capture efficiency of 99%. The method can therefore significantly improve the crawling of large-scale Weibo location data and is feasible.
dynamic web page parsing; MicroBlog location data; crawling; artificial neuron model; random function; dynamic script site
Yu Ji, Huanhuan Liu, Zhenzhen Wang, Rui Sun
Organization Department, Hebei Institute of Mechanical and Electrical Technology
Department of Information Engineering, Hebei Institute of Mechanical and Electrical Technology
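
The text-filtering step summarized in the abstract (weight the feature terms, then keep only text judged to be location-related) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the artificial neuron model, the random functions, or the classifier, so this sketch substitutes smoothed inverse document frequency weights and a simple score threshold. The cue-term set, the function names, and the threshold value are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical location cue terms; the paper's actual feature set is not
# given in the abstract.
LOCATION_CUES = {"at", "near", "street", "station", "park", "city"}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real posts need more)."""
    return text.lower().split()


def idf_weights(docs):
    """Smoothed inverse document frequency per term.

    This is a stand-in for the weight calculation in the paper, which uses
    an artificial neuron model and random functions instead.
    """
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}


def location_score(text, weights):
    """Share of a post's total term weight carried by location cue terms."""
    tokens = tokenize(text)
    total = sum(weights.get(t, 0.0) for t in tokens)
    if total == 0:
        return 0.0
    cue = sum(weights.get(t, 0.0) for t in tokens if t in LOCATION_CUES)
    return cue / total


def filter_location_posts(docs, threshold=0.3):
    """Keep posts whose location score reaches the (hypothetical) threshold."""
    weights = idf_weights(docs)
    return [d for d in docs if location_score(d, weights) >= threshold]


if __name__ == "__main__":
    posts = [
        "meet me at central station",
        "i love this song so much",
        "lunch near the city park",
        "what a great movie",
    ]
    for post in filter_location_posts(posts):
        print(post)
```

A thresholded score keeps the sketch dependency-free. A trained classifier, as the abstract describes, would replace `location_score` while the surrounding filter stays the same.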