Journal of Kunming Metallurgy College (昆明冶金高等专科学校学报), 2024, Vol. 40, Issue 6: 73-77. DOI: 10.3969/j.issn.1009-0479.2024.06.012

Study on Data Crawling Strategies Based on Hash Value Calculation (基于哈希值计算的数据爬取策略)

Wang Yanling (王艳玲)

Author information

  • 1. School of Foreign Languages (ASEAN International College), Kunming Metallurgy College, Kunming, Yunnan 650033

Abstract

In the era of big data, the volume of online data is growing explosively. To identify and use the valid data within it, crawling techniques must be employed to collect relevant data on a large scale and to extract the associations and trends among different data sets. Only then can the cost of data analysis be reduced and users be helped to analyze precisely, understand fully, and apply effectively the data they need from the massive data pool. This paper explores data crawling strategies based on hash value calculation. Starting from the concepts of the two and the relationship between them, it analyzes schemes that use hash value calculation to strengthen crawling results, with the aim of improving the effectiveness of data crawling and promoting the positive development of data crawling technology.

Key words

hash value/data crawling/big data

Publication year

2024
Journal of Kunming Metallurgy College
Kunming Metallurgy College

Impact factor: 0.325
ISSN: 1009-0479