现代信息科技 (Modern Information Technology), 2024, Vol. 8, Issue 10: 68-74. DOI: 10.19850/j.cnki.2096-4706.2024.10.015

Python爬虫技术在学术聚合系统中的应用

The Application of Python Crawler Technology in Academic Aggregation Systems

崔梦银 (Cui Mengyin)¹, 邓茵 (Deng Yin)¹, 刘满意 (Liu Manyi)²

Author information

  • 1. 广东科技学院 (Guangdong University of Science and Technology), Dongguan, Guangdong 523083
  • 2. 深圳市环讯通科技有限公司, Shenzhen, Guangdong 518000

Abstract

Crawler technology is one of the core technologies that search engines and information websites use to obtain data, and a specialized web crawler can collect a large amount of useful data from the web in a short time. To provide researchers with the academic resources they need, this paper studies the use of crawler technology to collect paper data from academic websites. It analyzes the application of Python crawler technology in academic aggregation systems, where big data techniques are used to store, clean, aggregate, disambiguate, and fuse the crawled academic data. Python crawler technology plays a key role in such systems: it helps developers build powerful data aggregation and analysis platforms, provides valuable information resources to academic researchers, and is of great significance for academic research, literature retrieval, and information discovery.

Keywords

Python crawler / academic resource / big data technology / academic aggregation system

Publication year

2024
现代信息科技
广东省电子学会

ISSN:2096-4706