The Application of Python Crawler Technology in Academic Aggregation Systems (Python爬虫技术在学术聚合系统中的应用)

Abstract: Crawler technology is one of the core techniques that search engines and information websites use to acquire data, and a purpose-built web crawler can collect a large volume of useful data from the web in a short time. To give researchers access to the academic resources they need, this paper studies the use of crawlers to collect paper data from academic websites and analyzes how Python crawler technology is applied in an academic aggregation system. Big data techniques are used to store, clean, aggregate, disambiguate, and fuse the crawled academic data. Python crawler technology plays a key role in such systems: it helps developers build robust data aggregation and analysis platforms, supplies academic researchers with valuable information resources, and supports academic research, literature retrieval, and information discovery.
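The cleaning, disambiguation, and fusion steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record fields (`title`, `doi`, `authors`, `source`) and the helper names `normalize_title`, `record_key`, and `fuse` are hypothetical, and the matching rule (DOI when present, otherwise a normalized title) is an assumed simplification.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Clean a title: Unicode-normalize, casefold, drop punctuation, collapse spaces."""
    text = unicodedata.normalize("NFKC", title).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def record_key(record: dict) -> str:
    """Disambiguation key (assumed rule): DOI if present, else the normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return f"doi:{doi}"
    return f"title:{normalize_title(record.get('title', ''))}"


def fuse(records: list[dict]) -> list[dict]:
    """Aggregate crawled records by key and fuse duplicates into one entry each."""
    merged: dict[str, dict] = {}
    for record in records:
        entry = merged.setdefault(
            record_key(record),
            {"title": "", "doi": "", "authors": [], "sources": []},
        )
        # Keep the first non-empty title and DOI seen for this key.
        entry["title"] = entry["title"] or record.get("title", "").strip()
        entry["doi"] = entry["doi"] or (record.get("doi") or "").strip()
        # Union the author and source lists while preserving first-seen order.
        for author in record.get("authors", []):
            if author not in entry["authors"]:
                entry["authors"].append(author)
        source = record.get("source")
        if source and source not in entry["sources"]:
            entry["sources"].append(source)
    return list(merged.values())
```

With this rule, two records crawled from different sites whose titles differ only in case, spacing, or punctuation collapse into one entry that lists both sources. A record that carries a DOI will not match a DOI-less copy of the same paper, so a real system would need a stronger matching strategy.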

English keywords:

Python crawler; academic resource; big data technology; academic aggregation system

Authors:

Cui Mengyin (崔梦银), Deng Yin (邓茵), Liu Manyi (刘满意)

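Author affiliations:

Guangdong University of Science and Technology (广东科技学院), Dongguan, Guangdong 523083

Shenzhen Huanxuntong Technology Co., Ltd. (深圳市环讯通科技有限公司), Shenzhen, Guangdong 518000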

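Chinese keywords:

Python爬虫 (Python crawler); 学术资源 (academic resource); 大数据技术 (big data technology); 学术聚合系统 (academic aggregation system)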

Publication year:

2024