Development and Application of Text Recognition and Analysis Tools: The Case of the National Index to Chinese Newspapers & Periodicals Database

The National Index to Chinese Newspapers & Periodicals database holds a large volume of newspaper and periodical data. To study the succession (historical lineage) relationships among these publications in depth, a text analysis tool is used to mine the existing records, and a knowledge graph tool is then used to draw scenario-specific knowledge graphs from the related records. Drawing on published literature on text analysis and knowledge extraction, as well as open-source code, this paper develops a tool, tailored to the database's actual conditions, for extracting periodical succession data. The tool efficiently analyzes the nearly 24,000 newspaper and periodical titles in the database and captures the important succession relationships, laying an important foundation for the subsequent drawing of knowledge graphs.

knowledge graph; knowledge extraction; text analysis

Jiang Jiajia (姜嘉佳)

Shanghai Library (Institute of Scientific and Technical Information of Shanghai), Shanghai 200000

2024

科学与信息化

Year, volume (issue): 2024, (19)