Multi-view Fusion DJ-TextRCNN for the Theme Recommendation of Ancient Texts (多视图融合DJ-TextRCNN的古籍文本主题推荐研究)

Traditional cataloging classification and rule-matching methods suffer from low working efficiency, excessive reliance on expert knowledge, a lack of in-depth mining of the semantics of ancient texts, blurred boundaries between cataloging topics, and difficulty in accurately recommending domain topics for ancient texts. To address these problems, this study draws on the characteristics of ancient book corpora to explore how to accurately recommend text theme content that meets researchers' needs, with the aim of advancing digital humanities research. First, ancient book corpus data previously annotated by the research group were selected for theme category labeling and view classification. Second, a semantic mining model was constructed that integrates a pretrained BERT (bidirectional encoder representations from transformers) model, an improved convolutional neural network, a recurrent neural network, and a multi-head attention mechanism. Finally, a "subject-relationship-object" multi-view semantic enhancement model was incorporated to build the DJ-TextRCNN (DianJi-recurrent convolutional neural networks for text classification) model, which realizes finer-grained, deeper, and more multi-dimensional semantic mining of classic texts. The DJ-TextRCNN model achieved the best accuracy on the ancient book theme recommendation task in every view. Under the "subject-relationship-object" view, precision reached 88.54%, preliminarily realizing accurate theme recommendation for ancient texts. The results offer some guidance for in-depth, fine-grained semantic mining of Chinese culture.
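The results are stated in terms of two different metrics, accuracy (准确率) and precision (精确率), which generally do not coincide on a multi-class theme recommendation task. The following is a minimal sketch of the distinction, not the authors' evaluation code. The topic labels and predictions are hypothetical, and macro averaging is an assumption, since the abstract does not say how precision was averaged across themes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of texts whose predicted theme equals the gold theme."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean over themes of TP / (TP + FP).

    Macro averaging is an assumption made for this sketch.
    """
    labels = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)  # TP + FP per theme
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_label = [
        true_pos[label] / predicted[label] if predicted[label] else 0.0
        for label in labels
    ]
    return sum(per_label) / len(per_label)


# Hypothetical gold and predicted theme labels for six text segments.
gold = ["politics", "politics", "military", "military", "ritual", "ritual"]
pred = ["politics", "military", "military", "military", "ritual", "politics"]

print(f"accuracy        = {accuracy(gold, pred):.4f}")
print(f"macro precision = {macro_precision(gold, pred):.4f}")
```

On this toy data the two values differ, which is why a reported figure such as 88.54% should be read together with the metric it belongs to.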

digital humanities; ancient texts; theme recommendation; multi-view fusion; DJ-TextRCNN

武帅、杨秀璋、何琳

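College of Information Management, Nanjing Agricultural University, Nanjing 211800

School of Cyber Science and Engineering, Wuhan University, Wuhan 430072

School of Information, Guizhou University of Finance and Economics, Guiyang 550025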

Major Project of the National Social Science Fund of China

22&ZD262

2024

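情报学报 (Journal of the China Society for Scientific and Technical Information)
China Society for Scientific and Technical Information; Institute of Scientific and Technical Information of China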

Indexed in: CSTPCD; CSSCI; CHSSCD; Peking University Core Journals (北大核心)
Impact factor: 1.296
ISSN: 1000-0135
Year, volume (issue): 2024, 43(1)