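Keyphrase extraction automatically identifies the words or phrases that reflect the topic of a text and is widely used in text retrieval, text summarization, and related fields. Current keyphrase extraction methods rely mainly on pretrained language models to obtain text representations. These models are trained mostly on single-modality, general-purpose text corpora, so they cannot adapt to a domain according to the characteristics of the downstream task, and their semantic representation ability is limited. This paper proposes MIEnhance-KPE, a Chinese keyphrase extraction method based on multimodal information-enhanced representation. First, an Adapter layer integrates radical information into the layers of the pretrained language model, yielding a domain-adaptive text representation. Second, a convolutional neural network extracts image features of Chinese characters, and a cross-attention mechanism fuses these image features with the text features to enhance the semantic representation of the text. Finally, a conditional random field (CRF) model performs sequence labeling, and a position-word frequency weight is computed for each candidate term to rank the candidates and obtain the keyphrases. Compared with KIEMP, a state-of-the-art keyphrase extraction method, MIEnhance-KPE improves the F value by 15.71% and 3.40% on the public Chinese scientific literature dataset and the self-constructed Chinese education keyphrase extraction dataset, respectively. Ablation experiments show that both the proposed domain adaptation module and the visual semantic enhancement representation module effectively improve the accuracy of keyphrase extraction. MIEnhance-KPE helps educational researchers accurately track trends in educational development and promotes innovation in educational theory and practice.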
A Chinese keyphrase extraction method for multimodal information enhancement representation
[Objective] China is undergoing a critical digital transformation in education. This shift has led to explosive growth of educational content online, and researchers find it increasingly difficult to sift through massive amounts of text data. The need to grasp important information quickly has made keyphrase extraction an invaluable tool. Keyphrase extraction automatically identifies the words or phrases that encapsulate the main themes of a text and is critical for text retrieval, text summarization, and other tasks. Current keyphrase extraction methods, however, rely mainly on pretrained language models to obtain text representations. These models are typically trained on generic, single-modality text corpora; they struggle to adapt to specific domains according to the characteristics of downstream tasks, and their ability to capture subtle semantics is limited. Developing methods for accurate and efficient keyphrase extraction from massive texts therefore remains a pressing research challenge.
[Methods] This paper presents a novel approach to Chinese keyphrase extraction, multimodal information enhancement representation for keyphrase extraction (MIEnhance-KPE). The method first decomposes characters into radicals using a character-splitting dictionary and extracts radical features with a convolutional neural network. At the same time, a trainable adapter layer is inserted between the transformer layers of a pretrained language model. Together, these steps integrate the low-level semantic features of the pretrained language model with the radical features to obtain a domain-adaptive text representation. Characters are then rendered as glyph images in scripts from different historical periods and in different writing styles, and group convolution extracts the glyph features of these characters. A cross-attention mechanism fuses the glyph and text features, yielding richer and more comprehensive semantic representations. Finally, a conditional random field model learns the relationship between the fused features and the labels. Candidate keyphrases are identified through sequence labeling and ranked by a position and word-frequency weight to determine the most relevant keyphrases.
[Results] MIEnhance-KPE was evaluated on two datasets: the public Chinese Scientific Literature (CSL) dataset and the self-constructed Chinese Education Keyphrase Extraction Dataset (CEKED). Compared with KIEMP, a state-of-the-art keyphrase extraction method, it increased the F value by 15.71% and 3.40% on the CSL and CEKED datasets, respectively. Ablation experiments further confirmed that both the domain-adaptive module and the visual semantic enhancement module improve keyphrase extraction accuracy. In addition, this paper compared several methods for fusing glyph and semantic features and found that the cross-attention mechanism is the most effective at adaptively merging the different features to improve task accuracy.
[Conclusions] MIEnhance-KPE considerably improves the accuracy of keyphrase extraction. This helps educational researchers quickly locate relevant literature and follow cutting-edge trends in educational development. MIEnhance-KPE also offers a new approach to literature analysis in the educational sector and provides a solid data foundation for examining the drivers of educational reform and innovation, thereby supporting the digital transformation of education.
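The cross-attention fusion named in the Methods can be written out as standard scaled dot-product attention. This is only a sketch under stated assumptions: the abstract does not say which modality supplies the queries, so text features as queries and glyph features as keys and values is an assumption here, and the symbols $H$, $G$, $W_Q$, $W_K$, $W_V$, $d_k$, and $\tilde{H}$ are introduced for illustration and do not come from the paper.

```latex
% Assumed notation: H \in \mathbb{R}^{n \times d} are text features for n characters,
% G \in \mathbb{R}^{n \times d} are glyph features for the same characters.
\begin{aligned}
Q &= H W_Q, \qquad K = G W_K, \qquad V = G W_V, \\
A &= \operatorname{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right), \\
\tilde{H} &= A V .
\end{aligned}
```

Under this reading, each row of $A$ weights the glyph features by their relevance to one character's text representation, which is one way the fusion could adapt to the input. How $\tilde{H}$ is combined with $H$ before the conditional random field layer is not stated in the abstract.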
Chinese keyphrase extraction; multimodal information; multigranularity semantic features; cross-attention mechanism; domain adaptation
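The final step described in the abstract, turning sequence labels into candidates and ranking them by a position and word-frequency weight, can be sketched as follows. The abstract does not give the weighting formula, so the B/I/O tag scheme, the linear score, the mixing parameter `alpha`, and the function names `decode_bio` and `rank_candidates` are all hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def decode_bio(chars, labels):
    """Collect (phrase, start_index) candidates from B/I/O tags (assumed tag scheme)."""
    spans, start = [], None
    # A trailing "O" sentinel closes a span that runs to the end of the text.
    for i, tag in enumerate(list(labels) + ["O"]):
        if start is not None and tag != "I":
            spans.append(("".join(chars[start:i]), start))
            start = None
        if tag == "B":
            start = i
    return spans


def rank_candidates(spans, text_len, alpha=0.5):
    """Rank phrases by a hypothetical position-frequency score.

    score = alpha * (1 - first_position / text_len)
          + (1 - alpha) * (frequency / max_frequency)
    Earlier and more frequent candidates score higher; ties go to the earlier phrase.
    """
    first_pos, freq = {}, defaultdict(int)
    for phrase, start in spans:
        first_pos.setdefault(phrase, start)  # keep the first occurrence only
        freq[phrase] += 1
    max_freq = max(freq.values(), default=1)
    scores = {
        p: alpha * (1 - first_pos[p] / text_len) + (1 - alpha) * freq[p] / max_freq
        for p in freq
    }
    return sorted(scores, key=lambda p: (-scores[p], first_pos[p]))


# Usage with hand-written labels standing in for CRF output.
chars = list("关键词抽取依赖预训练模型,关键词抽取很重要")
labels = (["B"] + ["I"] * 4 + ["O"] * 2 + ["B"] + ["I"] * 4
          + ["O"] + ["B"] + ["I"] * 4 + ["O"] * 3)
ranked = rank_candidates(decode_bio(chars, labels), len(chars))
print(ranked)
```

In this toy input the phrase that appears twice and starts at the beginning of the text ranks first. A real system would take the labels from the trained conditional random field model and might tune or replace the score.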