
A Biomedical Event Trigger Identification Method Based on Hybrid Neural Network and Attention Mechanism

Biomedical events are an important part of biomedical text mining and play an important role in biomedical research and disease prevention. Trigger identification, a key prerequisite step of biomedical event extraction, aims to extract the keywords that describe event types. Traditional trigger identification methods rely heavily on natural language processing tools for feature extraction, which incurs substantial manual cost. In addition, biomedical literature contains many long sentences, so the long-distance dependency problem is pronounced. To address these problems, we propose a hybrid architecture that combines a residual convolutional neural network (ReCNN) with a bidirectional long short-term memory network (BiLSTM), together with a multi-head attention mechanism (MUH-Attention). The model uses the residual convolutional neural network to extract word-level features and the bidirectional long short-term memory network to capture contextual semantic information. Furthermore, a spatial-domain sliding window divides long sentences into equal-length short sentences, which avoids long-distance dependencies without destroying context information. Experimental results show that the proposed method outperforms state-of-the-art methods on the commonly used Multi-Level Event Extraction (MLEE) corpus, achieving an F-score of 81.15%.
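The abstract does not give the details of the sliding-window step, so the following is only a minimal sketch of one way to cut a long token sequence into equal-length short segments. The function name `sliding_windows`, the `size` and `stride` parameters, the `<pad>` token, and the use of overlapping windows to retain neighbouring context are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List

PAD = "<pad>"  # hypothetical padding token, not taken from the paper


def sliding_windows(tokens: List[str], size: int = 8, stride: int = 4) -> List[List[str]]:
    """Split a token list into windows of exactly `size` tokens.

    Consecutive windows start `stride` tokens apart, so with stride < size
    they overlap and each segment keeps part of its neighbour's context.
    The final window is right-padded with PAD to keep all lengths equal.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if stride > size:
        raise ValueError("stride larger than size would skip tokens")

    windows: List[List[str]] = []
    start = 0
    while True:
        chunk = tokens[start:start + size]
        chunk = chunk + [PAD] * (size - len(chunk))  # pad only when short
        windows.append(chunk)
        if start + size >= len(tokens):  # this window reached the end
            break
        start += stride
    return windows


if __name__ == "__main__":
    sentence = "the protein binds to the receptor and induces rapid phosphorylation".split()
    for window in sliding_windows(sentence, size=4, stride=2):
        print(window)
```

Each window could then be fed to a sequence model separately. Choosing a stride smaller than the window size trades extra computation for overlap between neighbouring segments; whether the paper uses overlap is not stated in the abstract.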

Keywords: biomedical event extraction; trigger detection; ReCNN-BiLSTM; spatial domain sliding window; MUH-Attention; mixed neural network

Ren Yonggong (任永功), Lin Yuzhu (林禹竹), Tang Yujie (唐玉洁), Yu Bo (于博), He Xinyu (何馨宇)


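School of Computer and Artificial Intelligence, Liaoning Normal University, Dalian, Liaoning 116081

Postdoctoral Research Station of Communication and Engineering, Dalian University of Technology, Dalian, Liaoning 116081

Postdoctoral Workstation, Dalian Yongjia Electronic Technology Co., Ltd., Dalian, Liaoning 116081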


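Funding: National Natural Science Foundation of China (62006108); National Natural Science Foundation of China (61976109); Liaoning Province "Xingliao Talents Program" Project (XLYC2006005); Liaoning Province General Higher Education Undergraduate Teaching Reform Research Project (辽教通[2022]166号); Liaoning Province Higher Education Institutions Scientific Research Project (LJKZ0963); Liaoning Normal University Undergraduate Teaching Reform Research and Practice Project (LSJG202210); Key R&D Project of the Department of Science and Technology of Liaoning Province (2022JH2/101300271)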

2024

Acta Electronica Sinica (电子学报)
Chinese Institute of Electronics


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.237
ISSN: 0372-2112
Year, volume (issue): 2024, 52(9)