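[Objective] To survey and synthesize research on multimodal named entity recognition and provide a reference for subsequent studies. [Coverage] Using "multimodal named entity recognition", "multimodal information extraction", and "multimodal knowledge graph" as search terms, we searched the Web of Science, IEEE Xplore, ACM Digital Library, and CNKI databases and selected 83 representative papers. [Methods] We summarize and discuss multimodal named entity recognition research from four aspects (concepts, feature representation, fusion strategies, and pre-trained models) and point out existing problems and future research directions. [Results] Current work on multimodal named entity recognition centers on two aspects, modal feature representation and modal fusion, and has made some progress in the social media domain. Methods for multimodal fine-grained feature extraction and semantic association mapping need further improvement to raise the generalization and interpretability of models. [Limitations] Few papers take multimodal named entity recognition directly as their research topic, which limits the support available for the conclusions of this review. [Conclusions] We outline future trends for the problems that multimodal named entity recognition most urgently needs to solve, offering new ideas for broadening research on multimodal learning in downstream tasks and for overcoming modal barriers and the semantic gap.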
Review of Multimodal Named Entity Recognition Studies
[Objective] This paper reviews research on multimodal named entity recognition to provide a reference for future studies. [Coverage] We selected 83 representative papers by searching the Web of Science, IEEE Xplore, ACM Digital Library, and CNKI databases with the terms "multimodal named entity recognition", "multimodal information extraction", and "multimodal knowledge graph". [Methods] We summarized multimodal named entity recognition research from four aspects: concepts, feature representation, fusion strategies, and pre-trained models. We also identified existing problems and future research directions. [Results] Studies of multimodal named entity recognition focus mainly on modal feature representation and fusion, and have made some progress in the social media domain. Methods for multimodal fine-grained feature extraction and semantic association mapping need further improvement to enhance the generalization and interpretability of models. [Limitations] Few papers take multimodal named entity recognition directly as their research topic, which limits the support for the conclusions of this review. [Conclusions] This study outlines future trends for the pressing problems in multimodal named entity recognition and provides new ideas for expanding research on multimodal learning in downstream tasks, breaking modal barriers, and bridging the semantic gap.
Multimodal Named Entity Recognition; Feature Representation; Multimodal Fusion; Multimodal Pre-training