
Multi-input Fusion Spelling Error Correction Model Based on Contrast Optimization

Chinese spelling correction is essential in text editing. Most existing Chinese spelling correction models are single-input models, which limits both the semantic information they capture and the quality of their corrections. This paper proposes MIF-SECCO, a multi-input fusion spelling error correction model based on contrast optimization. MIF-SECCO has two stages: multi-input semantic learning, and contrastive-learning-driven semantic fusion correction. The first stage integrates the preliminary correction results of multiple single-input models, providing sufficient complementary semantic information for semantic fusion. The second stage uses contrastive learning to optimize the multiple complementary sentence semantics, which keeps the model from over-correcting sentences, and fuses these complementary semantics to re-correct the erroneous sentence, easing the limitations of single-model correction results. Experiments on the public SIGHAN13, SIGHAN14 and SIGHAN15 datasets show that MIF-SECCO effectively improves correction performance.
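
The two stages named in the abstract can be illustrated with a toy sketch. This is not the authors' MIF-SECCO implementation: the function names `collect_candidates` and `info_nce`, the use of cosine similarity, the InfoNCE form of the loss, and the temperature value are all assumptions chosen for illustration, since the abstract does not specify how the contrastive objective is defined.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def collect_candidates(sentence: str,
                       correctors: List[Callable[[str], str]]) -> List[str]:
    """Stage 1 (hypothetical sketch): run every single-input corrector
    and keep all preliminary corrections as complementary inputs."""
    return [correct(sentence) for correct in correctors]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor: Vector, positive: Vector, negatives: List[Vector],
             temperature: float = 0.1) -> float:
    """Stage 2 (assumed objective): InfoNCE-style contrastive loss.
    The loss is small when the anchor embedding is closer to the
    positive embedding than to any negative embedding."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    max_logit = max(logits)  # subtract the max for a stable log-sum-exp
    log_denominator = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits))
    return log_denominator - logits[0]
```

In such a setup, the anchor could be the embedding of a candidate correction, the positive the embedding of the reference sentence, and the negatives embeddings of over-corrected variants; that pairing is likewise an assumption rather than something stated in the abstract.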

Chinese Spelling Error Correction; Multi-input Semantic Learning; Complementary Semantic Fusion; Contrastive Learning Optimization

Wu Yaoyao, Huang Ruizhang, Bai Ruina, Cao Junhang, Zhao Jianhui


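Engineering Research Center of Text Computing and Cognitive Intelligence, Ministry of Education, Guizhou University, Guiyang 550025

State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025

College of Computer Science and Technology, Guizhou University, Guiyang 550025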


National Natural Science Foundation of China (No. 62066007); Guizhou Provincial Science and Technology Support Program (No. 2022277)

2024

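Pattern Recognition and Artificial Intelligence
Chinese Association of Automation; National Research Center for Intelligent Computing Systems; Hefei Institute of Intelligent Machines, Chinese Academy of Sciences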


CSTPCD; Peking University Core Journals
Impact factor: 0.954
ISSN: 1003-6059
Year, volume (issue): 2024, 37(1)