To address the inaccuracy of contrastive learning methods when measuring the similarity of hard question sentences, a similarity measure method for questions based on corrected-pairwise contrastive learning is proposed. Semantic features are captured using a pretrained model followed by a Bi-LSTM module. The sentence embedding is composed using an attention mechanism and an average pooling strategy to focus on both key information and the overall semantic representation. A corrected-pairwise contrastive loss function is designed to improve similarity scores and separability in the semantic space. Experimental results show that the proposed method achieves higher F1 values and accuracy in question similarity measurement than the baseline models.
Key words
similarity measure/contrastive learning/sentence embedding/semantic representation/pretrained model/self-attention mechanism/natural language processing
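
The pipeline summarized in the abstract (token features, attention plus average pooling, a pairwise contrastive loss on the similarity score) can be illustrated with a minimal pure-Python sketch. Several things here are assumptions, not details taken from the paper: the token vectors stand in for Bi-LSTM outputs, the `query` vector is a hypothetical attention parameter, the two pooled vectors are combined by concatenation, and the loss is the standard margin-based pairwise form, not the corrected-pairwise loss, whose exact formulation the abstract does not give.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sentence_embedding(token_states, query):
    """Concatenate attention-pooled and average-pooled token states.

    token_states: per-token vectors (stand-ins for Bi-LSTM outputs).
    query: hypothetical attention vector (assumption, not from the paper).
    """
    weights = softmax([dot(h, query) for h in token_states])
    dim = len(token_states[0])
    attended = [sum(w * h[i] for w, h in zip(weights, token_states))
                for i in range(dim)]
    averaged = [sum(h[i] for h in token_states) / len(token_states)
                for i in range(dim)]
    return attended + averaged  # list concatenation (assumed combination)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def pairwise_contrastive_loss(sim, label, margin=0.5):
    """Standard margin-based pairwise loss on a similarity score.

    label 1: similar pair, pushed toward sim = 1.
    label 0: dissimilar pair, penalized only when sim exceeds the margin.
    """
    if label == 1:
        return (1.0 - sim) ** 2
    return max(0.0, sim - margin) ** 2


# Usage: two toy "questions", each with two 2-dimensional token states.
question_a = [[1.0, 0.0], [0.0, 1.0]]
question_b = [[0.9, 0.1], [0.2, 0.8]]
query = [0.5, 0.5]

emb_a = sentence_embedding(question_a, query)
emb_b = sentence_embedding(question_b, query)
similarity = cosine(emb_a, emb_b)
loss_if_similar = pairwise_contrastive_loss(similarity, label=1)
```

In this sketch a dissimilar pair contributes no loss once its similarity falls below `margin`, which is the usual way such losses encourage separability in the embedding space; the value 0.5 is an arbitrary illustrative choice.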