Tracing and analyzing false information are important tools for suppressing its spread in social networks. Traditional traceability methods are designed mainly for structured data, so they struggle to judge the derivation relation between texts accurately. To address this problem, a text derivation relation mining method that combines a Bayesian network with RoBERTa is proposed. Text vectors are obtained with the RoBERTa model, which is also used to make a preliminary prediction of the derivation relation between texts, yielding a classification label that indicates whether a derivation relation exists. A Bayesian network is then constructed from the distance measurements between the texts and between their vectors, the time span information, and the text classification labels, and is used to judge the text derivation relation. Experimental results show that the precision, recall, and F1 score of the proposed method are higher than those of the comparison methods, verifying the effectiveness of the method.
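The final step, in which a Bayesian network combines the distance, time span, and classification label evidence, can be illustrated with a minimal sketch. The abstract does not give the network structure or its parameters, so the sketch assumes the simplest possible structure (one hidden "derived" node that is the parent of three discretized evidence nodes), and every probability, node name, and bin name below is a hypothetical placeholder rather than a value from the paper. The evidence is taken as already computed; in practice the distance would come from the RoBERTa text vectors and the label from the RoBERTa classifier.

```python
# Hypothetical prior P(derived); illustrative only, not from the paper.
PRIOR = {True: 0.3, False: 0.7}

# Hypothetical conditional probability tables P(evidence value | derived).
# Structure assumption: "derived" is the sole parent of each evidence node.
CPT = {
    "distance": {  # discretized distance between text vectors
        True: {"near": 0.7, "far": 0.3},
        False: {"near": 0.2, "far": 0.8},
    },
    "time_span": {  # discretized gap between publication times
        True: {"short": 0.6, "long": 0.4},
        False: {"short": 0.3, "long": 0.7},
    },
    "label": {  # preliminary classifier output (1 = derivation predicted)
        True: {1: 0.9, 0: 0.1},
        False: {1: 0.15, 0: 0.85},
    },
}


def posterior_derived(evidence):
    """Return P(derived=True | evidence) by enumerating both hidden states."""
    joint = {}
    for state in (True, False):
        p = PRIOR[state]
        for node, value in evidence.items():
            p *= CPT[node][state][value]
        joint[state] = p
    return joint[True] / (joint[True] + joint[False])


# A text pair that is close, published close in time, and flagged by the classifier.
strong = posterior_derived({"distance": "near", "time_span": "short", "label": 1})
# A text pair with the opposite evidence.
weak = posterior_derived({"distance": "far", "time_span": "long", "label": 0})
print(round(strong, 3), round(weak, 3))
```

With these placeholder tables, agreeing evidence pushes the posterior toward 1 and conflicting evidence pulls it toward 0, which is the sense in which the network refines the classifier's preliminary label instead of simply accepting it.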
Key words
data provenance/text derivation/Bayesian network/pre-trained language model/derivation relation/text distance/probabilistic model