CHATGPT GENERATED TEXT DETECTION MODEL BASED ON SHAPLEY ADDITIVE EXPLANATIONS
To quickly identify whether text content was generated by ChatGPT, this paper proposes an AI-generated text detection model based on BERT-BiGRU. We use pre-trained BERT (bidirectional encoder representations from transformers) to extract semantic features of the text, and a BiGRU (bidirectional gated recurrent unit) for comprehensive feature extraction. The classification performance of the BERT-BiGRU model is evaluated on HC3 (human ChatGPT comparison corpus), a dataset for AI-generated text detection. Shapley additive explanations (SHAP) are introduced to compare and analyze, from both global and local dimensions, the keywords and benchmark values identified by different models. Experimental results show that although both the deep learning and pre-trained BERT classification models achieve good classification accuracy, their performance declines seriously on datasets they were not trained on, whereas the BERT-BiGRU model still maintains high accuracy. The keywords that these models identify on the same dataset differ considerably, and most of them are numbers, rare characters, and punctuation marks. This suggests that these models do not truly understand the inherent characteristics of human-written and AI-generated text, and that models trained on existing closed datasets cannot truly cope with open, practical application scenarios.
ChatGPT; SHAP; BERT; BiGRU; HC3; AI-generated text detection
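
The keyword analysis summarized in the abstract rests on Shapley values, which assign each input token a share of the model's output. The following is a minimal sketch of that idea, not the paper's implementation: it computes exact Shapley values by enumerating token subsets, which is only feasible for very short inputs (the SHAP library relies on approximations instead). The scoring function `toy_score` and its weights are hypothetical stand-ins for a trained detector such as BERT-BiGRU, and they were chosen only to mirror the observation that punctuation and numbers can dominate the attributions.

```python
from itertools import combinations
from math import factorial


def shapley_values(tokens, score):
    """Exact Shapley value of each token position for score(list_of_tokens).

    Enumerates every subset of the other tokens, so the cost grows
    exponentially with len(tokens). Illustration only.
    """
    n = len(tokens)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [tokens[j] for j in subset]
                with_i = [tokens[j] for j in sorted(subset + (i,))]
                # Marginal contribution of token i to this coalition
                values[i] += weight * (score(with_i) - score(without_i))
    return values


# Hypothetical detector score: an additive toy model, NOT a real classifier.
TOY_WEIGHTS = {"overall": 0.3, ",": 0.1, "1": 0.2}


def toy_score(tokens):
    return sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)


tokens = ["overall", ",", "it", "works"]
phi = shapley_values(tokens, toy_score)
base_value = toy_score([])  # score of the empty input in this sketch

for token, value in zip(tokens, phi):
    print(f"{token!r}: {value:+.3f}")
```

Because the toy score is additive, each token's Shapley value equals its weight, and the attributions plus the base value sum to the score of the full input. A real detector is not additive, so its attributions depend on token interactions, which is one reason different models can rank different keywords highly on the same text.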