Research on the Differences and Recognition between Scholar Writing and AI-Generated Content: Taking the Research Field of Library Health Services as an Example
To empirically analyze the characteristics of, and differences between, abstracts written by scholars and abstracts generated by GPT-4 in the field of library health services research, this paper selects 185 published academic papers on library health services as research subjects. Based on each paper's title, GPT-4 is prompted to generate a corresponding abstract, and the scholar-written and generated abstracts together form the dataset. HanLP 2.1 is used to segment the abstracts, and TF-IDF is used for vectorization; the data are then cleaned through six feature screening and six dimensionality reduction procedures. Thirteen machine learning methods are evaluated, and the results are analyzed from the perspective of text content. The study finds that, with dimensionality reduction applied, the LightGBM classification method can completely distinguish whether an abstract was written by a scholar or generated by GPT-4; in terms of word count and sentence count, scholar-written and GPT-4-generated abstracts are basically consistent; and in the topic model analysis, the similarity between the two reaches 50%, indicating a certain degree of similarity between scholar writing and GPT-4 generation. Machine learning algorithms have potential applications in distinguishing AI-generated content from scholar-written content, but the two clearly resemble each other in form rather than in substance. Scholars should pay attention to the accuracy, authenticity, and logical reasoning of AI-generated content, and use AI tools with caution.
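The TF-IDF vectorization step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy token lists stand in for HanLP-segmented abstracts, the smoothed IDF formula is one common variant chosen here as an assumption, and the names `tf_idf` and `cosine` are hypothetical helpers introduced only for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (vocabulary, vectors) for a list of pre-tokenized documents."""
    n_docs = len(docs)
    vocab = sorted({tok for doc in docs for tok in doc})
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Term frequency is the token count divided by document length.
        vectors.append(
            [counts[tok] / total * idf[tok] if total else 0.0 for tok in vocab]
        )
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy, made-up token lists standing in for segmented abstracts.
docs = [
    ["library", "health", "service", "library"],
    ["health", "information", "service"],
    ["reading", "therapy"],
]
vocab, vectors = tf_idf(docs)
print(vocab)
print(round(cosine(vectors[0], vectors[1]), 3))
```

Vectors of this kind would then be passed to feature screening, dimensionality reduction, and a classifier; documents that share no tokens, such as the first and third toy examples, have a cosine similarity of zero.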