Comparative Study on ChatGPT and Scholars' Abstracts: Taking the Field of Information Resource Management as an Example

Abstract: [Purpose/Significance] This study examines the similarities and differences between Chinese paper abstracts generated by ChatGPT and those written by scholars, to inform the detection of AI-generated academic papers and related research. [Method/Process] First, taking the field of information resource management as an example, 500 highly cited papers from the past three years were sampled from each of library science, information science, and archival science; ChatGPT was then prompted with each paper's title to generate a corresponding abstract, and these texts were used to construct the dataset. Second, nine machine learning and deep learning algorithms were used to classify abstracts as ChatGPT-generated or scholar-written. Finally, the two sets of abstracts were compared from multiple angles: text features, topic models, and ROUGE evaluation. [Result/Conclusion] Mainstream machine learning and deep learning algorithms trained on the dataset can effectively distinguish AI-generated abstracts from scholar-written ones; BERT and ERNIE perform best overall, and RF and XGBoost perform best among the machine learning algorithms. ChatGPT-generated abstracts contain more characters and sentences than scholar-written ones, and their keywords are mostly templated transitional words. The topics of the two sets largely coincide but differ on themes such as "disciplinary system" and "digital humanities". Quantitative analysis with ROUGE and cosine similarity indicates that ChatGPT-generated abstracts resemble scholar-written abstracts in form ("shape-like") rather than in substance ("spirit-like").
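The ROUGE and cosine-similarity comparison mentioned in the abstract can be illustrated with a minimal sketch. The record does not state which ROUGE variant, tokenizer, or vector representation the authors used, so everything below is an assumption for illustration: the function names `rouge_n` and `cosine_similarity` are hypothetical, inputs are pre-tokenized lists of strings, and the cosine score is computed over raw token counts.

```python
from collections import Counter
from math import sqrt


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, f1) based on clipped n-gram overlap.

    `candidate` and `reference` are lists of tokens. How the paper
    tokenized Chinese text is not stated; this is an assumption.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum count per n-gram
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-token count vectors of two token lists."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


# Toy example (invented sentences, not data from the paper).
scholar = "the study compares abstracts".split()
generated = "the study analyzes abstracts".split()
print(rouge_n(generated, scholar, n=1))
print(cosine_similarity(generated, scholar))
```

High surface-overlap scores of this kind measure shared wording only, which is one way to read the abstract's distinction between resemblance in form and resemblance in substance.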

Keywords:

ChatGPT; text classification; text features; paper abstract

Authors:

张强, 王潇冉, 高颖, 王常珏, 周洪

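Author affiliations:

School of Humanities, Anhui Polytechnic University, Wuhu 241000

School of Information Management, Central China Normal University, Wuhan 430079

School of Computer and Information, Anhui Polytechnic University, Wuhu 241000

School of Economics and Management, University of Chinese Academy of Sciences, Beijing 101190

Wuhan Documentation and Information Center, Chinese Academy of Sciences, Wuhan 430071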

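Funding:

Anhui Provincial Major Teaching Research Project; Young Leading Talent Project of the Wuhan Documentation and Information Center, Chinese Academy of Sciences

Project numbers:

2020jyxm0152; E0KZ451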

Publication year:

2024

DOI:

10.13266/j.issn.0252-3116.2024.08.004

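Library and Information Service (图书情报工作)

National Science Library, Chinese Academy of Sciences (中国科学院文献情报中心)

CSTPCD; CSSCI; CHSSCD; Peking University Core Journals (北大核心)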

Impact factor: 2.203

ISSN: 0252-3116

Year, volume (issue): 2024, 68(8)

Number of references: 26