Application and Measurement Validity Evaluation of Generative Artificial Intelligence in Content Analysis
This study explores the application prospects and potential validity loss of generative artificial intelligence (AI) models such as GPT in content analysis research. Analyzing Chinese and English social media texts related to climate change, it systematically evaluates how GPT's measurement validity in coding three core concepts of journalism and communication studies (i.e., cognition, emotion, and stance) differs across several dimensions: language/dataset, prompt-tuning strategy, and GPT model version. It also examines the potential reasons behind these differences. Findings reveal that GPT tends to over-interpret textual content and shows a bias toward "neutral" texts. In multidimensional comparisons, no significant cross-linguistic/dataset differences were found, and GPT-4 shows higher measurement validity than GPT-3.5 in some categories. The study further shows that a prompt-tuned GPT model can improve coding accuracy to some extent, but that introducing more examples may lead to a certain degree of validity loss. Finally, the research finds that word- and semantic-level features of a text can affect GPT's measurement validity.
Keywords: GPT; Large Language Model; Content Analysis; Generative AI; Validity
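To make the coding setup concrete, below is a minimal sketch of how zero-shot versus few-shot (prompt-tuned) stance coding might be implemented with the OpenAI Python client. The prompt wording, label set, model names, and example texts are illustrative assumptions, not the study's actual coding instrument.

```python
# Minimal sketch of zero-shot vs. few-shot stance coding via the OpenAI
# Python client. Labels, prompt wording, and examples are hypothetical
# stand-ins for the study's coding scheme.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["supportive", "neutral", "opposed"]  # assumed stance categories


def code_stance(text: str,
                examples: list[tuple[str, str]] | None = None,
                model: str = "gpt-4") -> str:
    """Ask the model to assign one stance label to a social media post."""
    messages = [{
        "role": "system",
        "content": ("You are a content-analysis coder. Classify the stance "
                    "of each post toward climate action as one of: "
                    f"{', '.join(LABELS)}. Reply with the label only."),
    }]
    # Few-shot variant: prepend labeled examples as prior conversation turns.
    for ex_text, ex_label in examples or []:
        messages.append({"role": "user", "content": ex_text})
        messages.append({"role": "assistant", "content": ex_label})
    messages.append({"role": "user", "content": text})
    resp = client.chat.completions.create(model=model,
                                          messages=messages,
                                          temperature=0)
    return resp.choices[0].message.content.strip().lower()


# Zero-shot call:
#   code_stance("Carbon taxes will wreck the economy.")
# Few-shot call (per the abstract, adding more examples does not
# monotonically improve validity):
#   code_stance("We need renewables now.",
#               examples=[("The IPCC report is alarmist.", "opposed")])
```

Comparing such zero-shot and few-shot runs against human-coded labels, per language and per model version, is one way the validity differences described in the abstract could be measured.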