

Application and Measurement Validity Evaluation of Generative Artificial Intelligence in Content Analysis
This study aims to explore the application prospects and possible validity loss of Generative Artificial Intelligence (AI) models such as GPT in content analysis research. By analyzing Chinese and English social media texts related to climate change, this study systematically evaluates the differences in GPT's measurement validity when coding three core concepts of journalism and communication studies (i.e., cognition, emotion, and stance) across several dimensions: language/dataset, prompt-tuning strategy, and GPT model version. It also examines the potential reasons behind these differences. Findings reveal that GPT tends to over-interpret textual content and shows a bias toward "neutral texts". In multidimensional comparisons, no significant cross-linguistic/dataset differences in coding validity were found; GPT-4 shows higher measurement validity in some categories than its 3.5 version. The study also discloses that prompt-tuned GPT models can improve coding accuracy to some extent, but introducing more example samples may lead to a certain degree of validity loss. Furthermore, this research finds that the word- and semantic-level features of a text can affect GPT's measurement validity.

Keywords: GPT; Large Language Model; Content Analysis; Generative AI; Validity

Authors: 程萧潇, 吴栎骞


Affiliation: College of Media and International Culture, Zhejiang University


Funding: National Social Science Fund of China Youth Project, 2023 (Grant No. 23CXW034)

Journal: 全球传媒学刊 (Global Media Journal), CSSCI
ISSN:
Year, Volume (Issue): 2024, 11(2)