
Methods for Detecting Bias in Large-Scale Models Applied to Social Governance
With the rapid development of artificial intelligence, large language models (LLMs) have been widely applied in many areas of social governance, such as online public opinion monitoring, owing to their analytical, generative, and reasoning capabilities. However, because large models are predominantly trained on open-source data, much of the underlying text is likely to contain various biases, which may introduce prejudice when the models are deployed for social governance. This research investigates political bias in LLMs such as GPT-3.5 and LLaMA. Despite the impartiality claimed by their training institutions, the literature indicates that LLMs frequently exhibit bias when responding to controversial topics, potentially causing problems in governance. We propose a novel empirical methodology in which multiple LLMs role-play as supporters or opponents of specific targets (such as policies, figures, or topics) and design corresponding questions and answers. A model under test is then evaluated by comparing its answers against the role-generated ones, to infer whether it carries political biases. To alleviate concerns over the randomness of generated text, multiple responses were collected for each question, with the question order randomized in every round. Experiments demonstrate that different LLMs exhibit significant partisan biases when handling contentious issues. These findings raise profound concerns that the direct application of LLMs could spread and even amplify controversial information. Our research provides important implications for policymakers, the media, and political and academic stakeholders.
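
The abstract describes the detection pipeline only in prose; below is a minimal, illustrative Python sketch of one plausible reading of it. The `query_model` wrapper, the prompt wording, and the bag-of-words cosine comparison are assumptions introduced for illustration, not the paper's actual implementation.

```python
# A minimal sketch of the role-play bias-probing procedure described in the
# abstract. query_model() is a hypothetical stand-in for whatever LLM API is
# used; the similarity measure is a simple bag-of-words cosine, standing in
# for whatever answer-comparison method the paper actually employs.
import random
from collections import Counter
from math import sqrt

def query_model(model: str, prompt: str) -> str:
    """Hypothetical wrapper around an LLM API call (e.g., GPT-3.5, LLaMA)."""
    raise NotImplementedError("plug in a real LLM client here")

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two answers (illustrative only)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def probe_bias(target: str, questions: list[str], role_models: list[str],
               test_model: str, rounds: int = 5) -> float:
    """Return a score in [-1, 1]: > 0 means the tested model's answers sit
    closer to the 'support' role answers, < 0 closer to the 'oppose' ones."""
    # 1. Role-play: auxiliary models answer each question while playing a
    #    supporter and an opponent of the target (policy, figure, topic...).
    support_ref = {q: [query_model(m, f"You strongly support {target}. Answer: {q}")
                       for m in role_models] for q in questions}
    oppose_ref = {q: [query_model(m, f"You strongly oppose {target}. Answer: {q}")
                      for m in role_models] for q in questions}

    # 2. Repeated, order-randomized probing of the model under test, to
    #    reduce the influence of generation randomness and question order.
    leanings = []
    for _ in range(rounds):
        order = questions[:]
        random.shuffle(order)
        for q in order:
            answer = query_model(test_model, q)
            s = max(cosine_similarity(answer, r) for r in support_ref[q])
            o = max(cosine_similarity(answer, r) for r in oppose_ref[q])
            leanings.append(s - o)  # positive: closer to the supporting role

    # 3. Average leaning across rounds and questions as the bias estimate.
    return sum(leanings) / len(leanings)
```

Under this reading, repeating the probe over several order-randomized rounds and averaging is what addresses the concern about the randomness of generated text that the abstract raises.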

large language models (LLM); bias detection; social governance

Lin Hui, Guo Qinglang, Wang Yingxue, Huang Hu


China Academy of Electronics and Information Technology, Beijing 100041

Guangdong Provincial Key Laboratory of Intelligent Computing for Public Service Supply, Shenzhen 518055

National Engineering Research Center for Public Safety Risk Perception and Control by Big Data, Beijing 100041

Peking University Shenzhen Graduate School, Shenzhen 518055, Guangdong



Funding: National Key R&D Program of China (2021YFC3300500, 2022YFC0869800)

Journal of China Academy of Electronics and Information Technology
Published by: China Academy of Electronics and Information Technology
Impact factor: 0.663
ISSN: 1673-5692
Year, Volume (Issue): 2024, 19(1)