
How good are large language models at product risk assessment?

Product safety professionals must assess the risks to consumers associated with the foreseeable uses and misuses of products. In this study, we investigate the utility of generative artificial intelligence (AI), specifically large language models (LLMs) such as ChatGPT, across a number of tasks involved in the product risk assessment process. For a set of six consumer products, prompts were developed related to failure mode identification, the construction and population of a failure mode and effects analysis (FMEA) table, risk mitigation identification, and guidance to product designers, users, and regulators. These prompts were input into ChatGPT and the outputs were recorded. A survey was administered to product safety professionals to ascertain the quality of the outputs. We found that ChatGPT generally performed better at divergent thinking tasks such as brainstorming potential failure modes and risk mitigations. However, there were errors and inconsistencies in some of the results, and the guidance provided was perceived as overly generic, occasionally outlandish, and not reflective of the depth of knowledge held by a subject matter expert. When tested against a sample of other LLMs, similar patterns in strengths and weaknesses were demonstrated. Despite these challenges, a role for LLMs may still exist in product risk assessment to assist in ideation, while experts may shift their focus to critical review of AI-generated content.
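The FMEA tables referenced above conventionally rank each failure mode by a Risk Priority Number (RPN), the product of severity, occurrence, and detection ratings (each scored 1-10). The following is a minimal sketch of such a table; the product failure modes and ratings are hypothetical examples, not data from the study.

```python
# Minimal FMEA table sketch. RPN = severity x occurrence x detection
# is the standard FMEA prioritization metric; all entries below are
# illustrative placeholders, not results from the study.
from dataclasses import dataclass

@dataclass
class FmeaRow:
    failure_mode: str
    effect: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (certain to detect) .. 10 (undetectable)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

rows = [
    FmeaRow("Cord fraying", "Electric shock", severity=9, occurrence=3, detection=4),
    FmeaRow("Overheating", "Burn injury", severity=7, occurrence=4, detection=3),
]

# Rank failure modes by RPN, highest risk first.
for row in sorted(rows, key=lambda r: r.rpn, reverse=True):
    print(f"{row.failure_mode}: RPN = {row.rpn}")
```

In the study's workflow, the LLM would be prompted to propose the failure modes and populate the ratings, with the resulting RPNs then reviewed by a subject matter expert.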

Keywords: FMEA; Generative AI; Product safety

Zachary A. Collier, Richard J. Gruss, Alan S. Abrahams


Department of Management, Radford University, Radford, Virginia, USA

Department of Business Information Technology, Virginia Tech, Blacksburg, Virginia, USA

2025

Risk Analysis


ISSN:0272-4332
Year, Volume (Issue): 2025, 45(4)