Preliminary Exploration of ChatGPT with Vision in the Recognition and Diagnostic Value of Breast Ultrasound Lesions

何奕宗¹, 姚振强², 何小娜², 李玉山², 唐语苓¹, 周祖邦²

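Author information

1. First Clinical Medical College, Gansu University of Chinese Medicine, Lanzhou 730000, China
2. Department of Ultrasound Medicine, Gansu Provincial Hospital, Lanzhou 730000, China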

Abstract

Objective To evaluate the ability of Chat Generative Pre-trained Transformer 4.0 with vision (ChatGPT4V) to recognize lesions in breast ultrasound images, and its diagnostic value for malignant breast lesions, by comparison with a junior and a senior sonographer.

Methods Breast ultrasound images and clinical information from female patients were downloaded from The Cancer Imaging Archive (TCIA). Images from 50 randomly selected patients were interpreted independently by ChatGPT4V and by two sonographers. The McNemar test was used to compare the accuracy of ChatGPT4V and the sonographers in identifying lesion characteristics and in assigning BI-RADS categories. Receiver operating characteristic (ROC) curves were drawn to assess the ability of ChatGPT4V and the sonographers to differentiate benign from malignant lesions.

Results The accuracy of ChatGPT4V in identifying lesion shape, margins, and calcifications did not differ significantly from that of the junior sonographer (P > 0.05), but its accuracy in identifying echo pattern and posterior acoustic features was lower than that of the junior sonographer (P < 0.05). Its accuracy in identifying lesion margins, posterior acoustic features, and calcifications did not differ significantly from that of the senior sonographer (P > 0.05), but its accuracy in identifying echo pattern and shape was lower than that of the senior sonographer (P < 0.05). The accuracy of ChatGPT4V in assigning BI-RADS categories did not differ significantly from that of the sonographers (P > 0.05). In the ROC analysis, the area under the curve (AUC) of ChatGPT4V for diagnosing malignant lesions did not differ significantly from that of the junior sonographer (P = 0.421), but was significantly lower than that of the senior sonographer (P = 0.031).

Conclusions ChatGPT4V shows some ability to recognize and interpret breast ultrasound images; however, further research is required to determine whether it can be applied in actual clinical diagnostic practice.
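The two statistics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `mcnemar_exact` and `auc_from_scores` are hypothetical, the exact (binomial) form of the McNemar test is an assumption (the abstract does not say which variant was used), and treating the BI-RADS category as the ordinal score for the ROC analysis is likewise an assumption. The values in the demonstration are made up for illustration and are not study data.

```python
from math import comb


def mcnemar_exact(a_correct, b_correct):
    """Two-sided exact McNemar p-value for paired correct/incorrect calls.

    a_correct[i] and b_correct[i] say whether reader A and reader B
    each got case i right. Only discordant pairs carry information.
    """
    if len(a_correct) != len(b_correct):
        raise ValueError("paired inputs must have the same length")
    only_a = sum(1 for a, b in zip(a_correct, b_correct) if a and not b)
    only_b = sum(1 for a, b in zip(a_correct, b_correct) if b and not a)
    n = only_a + only_b
    if n == 0:
        return 1.0  # the readers never disagree on correctness
    k = min(only_a, only_b)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def auc_from_scores(scores, labels):
    """AUC as the probability that a malignant case (label 1) receives a
    higher score than a benign case (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical illustration only (not data from the study).
model_correct = [True, True, True, True, False, False]
reader_correct = [True, False, False, False, True, False]
p_value = mcnemar_exact(model_correct, reader_correct)

birads_scores = [3, 4, 4, 2]  # assumed ordinal scores
malignant = [1, 1, 0, 0]      # 1 = malignant, 0 = benign
auc = auc_from_scores(birads_scores, malignant)
print(f"McNemar exact p = {p_value:.3f}, AUC = {auc:.3f}")
```

The sketch covers only a single AUC; comparing two AUCs measured on the same cases, as reported in the Results, needs a paired method that is not shown here.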

Key words

Large language model/Artificial intelligence/Breast nodule/Breast cancer/Diagnostic performance

Publication year

2025

Chinese Journal of Ultrasound in Medicine (中国超声医学杂志)

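Institute of Scientific and Technical Information of China (ISTIC); Chinese Association of Ultrasound in Medicine and Engineering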

Indexed in: CSCD; Peking University Core Journals (北大核心)

Impact factor: 1.534

ISSN: 1002-0101
