Shanghai Jiao Tong University School of Medicine Reports Findings in Artificial Intelligence (Leveraging Large Language Models for Improved Patient Access and Self-Management: Assessor-Blinded Comparison Between Expert- and AI-Generated Content)

By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News – New research on Artificial Intelligence is the subject of a report. According to news reporting out of Shanghai, People’s Republic of China, by NewsRx editors, research stated, “While large language models (LLMs) such as ChatGPT and Google Bard have shown significant promise in various fields, their broader impact on enhancing patient health care access and quality, particularly in specialized domains such as oral health, requires comprehensive evaluation. This study aims to assess the effectiveness of Google Bard, ChatGPT-3.5, and ChatGPT-4 in offering recommendations for common oral health issues, benchmarked against responses from human dental experts.”

Our news journalists obtained a quote from the research from the Shanghai Jiao Tong University School of Medicine, “This comparative analysis used 40 questions derived from patient surveys on prevalent oral diseases, which were executed in a simulated clinical environment. Responses, obtained from both human experts and LLMs, were subject to a blinded evaluation process by experienced dentists and lay users, focusing on readability, appropriateness, harmlessness, comprehensiveness, intent capture, and helpfulness. Additionally, the stability of artificial intelligence responses was assessed by submitting each question 3 times under consistent conditions. Google Bard excelled in readability but lagged in appropriateness when compared to human experts (mean 8.51, SD 0.37 vs mean 9.60, SD 0.33; P=.03). ChatGPT-3.5 and ChatGPT-4, however, performed comparably with human experts in terms of appropriateness (mean 8.96, SD 0.35 and mean 9.34, SD 0.47, respectively), with ChatGPT-4 demonstrating the highest stability and reliability.

Furthermore, all 3 LLMs received superior harmlessness scores comparable to human experts, with lay users finding minimal differences in helpfulness and intent capture between the artificial intelligence models and human responses. LLMs, particularly ChatGPT-4, show potential in oral health care, providing patient-centric information for enhancing patient education and clinical care. The observed performance variations underscore the need for ongoing refinement and ethical considerations in health care settings.”

Shanghai, People’s Republic of China, Asia, Artificial Intelligence, Emerging Technologies, Machine Learning

2024

Robotics & Machine Learning Daily News

ISSN:
Year, Volume (Issue): 2024 (May 8)