Vocational School of Health Services Reports Findings in Artificial Intelligence (Artificial intelligence in reproductive endocrinology: an in-depth longitudinal analysis of ChatGPTv4’s month-by-month interpretation and adherence to clinical ...)

Abstract

By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News – New research on Artificial Intelligence is the subject of a report. According to news reporting out of Istanbul, Turkey, by NewsRx editors, research stated, “To quantitatively assess the performance of ChatGPTv4, an Artificial Intelligence Language Model, in adhering to clinical guidelines for Diminished Ovarian Reserve (DOR) over two months, evaluating the model’s consistency in providing guideline-based responses. A longitudinal study design was employed to evaluate ChatGPTv4’s response accuracy and completeness using a structured questionnaire at baseline and at a two-month follow-up.”

Our news journalists obtained a quote from the research from the Vocational School of Health Services, “ChatGPTv4 was tasked with interpreting DOR questionnaires based on standardized clinical guidelines. The study did not involve human participants; the questionnaire was exclusively administered to the ChatGPT model to generate responses about DOR. A guideline-based questionnaire with 176 open-ended, 166 multiple-choice, and 153 true/false questions were deployed to rigorously assess ChatGPTv4’s ability to provide accurate medical advice aligned with current DOR clinical guidelines. AI-generated responses were rated on a 6-point Likert scale for accuracy and a 3-point scale for completeness. The two-phase design assessed the stability and consistency of AI-generated answers over two months. ChatGPTv4 achieved near-perfect scores across all question types, with true/false questions consistently answered with 100% accuracy. In multiple-choice queries, accuracy improved from 98.2 to 100% at the two-month follow-up. Open-ended question responses exhibited significant positive enhancements, with accuracy scores increasing from an average of 5.38 ± 0.71 to 5.74 ± 0.51 (max: 6.0) and completeness scores from 2.57 ± 0.52 to 2.85 ± 0.36 (max: 3.0). It underscored the improvements as significant (p < 0.001), with positive correlations between initial and follow-up accuracy (r = 0.597) and completeness (r = 0.381) scores. The study was limited by the reliance on a controlled, albeit simulated, setting that may not perfectly mirror real-world clinical interactions. ChatGPTv4 demonstrated exceptional and improving accuracy and completeness in handling DOR-related guideline queries over the studied period.”

Key words

Istanbul/Turkey/Eurasia/Artificial Intelligence/Emerging Technologies/Endocrinology/Gynecology/Health and Medicine/Machine Learning/Women’s Health

Publication year

2024

Robotics & Machine Learning Daily News
