Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial

Dr AI Will See You Now -- Generative AI Doesn’t Improve Diagnostic Accuracy of Physicians

Background: Diagnostic errors in medicine are common and can result in significant patient harm. Artificial intelligence (AI) technologies have long been proposed as a basis for tools that improve diagnostic accuracy. Objective: To compare the diagnostic reasoning performance of physicians using a commercial large language model (LLM) chatbot (ChatGPT Plus) with that of physicians using conventional diagnostic resources such as UpToDate and Google. Design: Single-blinded, randomized, multicenter study with stratified randomization. Participants: 50 licensed attending and resident physicians in internal medicine, family medicine, and emergency medicine, recruited from 3 separate institutions. Methods: Participants were randomized either to the LLM interface group, which was given access to ChatGPT-4 to aid in determining their diagnosis, or to a control group. Clinical vignettes were based on actual patients and included information based on initial diagnostic evaluation, i...
