
ChatGPT passes radiology board exam — ScienceDaily


The latest version of ChatGPT passed a radiology board-style exam, highlighting the potential of large language models but also revealing limitations that hinder reliability, according to two new research studies published in Radiology, a journal of the Radiological Society of North America (RSNA).

ChatGPT is an artificial intelligence (AI) chatbot that uses a deep learning model to recognize patterns and relationships between words in its vast training data to generate human-like responses based on a prompt. But since there is no source of truth in its training data, the tool can generate answers that are factually wrong.

“The use of large language models like ChatGPT is skyrocketing and will continue to rise,” said lead author Rajesh Bhayana, MD, FRCPC, abdominal radiologist and technology lead at University Medical Imaging Toronto, Toronto General Hospital in Toronto, Canada. “Our research provides insights into the performance of ChatGPT in a radiology context, highlighting the incredible potential of large language models, along with current limitations that make it unreliable.”

ChatGPT was recently named the fastest-growing consumer app ever, and similar chatbots are being incorporated into popular search engines like Google and Bing that doctors and patients use to find medical information, Dr. Bhayana noted.

To assess its performance on radiology board exam questions and explore its strengths and limitations, Dr. Bhayana and colleagues first tested ChatGPT based on GPT-3.5, currently the most widely used version. The researchers used 150 multiple-choice questions designed to match the style, content, and difficulty of the Canadian Royal College and American Board of Radiology exams.

The questions did not include images and were grouped by question type to obtain information on performance: lower order thinking (recall knowledge, basic understanding) and higher order (apply, analyze, synthesize). Higher-order thinking questions were further subclassified by type (description of imaging findings, clinical management, calculation and classification, disease associations).

ChatGPT performance was assessed overall and by question type and topic. Confidence in the language of the responses was also assessed.

The researchers found that ChatGPT based on GPT-3.5 correctly answered 69% of the questions (104 of 150), close to the 70% passing grade used by the Royal College in Canada. The model performed relatively well on questions that required lower-order thinking (84%, 51 of 61) but struggled with questions that required higher-order thinking (60%, 53 of 89). More specifically, it had trouble with higher-order questions involving description of imaging findings (61%, 28 of 46), calculation and classification (25%, 2 of 8), and application of concepts (30%, 3 of 10). Its poor performance on higher-order thinking questions was not surprising given its lack of radiology-specific pretraining.
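The percentages above follow directly from the reported counts. As a quick check, the short Python sketch below recomputes them; the dictionary labels and the `percent` helper are illustrative names chosen here, not anything taken from the studies.

```python
# Counts reported in the article for ChatGPT based on GPT-3.5: (correct, total).
# The labels are illustrative; only the numbers come from the article.
results = {
    "overall": (104, 150),
    "lower-order thinking": (51, 61),
    "higher-order thinking": (53, 89),
    "description of imaging findings": (28, 46),
    "calculation and classification": (2, 8),
    "application of concepts": (3, 10),
}

PASS_MARK = 70  # percent, the Royal College passing grade cited in the article


def percent(correct, total):
    """Return accuracy as a whole-number percentage."""
    return round(100 * correct / total)


for label, (correct, total) in results.items():
    print(f"{label}: {percent(correct, total)}% ({correct} of {total})")

overall = percent(*results["overall"])
print("passes" if overall >= PASS_MARK else "falls short of the pass mark")
```

The lower-order and higher-order groups add up to the full set (61 + 89 = 150 questions, 51 + 53 = 104 correct), so the reported breakdown is internally consistent.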

GPT-4 was released in March 2023 in limited form to paid users, with the specific claim of improved advanced reasoning capabilities over GPT-3.5.

In a follow-up study, GPT-4 correctly answered 81% (121 of 150) of the same questions, outperforming GPT-3.5 and passing the 70% pass threshold. GPT-4 performed much better than GPT-3.5 on higher order thinking questions (81%), more specifically those related to describing imaging findings (85%) and applying concepts (90%).

The findings suggest that GPT-4’s enhanced advanced reasoning abilities translate into enhanced performance in a radiology context. They also suggest a better contextual understanding of radiology-specific terminology, including image descriptions, which is critical to enable future downstream applications.

“Our study demonstrates an impressive improvement in the performance of ChatGPT in radiology in a short period of time, highlighting the growing potential of large language models in this context,” said Dr. Bhayana.

GPT-4 showed no improvement on lower-order thinking questions (80% vs. 84%) and incorrectly answered 12 questions that GPT-3.5 answered correctly, raising questions about its reliability for information gathering.
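The reported totals also imply how the two models' answers overlap. The sketch below works this out, assuming both studies scored the identical 150 questions, as the article states. The derived figures (questions both models got right, questions only GPT-4 got right, questions both got wrong) are arithmetic consequences of the reported counts, not numbers given in the article, and the variable names are illustrative.

```python
TOTAL = 150          # questions in the exam set
GPT35_CORRECT = 104  # reported for GPT-3.5
GPT4_CORRECT = 121   # reported for GPT-4
REGRESSIONS = 12     # GPT-3.5 correct, GPT-4 incorrect (reported)

# Questions both models answered correctly.
both_correct = GPT35_CORRECT - REGRESSIONS

# Questions GPT-4 answered correctly that GPT-3.5 missed.
gains = GPT4_CORRECT - both_correct

# Questions neither model answered correctly.
both_wrong = TOTAL - both_correct - REGRESSIONS - gains

print(f"both correct: {both_correct}")
print(f"GPT-4 only:   {gains}")
print(f"GPT-3.5 only: {REGRESSIONS}")
print(f"both wrong:   {both_wrong}")
```

Under that assumption, the net gain of 17 questions (121 versus 104) is the difference between the questions GPT-4 newly answered correctly and the 12 it newly got wrong.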

“We were initially surprised by ChatGPT’s accurate and confident answers to some challenging radiological questions, but then we were equally surprised by some very illogical and inaccurate statements,” said Dr. Bhayana. “Of course, given the way these models work, inaccurate answers shouldn’t be particularly surprising.”

ChatGPT’s dangerous tendency to produce inaccurate responses, called hallucinations, is less prevalent in GPT-4, but still limits usability in education and medical practice today.

Both studies showed that ChatGPT consistently used confident language, even when it was incorrect. This is particularly dangerous if it is relied on as the sole source of information, Dr. Bhayana notes, especially for novices who may not recognize confident incorrect answers as inaccurate.

“To me, this is its biggest limitation. Today, ChatGPT is best used to generate ideas, help start the medical writing process, and summarize data. If it’s used to quickly retrieve information, it should always be checked,” said Dr. Bhayana.

