Artificial intelligence can be a useful tool for health professionals and researchers when interpreting diagnostic images. While a radiologist can identify fractures and other abnormalities from an X-ray, AI models can see patterns that humans cannot, offering the opportunity to expand the effectiveness of medical imaging.
But a study in Scientific Reports highlights a hidden challenge of using AI in medical imaging research: the phenomenon of highly accurate but potentially misleading results known as “shortcut learning.”
Researchers analyzed more than 25,000 knee X-rays and trained AI models to predict patient traits that have no plausible connection to the images. While these predictions have no medical basis, the models achieved surprising levels of accuracy by exploiting subtle, unintended patterns in the data.
“While AI has the potential to transform medical imaging, we must be cautious,” says the study’s senior author, Dr. Peter Schilling, an orthopedic surgeon at Dartmouth Health’s Dartmouth Hitchcock Medical Center and assistant professor of orthopedics at Dartmouth’s Geisel School of Medicine.
“These models can see patterns that humans can’t, but not all of the patterns they identify are meaningful or reliable,” Schilling says. “It is crucial to recognize these risks to avoid misleading conclusions and ensure scientific integrity.”
The researchers examined how AI algorithms often rely on confounding variables (such as differences in X-ray equipment or clinical site markers) to make predictions rather than medically meaningful characteristics. Attempts to eliminate these biases were only marginally successful: AI models would simply “learn” other hidden data patterns.
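The dynamic described above can be illustrated with a toy sketch (not the study’s actual data or models): if a medically meaningless label happens to correlate with a confounder such as a clinical site marker, a model that looks only at that marker can still score high accuracy. All names and numbers below are hypothetical.

```python
# Toy illustration of "shortcut learning": a medically meaningless label
# can still be "predicted" accurately when a confounder (here, a
# hypothetical per-clinic site marker) happens to correlate with it.
import random

random.seed(0)

def make_record(site):
    # Each "image" is just 5 noise features plus one site-marker feature.
    features = [random.gauss(0.0, 1.0) for _ in range(5)]
    features.append(1.0 if site == "A" else 0.0)  # confounder: clinic site marker
    # The label has no real signal in the noise features, but site A
    # happens to contribute mostly label-1 records (e.g., patient mix).
    if site == "A":
        label = 1 if random.random() < 0.9 else 0
    else:
        label = 1 if random.random() < 0.1 else 0
    return features, label

data = [make_record("A") for _ in range(500)] + [make_record("B") for _ in range(500)]

# A "model" that ignores the image content entirely and reads only the
# site marker -- the shortcut.
def shortcut_predict(features):
    return 1 if features[-1] == 1.0 else 0

accuracy = sum(shortcut_predict(f) == y for f, y in data) / len(data)
print(f"shortcut accuracy: {accuracy:.2f}")  # high, despite no real signal
```

The point of the sketch is that removing one such marker does not help if another correlated artifact (scanner model, exposure settings, acquisition year) remains in the data, which mirrors the paper’s finding that debiasing attempts were only marginally successful.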
“This goes beyond bias from race or gender cues,” says Brandon Hill, a co-author of the study and a machine learning scientist at Dartmouth Hitchcock. “We found that the algorithm could even learn to predict the year an X-ray was taken. Researchers should be aware of how easily this happens when using this technique.”
The findings underscore the need for rigorous evaluation standards in AI-based medical research. Overreliance on standard algorithms without deeper scrutiny could lead to erroneous clinical insights and treatment pathways.
“The burden of proof increases greatly when it comes to using models for the discovery of new patterns in medicine,” says Hill. “Part of the problem is our own bias. It’s incredibly easy to fall into the trap of assuming the model ‘sees’ the same thing we do. In the end, it doesn’t.”
“AI is almost like dealing with extraterrestrial intelligence,” Hill continues. “You want to say that the model cheats, but that anthropomorphizes the technology. It learned a way to solve the task assigned to it, but not necessarily how a person would do it. It has no logic or reasoning as we normally understand it.”
Schilling, Hill and study co-author Frances Koback, a third-year medical student at Dartmouth’s Geisel School, conducted the study in collaboration with the Veterans Affairs Medical Center in White River Junction, Vermont.