Stanford Medicine researchers have created an artificial intelligence tool that can read thousands of medical notes in electronic medical records and detect trends, providing information that doctors and researchers hope will improve care.
Typically, experts seeking answers to questions about care must pore over hundreds of medical records. But new research shows that large language models—artificial intelligence tools that can find patterns in complex written language—can take over this tedious work, and that their findings could have practical uses. For example, AI tools could monitor patient records for mentions of dangerous drug interactions or could help doctors identify patients who will respond well or poorly to specific treatments.
The AI tool, described in a study published online Dec. 19 in Pediatrics, was designed to determine from medical records whether children with attention-deficit/hyperactivity disorder (ADHD) received appropriate follow-up care after being prescribed new medications.
“This model allows us to identify some gaps in the treatment of ADHD,” said the study’s senior author, Yair Bannett, MD, assistant professor of pediatrics.
The study’s lead author is Heidi Feldman, MD, Ballinger-Swindells professor of developmental and behavioral pediatrics.
The research team used insights from the tool to identify tactics that could improve how doctors follow up with ADHD patients and their families, Bannett said, adding that the power of such AI tools could be applied to many aspects of medical care.
Hard work for a human, a breeze for AI
Electronic medical records contain information such as lab results or blood pressure measurements in a format that computers can easily compare across many patients. But everything else (about 80% of the information in any medical record) is in the notes that doctors write about the patient’s care.
Although these notes are useful to the next person who reads a patient’s chart, their free-form sentences are difficult to parse en masse. This less organized information must be categorized before it can be used for research, usually by a person reading the notes for specific details. The new study looked at whether artificial intelligence could take over that task.
The study used medical records for 1,201 children who were between 6 and 11 years old, were patients at 11 pediatric primary care practices in the same health care network, and had a prescription for at least one ADHD medication. Such medications can have disturbing side effects, such as suppressing a child’s appetite, so it is important for doctors to ask about side effects when patients first use the medications and adjust doses as necessary.
The team trained an existing large language model to read medical notes, looking to see if children or their parents were asked about side effects in the first three months of taking a new medication. The model was trained with a set of 501 notes that the researchers reviewed. The researchers counted any note that mentioned the presence or absence of side effects (e.g., “reduced appetite” or “no weight loss”) as an indication that follow-up had been performed, while notes with no mention of side effects were counted as meaning no follow-up had been performed.
These human-reviewed notes served as what is known in AI as “ground truth” for the model: the research team used 411 of the notes to teach the model what an inquiry about side effects looked like, and the remaining 90 to verify that the model could accurately find such inquiries. They then manually reviewed 363 additional notes and tested the model’s performance again, finding that it classified approximately 90% of them correctly.
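For readers curious about the mechanics, here is a minimal sketch of this train-then-verify workflow. It uses a toy TF-IDF plus logistic-regression classifier from scikit-learn as a stand-in for the study’s fine-tuned large language model (the paper’s actual model and training setup are not reproduced here), and all note texts and labels below are invented.

```python
# Sketch of the study's classification setup, with scikit-learn as a
# lightweight stand-in for the fine-tuned large language model.
# Note texts and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Human-reviewed "ground truth": label 1 if the note mentions the presence
# OR absence of side effects (follow-up happened), 0 otherwise.
train_notes = [
    "Mom reports reduced appetite since starting methylphenidate.",
    "No weight loss or sleep problems noted on current dose.",
    "Refill requested; dose unchanged.",
]
train_labels = [1, 1, 0]

# Held-out notes play the role of the study's 90-note verification set.
held_out_notes = ["Denies headaches or appetite change on stimulant."]
held_out_labels = [1]

# Fit on the training split (411 notes in the study), then check accuracy
# on the held-out split (the study later saw ~90% on a 363-note test set).
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_notes, train_labels)
print(accuracy_score(held_out_labels, clf.predict(held_out_notes)))
```

The split-train-verify logic shown here mirrors the study’s design even though the underlying model differs.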
Once the large language model was performing well, the researchers used it to rapidly evaluate all 15,628 notes in the patients’ charts, a task that would have taken more than seven months of full-time work without AI.
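Scaling up is then just batch inference plus aggregation. Continuing the sketch above (reusing the fitted clf), this hypothetical snippet scores every note and tallies a follow-up rate per practice; the field names and note texts are invented.

```python
# Hypothetical batch scoring over all notes (15,628 in the study),
# aggregating a follow-up rate per pediatric practice.
from collections import defaultdict

all_notes = [
    {"practice": "A", "text": "No appetite change on current dose."},
    {"practice": "B", "text": "Refill requested; dose unchanged."},
]

counts = defaultdict(lambda: [0, 0])  # practice -> [follow-ups, total]
for note in all_notes:
    label = clf.predict([note["text"]])[0]  # 1 = side effects discussed
    counts[note["practice"]][0] += int(label)
    counts[note["practice"]][1] += 1

for practice, (followed, total) in counts.items():
    print(f"Practice {practice}: {followed}/{total} notes show follow-up")
```

Per-practice tallies like these are the kind of aggregate pattern described in the next section.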
From analysis to better care
The AI analysis surfaced information the researchers would not otherwise have detected. For example, it revealed that some pediatric practices frequently asked about medication side effects during phone conversations with patients’ parents, while other practices did not.
“That’s something you would never be able to detect if you didn’t implement this model on 16,000 notes like we did, because no human being would sit down and do that,” Bannett said.
The AI also found that pediatricians asked follow-up questions about certain medications less frequently. Children with ADHD may be prescribed stimulants or, less commonly, non-stimulant medications, such as some types of anti-anxiety medications. Doctors were less likely to ask about the latter category of medications.
The finding offers an example of the limits of what AI can do, Bannett said: It could detect a pattern in patient records but not explain why the pattern was there.
“We really had to talk to pediatricians to understand this,” he said, noting that pediatricians told him they had more experience managing the side effects of stimulants.
The AI tool may have missed some inquiries about drug side effects, the researchers said: some conversations about side effects may never have been recorded in patients’ electronic medical records, and some patients received specialized care, such as from a psychiatrist, that was not captured in the records used in this study. The tool also misclassified some notes that concerned the side effects of medications prescribed for other conditions, such as acne treatments.
Guiding AI
As scientists build more AI tools for medical research, they must consider what those tools do well and what they do poorly, Bannett said. Some tasks, such as sorting through thousands of medical records, are ideal for an appropriately trained AI tool.
Others, such as understanding the ethical pitfalls of the medical landscape, will require careful human thinking, he said. An editorial that Bannett and his colleagues recently published in Hospital Pediatrics explains some of the potential problems and how they could be addressed.
“These AI models are based on existing health care data, and we know from many studies over the years that health care disparities exist,” Bannett said. Researchers need to think about how to mitigate those biases both when building AI tools and when deploying them, he said, adding that, with the right precautions, he is excited about AI’s potential to help doctors do their jobs better.
“Each patient has their own experience and the doctor has their knowledge base, but with AI I can put the knowledge of large populations at their fingertips,” he said. For example, AI could eventually help doctors predict, based on a patient’s age, race/ethnicity, genetic profile, and combination of diagnoses, whether the individual is likely to have a negative side effect from a specific medication, he said. “That can help doctors make personalized decisions about medical treatment.”
The research was supported by the Stanford Maternal and Child Health Research Institute and the National Institute of Mental Health (grant K23MH128455).