We May Be Overestimating Association Between Gut Bacteria and Disease, Machine Learning Study Finds

Many bacteria-related diseases, such as inflammatory bowel disease or colorectal cancer, are associated with an overgrowth of gut bacteria thought to be harmful. But when researchers used a machine learning algorithm to predict the density of microbes (called microbial load) from people's gut microbiomes, they found that changes in microbial load, rather than the disease itself, could be a driver behind the presence of disease-associated microbial species.

The researchers report November 13, 2024 in the Cell Press journal Cell that differences in patients' microbial load, which was found to be influenced by factors including age, sex, diet, country of origin, and antibiotic use, were a key factor behind the microbial signatures in fecal samples, even in patients with diseases.

“We were surprised to find that many microbial species, previously thought to be associated with diseases, were more clearly explained by changes in microbial load,” says Peer Bork of the European Molecular Biology Laboratory (EMBL) in Heidelberg, one of the lead authors of the study. “This indicates that these species are primarily associated with symptoms such as diarrhea and constipation, rather than being directly related to the diseases themselves.”

Microbial load has long been recognized as an important factor in microbiome research, but large-scale analysis has been largely limited due to the high cost and labor-intensive nature of experimental methods, which the researchers overcame with a machine learning approach. They developed a fecal microbial load prediction model based on the relative composition of the microbiome and applied it to a large-scale metagenomic data set to explore its variation in health and disease.
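The general idea of predicting load from relative composition can be illustrated with a minimal sketch. This is not the researchers' model: it swaps in a simple k-nearest-neighbour regressor for brevity, and the taxa, abundance values, load values, the unit (log10 cells per gram), and the function names are all hypothetical, made up for illustration only.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two relative-abundance profiles."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def predict_log_load(profile, training, k=2):
    """Average the log10 loads of the k training samples most similar to `profile`."""
    nearest = sorted(training, key=lambda item: bray_curtis(profile, item[0]))[:k]
    return sum(log_load for _, log_load in nearest) / len(nearest)


# Made-up toy training set (NOT study data): each entry pairs the relative
# abundances of three hypothetical taxa with an assumed measured log10 load.
TRAINING = [
    ([0.70, 0.20, 0.10], 11.2),
    ([0.65, 0.25, 0.10], 11.0),
    ([0.20, 0.30, 0.50], 10.4),
    ([0.15, 0.35, 0.50], 10.2),
]

# A new sample with only relative abundances available.
new_sample = [0.68, 0.22, 0.10]
predicted = predict_log_load(new_sample, TRAINING, k=2)
print(f"predicted log10 microbial load: {predicted:.2f}")
```

The sketch only shows the shape of the task: samples with experimentally measured load serve as training data, and a sample that has only a relative-abundance profile receives a predicted load.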

“Measuring microbial load in fecal samples requires a lot of effort and we were pleased to have access to two large metagenomic data sets where microbial load had been measured experimentally,” says Michael Kuhn, also from EMBL and another lead author of the study. “With our approach, we want to generalize these data for the benefit of a broader field and with the tools we provide, the microbial load can be predicted for all studies of the adult human gut microbiome.”

The data sets the team generated for the research comprise thousands of metagenomes and experimentally measured microbial loads from the EU-funded GALAXY (Gut-and-Liver Axis in Alcoholic Liver Fibrosis) project and the Novo Nordisk Foundation-funded MicrobLiver project. They also used metagenomes and microbial load data from a previously published MetaCardis study population. For the exploration data sets, they used tens of thousands of metagenomes from previous studies, including study populations from Japan and Estonia.

The team recognizes limitations to the work. Because the analysis was based solely on associations, they were unable to establish a clear direction of causality or provide mechanistic insight. Furthermore, the developed method is only applicable to the human gut microbiome: different training data sets are needed to predict microbial load in other habitats.

Future research will focus on microbial species that are most directly associated with diseases, regardless of microbial load, to better understand their roles in disease etiology and their potential use as biomarkers. Additionally, adapting this prediction model to other environments, such as ocean and soil microbiomes, could provide more insight into microbial ecology on a global scale.