
Study Outlines Machine-Based Model for Identifying Drug-Resistant Bacteria
Investigators use Salmonella strains to test accuracy and resistance.
With
A recent study suggests a solution to this challenge may be on the horizon.
In a collaboration among computer scientists, engineers, infectious disease specialists, and others, investigators have built models that use whole genome sequence data to predict minimum inhibitory concentrations (MICs) for nontyphoidal Salmonella strains collected and sequenced through the National Antimicrobial Resistance Monitoring System between 2002 and 2016. The team published their
Investigators could not be reached for comment on deadline; however, in their concluding remarks, they noted, “In this study, we have built machine learning-based MIC prediction models for nontyphoidal Salmonella genomes using XGBoost that achieve overall accuracies of 95% to 96%... To our knowledge, this is one of the largest and most accurate MIC prediction models to be published to date. Importantly, it provides a model strategy for performing MIC prediction directly from genome sequence data that could be applied to other human or veterinary pathogens.”
The investigators, who have also developed a similar approach for
The team used a collection of 5278 nontyphoidal Salmonella genomes to generate XGBoost-based machine learning models that predict MICs for 15 antibiotics: ampicillin, amoxicillin/clavulanic acid, ceftriaxone, azithromycin, chloramphenicol, ciprofloxacin, trimethoprim/sulfamethoxazole, sulfisoxazole, cefoxitin, gentamicin, kanamycin, nalidixic acid, streptomycin, tetracycline, and ceftiofur. The MIC prediction models, tested with 10-fold cross-validation, were found to have an overall average accuracy of 95% within ±1 two-fold dilution step, an average very major error (VME) rate of 2.7%, and an average major error (ME) rate of 0.1%.
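For readers unfamiliar with these metrics, the following is a minimal sketch of how they are conventionally computed; it is not the investigators' code. It assumes the standard definitions: a prediction counts as accurate if it falls within one two-fold dilution step of the measured MIC, a very major error is a resistant isolate predicted as susceptible, and a major error is a susceptible isolate predicted as resistant. The function names, the breakpoint values, and the sample MICs are all hypothetical and chosen only for illustration.

```python
import math


def within_one_dilution(measured, predicted):
    """True if predicted MIC is within +/-1 two-fold dilution of the measured MIC."""
    # On a log2 scale, one two-fold dilution step is a difference of 1.
    return abs(math.log2(predicted) - math.log2(measured)) <= 1.0 + 1e-9


def categorize(mic, susceptible_max, resistant_min):
    """Map an MIC to S/I/R using hypothetical breakpoints (assumed, not from the study)."""
    if mic <= susceptible_max:
        return "S"
    if mic >= resistant_min:
        return "R"
    return "I"


def evaluate(measured, predicted, susceptible_max, resistant_min):
    """Return accuracy within +/-1 dilution, VME rate, and ME rate."""
    pairs = list(zip(measured, predicted))
    accuracy = sum(within_one_dilution(m, p) for m, p in pairs) / len(pairs)

    truth = [categorize(m, susceptible_max, resistant_min) for m, _ in pairs]
    calls = [categorize(p, susceptible_max, resistant_min) for _, p in pairs]

    n_resistant = truth.count("R")
    n_susceptible = truth.count("S")
    # Very major error: truly resistant, predicted susceptible.
    vme = sum(t == "R" and c == "S" for t, c in zip(truth, calls))
    # Major error: truly susceptible, predicted resistant.
    me = sum(t == "S" and c == "R" for t, c in zip(truth, calls))

    return {
        "accuracy": accuracy,
        "vme_rate": vme / n_resistant if n_resistant else 0.0,
        "me_rate": me / n_susceptible if n_susceptible else 0.0,
    }


if __name__ == "__main__":
    # Made-up MICs (in micrograms per mL) for six isolates and one antibiotic.
    measured = [2, 4, 32, 64, 1, 8]
    predicted = [4, 4, 4, 64, 1, 32]
    print(evaluate(measured, predicted, susceptible_max=8, resistant_min=32))
```

Note that the VME rate is taken over resistant isolates and the ME rate over susceptible isolates, which is why the two rates can differ widely even when overall accuracy is high.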
“The model predicts MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains,” they wrote. “By selecting diverse genomes for training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs.”
In a related