The Pathogenwatch AMR prediction library and pipeline.
The Pathogenwatch antimicrobial resistance (AMR) prediction module combines bespoke, tested libraries of potentially complex genotypes and their linked phenotypes with software that searches microbial genome assemblies and infers resistance profiles. The results are shown in the AMR tables of the Genome Reports and Collections Views, which highlight the presence or absence of genes and SNPs implicated in antimicrobial resistance, together with their combined effect.
The resistance element databases have been developed in-house for each species. Collections were built by compiling literature searches, personal communications, and publicly available databases such as CARD, ResFinder, and the NCBI. The direct contribution of each gene or mutation is then further verified using available experimental resistance data linked to genomic sequence. A key focus is ensuring that false positives are not generated.
Comparisons with other methods for genome-based resistance prediction, and with experimental data, show good agreement between most methods (results to be published).
- Genes or mutations are grouped into “sets” according to how they combine to confer resistance for each antibiotic.
- Most sets consist of a single gene or mutation, i.e. often only a single mutation is required to confer resistance. A single gene or SNP can belong to more than one set, for instance if it can confer resistance to more than one antibiotic.
- Identified resistance elements are aggregated into their sets and the sets segregated into “complete” and “partial”.
- Complete sets can confer high/R or intermediate/I resistance, while partial sets confer either intermediate resistance or none.
- Resistance can also be inducible/Ind (meaning that treatment with one antibiotic can confer resistance to another antibiotic) depending on the presence of extra elements.
- Other modifier elements may cause a resistant set to become sensitive.
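The grouping described above can be illustrated with a minimal Python sketch. It is not the Pathogenwatch implementation: the class, the function names, the gene and antibiotic names, and the rule of taking the strongest effect per antibiotic are all assumptions made for illustration. Inducible resistance and modifier elements are left out.

```python
from dataclasses import dataclass

# Hypothetical ordering of phenotypes, weakest to strongest.
RANK = {"S": 0, "I": 1, "R": 2}


@dataclass(frozen=True)
class ResistanceSet:
    name: str
    antibiotic: str
    members: frozenset          # genes or mutations that make up the set
    complete_effect: str = "R"  # "R" or "I" when every member is found
    partial_effect: str = "S"   # "I" or "S" (none) when only some are found


def classify_sets(found, sets):
    """Split the sets hit by the found elements into complete and partial."""
    complete, partial = [], []
    for s in sets:
        hits = s.members & found
        if hits == s.members:
            complete.append(s)
        elif hits:
            partial.append(s)
    return complete, partial


def resolve_profile(found, sets):
    """Assumed rule: report the strongest effect seen for each antibiotic."""
    complete, partial = classify_sets(found, sets)
    effects = [(s.antibiotic, s.complete_effect) for s in complete]
    effects += [(s.antibiotic, s.partial_effect) for s in partial]
    profile = {}
    for antibiotic, effect in effects:
        best = profile.get(antibiotic, "S")
        profile[antibiotic] = max(best, effect, key=RANK.__getitem__)
    return profile


# Invented example data: geneA belongs to two sets for different antibiotics.
example_sets = [
    ResistanceSet("set1", "drugX", frozenset({"geneA"})),
    ResistanceSet("set2", "drugY", frozenset({"mutB", "mutC"}), "R", "I"),
    ResistanceSet("set3", "drugZ", frozenset({"geneA", "geneD"}), "I", "S"),
]
found = {"geneA", "mutB"}

complete, partial = classify_sets(found, example_sets)
profile = resolve_profile(found, example_sets)
print([s.name for s in complete], [s.name for s in partial], profile)
```

In this example `set1` is complete because its only member was found, while `set2` and `set3` are partial because each is missing one member.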
Each library consists of representative DNA sequences for genes known to be associated with antibiotic resistance, whether through the presence of the gene itself or of variants within that gene. The sequences are searched for in the query assembly and variants are extracted. These are then profiled against the mechanisms in the database, and the outputs resolved to create the final phenotype profile.
AMR Libraries are available via GitLab at the following link: https://gitlab.com/cgps/pathogenwatch/amr-libraries.
The format and the possible mechanisms are described in full at https://gitlab.com/cgps/pathogenwatch/amr-libraries/blob/master/FORMAT.md.
- The representative sequences are collated into a FASTA file and a BLAST library prepared.
- Query assembly FASTA files are searched against the library using blastn.
- Library sequence coverage and percent identity thresholds are set for each gene individually, as established by the curators.
- If two matches overlap by more than 100 nucleotides, then the one with the higher percent identity to the reference sequence is selected.
- Variants are extracted and translated for protein-encoding sequences.
- Selected matches and variants are then collated into the resistance sets and the sets segregated into “complete” and “partial”.
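The threshold and overlap steps above can be sketched in Python as follows. This is an illustrative sketch only, not the pipeline's code: the `Hit` fields, the threshold values, the gene names, and the greedy strategy (keep hits in order of decreasing identity, dropping any that overlap an already kept hit by more than 100 nucleotides) are assumptions. It also assumes that coverage is the percentage of the library sequence covered by the match and that thresholds are inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str
    contig: str
    start: int       # 1-based inclusive position on the query assembly
    end: int
    identity: float  # percent identity to the library sequence
    coverage: float  # percent of the library sequence covered


# Hypothetical per-gene thresholds: (minimum identity, minimum coverage).
THRESHOLDS = {"geneA": (90.0, 80.0), "geneB": (95.0, 60.0)}


def passes_thresholds(hit, thresholds):
    """Apply the thresholds curated for this hit's gene."""
    min_identity, min_coverage = thresholds[hit.gene]
    return hit.identity >= min_identity and hit.coverage >= min_coverage


def overlap_length(a, b):
    """Number of shared nucleotides; zero if the hits are on different contigs."""
    if a.contig != b.contig:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def select_hits(hits, thresholds, max_overlap=100):
    """Keep the best hit from each group overlapping by more than max_overlap."""
    candidates = [h for h in hits if passes_thresholds(h, thresholds)]
    kept = []
    for hit in sorted(candidates, key=lambda h: h.identity, reverse=True):
        if all(overlap_length(hit, k) <= max_overlap for k in kept):
            kept.append(hit)
    return kept


# Invented example hits.
hits = [
    Hit("geneA", "contig1", 100, 900, 99.0, 100.0),
    Hit("geneB", "contig1", 500, 1200, 96.0, 90.0),  # overlaps the hit above by 401 nt
    Hit("geneB", "contig2", 1, 700, 97.5, 90.0),
    Hit("geneA", "contig2", 651, 1400, 92.0, 95.0),  # overlaps the hit above by 50 nt
    Hit("geneA", "contig3", 1, 800, 85.0, 100.0),    # below the identity threshold
]

selected = select_hits(hits, THRESHOLDS)
print([(h.gene, h.contig) for h in selected])
```

Here the `geneB` hit on `contig1` is discarded in favour of the higher-identity `geneA` hit it overlaps, the two hits on `contig2` are both kept because their overlap is under the limit, and the `contig3` hit never passes its threshold.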
- The Staphylococcus aureus library was developed by Corinna Glasner, Sandra Reuter and Corin Yeats.
- The Neisseria gonorrhoeae library was initially developed by Simon Harris and the EuroGASP consortium. An extensive update was released in April 2020 by Leo Sanchez and Corin Yeats in collaboration with Michelle Cole (Public Health England, UK), Yonatan H. Grad (Harvard TH Chan School of Public Health, Boston, USA), Irene Martin (Public Health Agency of Canada), William M. Shafer (Emory Antibiotic Resistance Center, Atlanta, USA), Gianfranco Spiteri (European Centre for Disease Prevention and Control, Sweden), Katy Town (Centre for Disease Prevention and Control, Atlanta, USA), Magnus Unemo (WHO Collaborating Centre for Gonorrhoea and other STIs, Örebro University, Sweden), and Teodora Wi (World Health Organization, Geneva).
- The Salmonella Typhi library was developed and maintained up to March 2023 by Silvia Argimon and Corin Yeats, while the CPE, ESBL and Colistin libraries were contributed by Sandra Reuter, Sophia David and Silvia Argimon.
- The Streptococcus pneumoniae library was contributed by Rebecca Gladstone and Stephen Bentley. A substantial update was developed and released in April 2020 by Stephanie Lo and Corin Yeats.
- The Vibrio cholerae library was developed by Sina Beier and Corin Yeats, and released in November 2021. It was substantially revised by Avril Coghlan and Corin Yeats in February 2023.
- The Candida auris library was developed by Silvia Argimon, Corin Yeats, and Johanna Rhodes and released in March 2023.