Pathogenwatch AMR

The Pathogenwatch AMR prediction library and pipeline.

The Pathogenwatch AMR prediction module

The Pathogenwatch antimicrobial resistance (AMR) prediction module combines bespoke, tested libraries of potentially complex genotypes and their linked phenotypes with software that searches microbial genome assemblies and infers resistance profiles. The results are shown in the AMR tables of the Genome Reports and Collections Views, which highlight the presence or absence of genes and SNPs implicated in antimicrobial resistance and their combined effect.

The resistance element databases have been developed in-house for each species. Collections were built by compiling literature searches, personal communications, and publicly available databases such as ResFinder and the NCBI. The direct contribution of each gene or mutation is then further verified using available experimental resistance data linked to genomic sequence. A key focus is avoiding false positives.

Comparisons with other methods for genome-based resistance prediction, and with experimental data, show good agreement between most methods.

General Rules

  • Genes or mutations are grouped into “sets” according to how they combine to confer resistance for each antibiotic.

  • Most sets consist of a single gene or mutation, i.e. often only a single mutation is required to confer resistance. A single gene or SNP can belong to more than one set, for instance if it can confer resistance to more than one antibiotic.

  • Identified resistance elements are aggregated into their sets and the sets segregated into “complete” and “partial”.

  • Complete sets can confer high/R or intermediate/I resistance, while partial sets confer either intermediate resistance or none.

  • Resistance can also be inducible/Ind (meaning that treatment with one antibiotic can confer resistance to another antibiotic) depending on the presence of extra elements.

  • Other modifier elements may cause a resistant set to become sensitive.
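The grouping rules above can be illustrated with a minimal Python sketch. It is not the Pathogenwatch implementation and does not follow the library file format: the `ResistanceSet` class, the `classify_sets` function and all element names are hypothetical, and modifier and inducible elements are not modelled. It only shows the idea that a set is "complete" when every member has been found and "partial" when only some have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResistanceSet:
    """Hypothetical resistance set: all members together confer resistance."""
    antibiotic: str
    members: frozenset


def classify_sets(sets, found):
    """Split sets into complete (all members found) and partial (some found)."""
    complete, partial = [], []
    for rs in sets:
        hits = rs.members & found
        if hits == rs.members:
            complete.append(rs)
        elif hits:
            partial.append(rs)
    return complete, partial


# Illustrative element names only; they are not taken from a real library.
sets = [
    ResistanceSet("drugA", frozenset({"geneX"})),
    ResistanceSet("drugB", frozenset({"geneY_S80F", "geneZ_D87N"})),
    ResistanceSet("drugC", frozenset({"geneX"})),
]
complete, partial = classify_sets(sets, found={"geneX", "geneY_S80F"})
print([s.antibiotic for s in complete])
print([s.antibiotic for s in partial])
```

In this toy example the single element geneX completes two sets (one element belonging to more than one set), while the two-member set for drugB is only partial.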

Resistance Search Method

Each library consists of representative DNA sequences for genes known to be associated with antibiotic resistance, whether resistance arises from the presence of the gene itself or from variants within it. The sequences are searched against the query assembly and variants are extracted. These are then profiled against the mechanisms in the database, and the outputs are resolved to create the final phenotype profile.
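As an illustration of what extracting variants from a protein-encoding sequence involves, the sketch below translates a reference and a query coding sequence with the standard genetic code and reports amino-acid substitutions. It is a simplified example under stated assumptions, not the Pathogenwatch code: the function names are hypothetical, and both sequences are assumed to be in frame, of equal length and free of insertions or deletions.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def protein_variants(reference, query):
    """List amino-acid substitutions, e.g. 'S2F' (1-based codon position).

    Assumes both sequences are in frame, equal in length and free of indels.
    """
    ref_aa, qry_aa = translate(reference), translate(query)
    return [
        f"{r}{pos}{q}"
        for pos, (r, q) in enumerate(zip(ref_aa, qry_aa), start=1)
        if r != q
    ]


# Toy sequences: the second codon changes from TCC (S) to TTC (F).
print(protein_variants("ATGTCCGAT", "ATGTTCGAT"))
```

For the toy sequences shown, the single nucleotide change is reported as the substitution S2F.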

AMR libraries are available on GitHub at the following link: https://github.com/pathogenwatch-oss/amr-libraries.

The format and the possible mechanisms are described in full in https://github.com/pathogenwatch-oss/amr-libraries/blob/main/FORMAT.md.

Search Method

  • The representative sequences are collated into a FASTA file and a BLAST library prepared.

  • Query assembly FASTA files are searched against the library using blastn.

  • Library sequence coverage and percent identity thresholds are set for each gene individually, as established by the curators.

  • If two matches overlap by more than 100 nucleotides, then the one with the higher percent identity to the reference sequence is selected.

  • Variants are extracted and translated for protein-encoding sequences.

  • Selected matches and variants are then collated into the resistance sets and the sets segregated into “complete” and “partial”.
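The threshold and overlap rules in the steps above can be sketched as follows. This is an illustration only: the `Match` fields, the function names, the gene names and the threshold values are all hypothetical, and the greedy pass over matches sorted by identity is a simplification chosen for brevity rather than the method used by Pathogenwatch. Matches are assumed to have already been parsed from the blastn output.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """One hypothetical blastn hit of a library sequence on a query contig."""
    gene: str
    contig: str
    start: int        # 1-based, inclusive, on the query contig
    end: int
    identity: float   # percent identity to the library sequence
    coverage: float   # percent of the library sequence covered


def passes_thresholds(match, thresholds):
    """Apply the per-gene (identity, coverage) thresholds set by curators."""
    min_identity, min_coverage = thresholds[match.gene]
    return match.identity >= min_identity and match.coverage >= min_coverage


def overlap(a, b):
    """Number of query nucleotides shared by two matches (0 if none)."""
    if a.contig != b.contig:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def select_matches(matches, thresholds, max_overlap=100):
    """Keep passing matches, preferring higher identity where they overlap."""
    kept = []
    candidates = [m for m in matches if passes_thresholds(m, thresholds)]
    for match in sorted(candidates, key=lambda m: m.identity, reverse=True):
        if all(overlap(match, k) <= max_overlap for k in kept):
            kept.append(match)
    return kept


# Illustrative values only; gene names and thresholds are invented.
thresholds = {
    "geneA": (90.0, 80.0),
    "geneB": (90.0, 80.0),
    "geneC": (95.0, 90.0),
    "geneD": (90.0, 80.0),
}
matches = [
    Match("geneA", "contig1", 100, 900, 99.5, 100.0),
    Match("geneB", "contig1", 150, 950, 97.0, 100.0),  # overlaps geneA by 751 nt
    Match("geneC", "contig2", 10, 500, 92.0, 100.0),   # below geneC identity cutoff
    Match("geneD", "contig2", 600, 1400, 98.0, 95.0),
]
print([m.gene for m in select_matches(matches, thresholds)])
```

Here geneC is rejected by its own identity threshold, and geneB is dropped because it overlaps the higher-identity geneA match by more than 100 nucleotides, leaving geneA and geneD.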

Validation

All the libraries used by Pathogenwatch have been validated against phenotyped strains.

AMR detection is a moving target, and for many bug/drug combinations there is not enough diversity or depth of sampling to provide globally accurate precision values. Hence, we often have to rely on literature and expert review of the included mechanisms rather than a detailed comparison of expected against actual phenotype. We always try to keep false positives low, though for some common resistance genes the phenotype penetrance is only partial due to unknown confounding factors, such as resistance only becoming active after exposure to a different antibiotic (induced resistance). In these cases we have kept the predictions, as they are regarded as significant to public health.

N. gonorrhoeae: "A community-driven resource for genomic epidemiology and antimicrobial resistance prediction of Neisseria gonorrhoeae at Pathogenwatch"

S. Typhi: "A global resource for genomic predictions of antimicrobial resistance and surveillance of Salmonella Typhi at pathogenwatch"

S. aureus: "Whole-Genome Sequencing for Routine Pathogen Surveillance in Public Health: a Population Snapshot of Invasive Staphylococcus aureus in Europe"

Unpublished

S. pneumoniae: This library was extensively tested on AMR phenotyped genomes in collaboration with the Global Pneumococcal Sequencing Survey (GPS).

V. cholerae: Manuscript in preparation

C. auris: Manuscript in preparation.

E. faecium: Comparison of the PW AMR pipeline (assembled genomes) against the original Ariba-based pipeline (short reads). A manuscript on the original pipeline is under review.

Antibiotic                 PW AMR           PW AMR           Ariba            Ariba
                           Sensitivity (%)  Specificity (%)  Sensitivity (%)  Specificity (%)
Ampicillin                 99.86            97.94            99.82            97.94
Ciprofloxacin              98.00            98.84            98.00            98.84
Clindamycin                96.73            62.92            97.06            62.60
Daptomycin                 75.47            91.98            76.42            91.98
Doxycycline                96.91            83.33            96.91            81.48
Erythromycin               93.86            97.87            94.53            97.87
Gentamicin                 95.71            83.83            96.83            82.24
Kanamycin                  98.85            68.30            99.48            66.06
Linezolid                  34.09*           98.94            100.0            98.18
Quinupristin_Dalfopristin  86.94            94.38            89.17            94.51
Streptomycin               95.56            90.02            97.82            88.60
Tetracycline               91.22            91.00            99.52            70.96
Teicoplanin                97.83            94.47            98.99            93.44
Tigecycline                41.67            99.74            38.33            99.74
Vancomycin                 97.74            99.44            98.74            98.80

* Linezolid resistance is often caused by SNPs in the 23S rRNA gene. Since this gene exists as multiple highly similar copies, often only one version is reconstructed during genome assembly, and that version can lack the determinant SNPs. Hence, sensitivity is much lower for short-read genome assemblies.
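For readers less familiar with the two measures, the sketch below shows the conventional definitions: sensitivity is the percentage of phenotypically resistant isolates that are predicted resistant, and specificity is the percentage of phenotypically susceptible isolates that are predicted susceptible. The function name and the example calls are hypothetical, and the source does not state how intermediate phenotypes were handled in the comparison, so the example treats every call as simply resistant or not.

```python
def sensitivity_specificity(predicted, observed):
    """Return (sensitivity, specificity) as percentages.

    `predicted` and `observed` are equal-length lists of booleans, where
    True means resistant. Observed values come from phenotypic testing.
    Assumes at least one resistant and one susceptible observed isolate.
    """
    pairs = list(zip(predicted, observed))
    tp = sum(p and o for p, o in pairs)          # resistant, predicted resistant
    fn = sum(not p and o for p, o in pairs)      # resistant, predicted susceptible
    tn = sum(not p and not o for p, o in pairs)  # susceptible, predicted susceptible
    fp = sum(p and not o for p, o in pairs)      # susceptible, predicted resistant
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Invented calls for eight isolates: 3 TP, 1 FN, 3 TN, 1 FP.
predicted = [True, True, True, False, False, False, False, True]
observed = [True, True, True, True, False, False, False, False]
print(sensitivity_specificity(predicted, observed))
```

With these invented calls, three of four resistant isolates and three of four susceptible isolates are predicted correctly, giving 75% for both measures.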

Acknowledgements

  • The Staphylococcus aureus library was developed by Corinna Glasner, Sandra Reuter and Corin Yeats.

  • The Neisseria gonorrhoeae library was initially developed by Simon Harris and the EuroGASP consortium. An extensive update was released in April 2020 by Leo Sanchez and Corin Yeats in collaboration with Michelle Cole (Public Health England, UK), Yonatan H. Grad (Harvard TH Chan School of Public Health, Boston, USA), Irene Martin (Public Health Agency of Canada), William M. Shafer (Emory Antibiotic Resistance Center, Atlanta, USA), Gianfranco Spiteri (European Centre for Disease Prevention and Control, Sweden), Katy Town (Centre for Disease Prevention and Control, Atlanta, USA), Magnus Unemo (WHO Collaborating Centre for Gonorrhoea and other STIs, Örebro University, Sweden), Teodora Wi (World Health Organization, Geneva).

  • The Salmonella Typhi library was developed and maintained up to March 2023 by Silvia Argimon and Corin Yeats, while the CPE, ESBL and Colistin libraries were contributed by Sandra Reuter, Sophia David and Silvia Argimon.

  • The Streptococcus pneumoniae library was contributed by Rebecca Gladstone and Stephen Bentley. A substantial update was developed and released in April 2020 by Stephanie Lo and Corin Yeats.

  • The Vibrio cholerae library was developed by Sina Beier and Corin Yeats, and released in November 2021. It was substantially revised by Avril Coghlan and Corin Yeats in February 2023.

  • The Candida auris library was developed by Silvia Argimon, Corin Yeats, and Johanna Rhodes and released in March 2023.

  • The Enterococcus faecium library was developed by Francesc Coll I Cerezo and Theodore Gouliouris (Coll et al, 2024) for use with the Ariba software (pipeline published on GitHub). It was adapted for Pathogenwatch by Corin Yeats in collaboration with Francesc Coll I Cerezo in August 2023.
