SeroBA
About
These results should not be used for clinical purposes or to inform vaccine programmes. Because the result is inferred from the DNA sequence rather than from a Quellung reaction (the gold standard for serotyping), it may in some cases not match the phenotypic result. However, the SeroBA-based methodology used by Pathogenwatch has been shown to have a sensitivity of 0.98 and a specificity of 1 (Epping et al 2018).
SeroBA predicts a phenotype directly from short-read data. Pathogenwatch instead starts from assemblies, from which reads are simulated as input for SeroBA. Because of this small difference in methodology, 0.14% (28/20049) of results differ between those obtained directly from reads and those obtained from assemblies. These mismatches are reported below, and any result that may be affected is flagged with a 'Guidance' link.
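The overall figure quoted above can be reproduced with a few lines of arithmetic. The snippet below is only an illustrative check: the variable names are hypothetical and are not part of SeroBA or Pathogenwatch, and the per-row counts are copied from the "No. Mismatches" column of the table that follows.

```python
# Mismatch counts per row, copied from the "No. Mismatches" column of the table.
row_mismatches = [9, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

# Total number of isolates compared, as quoted in the text (28/20049).
total_isolates = 20049

# The per-row counts should add up to the 28 mismatches quoted in the text.
total_mismatches = sum(row_mismatches)

# Overall mismatch rate as a percentage, rounded to two decimal places.
mismatch_rate = round(100 * total_mismatches / total_isolates, 2)

print(total_mismatches)      # 28
print(f"{mismatch_rate}%")   # 0.14%
```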
| Pathogenwatch Serotype | SeroBA Serotype | No. Mismatches (%) [a] | BLAST cps loci Nucleotide Similarity | BLAST cps loci Nucleotide Coverage | Distinguishing Genetic Features [b] |
| --- | --- | --- | --- | --- | --- |
| untypable [c] | 19A | 9 (0.6) | - | - | - |
| 32F | 32A | 3 (100) [d] | 99 | 99 | 5 bp gap at the intergenic region between wcrN and the HG272/3 pseudogene |
| 32F | untypable | 2 (NA) | - | - | - |
| 33A/33F | 33F | 2 (1.1) | 99.9 | 92.0 | Frameshift mutation insT 433 in 33F wcjE gene |
| possible 6A | 6A | 2 (0.2) | - | - | - |
| 11E | 11A | 1 (0.2) | - [e] | - | Disruption in wcjE |
| 19F | untypable | 1 (NA) | - | - | - |
| 32A | 32F | 1 (14) | 99 | 99 | 5 bp gap at the intergenic region between wcrN and the HG272/3 pseudogene |
| 32A | untypable | 1 (NA) | - | - | - |
| 35A | 35C | 1 (6.7) | 98.9 | 90 | Frameshift mutation insA 248 in wcrK encodes for a GT; consistent with differences in 35A wcrK |
| 6A | possible 6E | 1 (NA) | - | - | - |
| possible 6C | 6B | 1 (0.09) | 99 | 92 | wciNα in 6B / wciNβ in 6C |
| possible 6D | 6C | 1 (0.25) | 98.6 | 84 | A > G 583 in wciP |
| possible 6E | 6B | 1 (0.09) | - | - | - |
| untypable | 23F | 1 (0.08) | - | - | - |
[a] The percentage is the number of isolates whose serotype differed between Pathogenwatch and SeroBA, divided by the total number of isolates that SeroBA assigned to the serotype shown in the same row.
[b] Information extracted from Kapatai et al 2016.
[c] See https://github.com/sanger-pathogens/seroba#troubleshooting. The samples tested here passed QC, so the untypable results are likely due to low coverage of the cps region.
[d] Only serological analyses can reliably differentiate serotypes 32A and 32F. In silico serotyping within serogroup 32 could still be improved; it is limited by the small number of isolates available for analysis (Kapatai et al 2016).
[e] Complete cps locus sequences for 6E and 11E are not available for comparison.
NA = not available
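The calculation described in footnote [a] amounts to a simple ratio. The sketch below is illustrative only: `mismatch_percentage` is a hypothetical helper, not a SeroBA or Pathogenwatch function, and returning `None` when no SeroBA total is available is an assumption about how the NA entries arise.

```python
from typing import Optional


def mismatch_percentage(mismatched: int, seroba_total: Optional[int]) -> Optional[float]:
    """Percentage of mismatching isolates for one table row.

    mismatched: isolates whose Pathogenwatch and SeroBA serotypes differ.
    seroba_total: all isolates SeroBA assigned to the serotype in that row.
    Returns None (shown as NA) when the total is missing or zero (assumed behaviour).
    """
    if not seroba_total:
        return None
    return 100 * mismatched / seroba_total


# The 32F / 32A row reports "3 (100)": a rate of 100% with 3 mismatches
# implies that SeroBA typed 3 isolates as 32A in total.
print(mismatch_percentage(3, 3))     # 100.0

# A row with no usable total is reported as NA.
print(mismatch_percentage(2, None))  # None
```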
How to cite
Epping L, van Tonder AJ, Gladstone RA, et al. SeroBA: rapid high-throughput serotyping of Streptococcus pneumoniae from whole genome sequence data [published correction appears in Microb Genom. 2018 Aug;4(8). doi: 10.1099/mgen.0.000204]. Microb Genom. 2018;4(7):e000186. doi:10.1099/mgen.0.000186
Last updated