🎉 Announcements

News about upcoming changes or downtime.

20-21st January 2024

cgMLST profiles and clustering update

This weekend we will be updating the cgMLST profiles and cgMLST-based clustering for all Pathogenwatch genomes. This update brings our cgMLST assignment in line with the community standard set by PubMLST and should lead to improvements in the clustering. The site is expected to remain available at all times, but the processing queue may be closed for most of the weekend. We apologise in advance for any inconvenience.

3rd November

992 new Vibrio cholerae genomes from Vibriowatch

We are pleased to announce that, as part of the Vibriowatch project, we have added 992 more genomes to the public collection, making a total of 5,671 assembled genomes. We have also added a further 15 collections and updated one. Thanks to Avril Coghlan for manually curating and collating everything.

22nd September

cgMLST profiles and clustering update

This weekend we will be updating the cgMLST profiles and cgMLST-based clustering for all Pathogenwatch genomes. This update brings our cgMLST assignment in line with the community standard set by PubMLST and should lead to improvements in the clustering. The site is expected to remain available at all times, but the processing queue may be closed for most of the weekend. We apologise in advance for any inconvenience.

22nd September

1662 new public Candida auris genomes.

The C. auris public genome collection has more than doubled in size to a total of 2,680 assembled genomes. New genomes with Illumina paired-end reads and either location or time metadata were downloaded from the ENA on the 21st of June and assembled at the CGPS.

All C. auris collections have been updated to add the new public genomes to the subtrees.

19th September

Updates to Vibrio cholerae.

As part of the Vibriowatch project, we have released four new Vibrio cholerae collections (Alam et al (2022), Angemeyer et al (2022), Irenge et al (2020) and Wang et al (2020)), along with 57 newly public genomes.

20th June

Updates to Klebsiella species

Firstly, we identified around 200 genomes which were erroneously included in the update on the 26th April, and should have been removed during the QC step. These genomes have been removed from the public collection, but will still be visible in collections that have already been created.

Secondly, we have assembled and made available another 800 genomes that were published by the ENA from April 2022 to March 2023, along with linked metadata including sample date and location.

As a result, there are now 12 Klebsiella africana, 1306 Klebsiella quasipneumoniae, 30 Klebsiella quasivariicola, and 1270 Klebsiella variicola assemblies available on Pathogenwatch.

9th June

Disruption due to update

We expect to deploy an update to the Candida auris core scheme on either Friday 4-5pm or Monday 4-5pm. This may cause some brief disruption while C. auris collections are rebuilt, and may mean some of these will not display correctly until this has been completed.

18th May

271 new public Salmonella Typhi genomes from South Africa

Thanks to our KlebNET collaboration, we have a new collection of 271 Salmonella Typhi assemblies with detailed curated annotation provided by the National Institute for Communicable Diseases of South Africa (NICD).

Visit the new collection via this link: https://pathogen.watch/collection/iub5by3x15ba-south-africa-nicd-typhi.

10th May

Queues will be closed for 2 hours from 3pm (UK)

We have a major update to the genome clustering process today that will necessitate closing the processing queues for up to 2 hours. The website should remain available throughout this time, though accessing clusters from genome reports may not be possible.

9th May

Downloading the public data is now easier

We've made it simpler to access complete downloads of the public data sets, including FASTAs, metadata and computed annotations. For more information, please visit the documentation pages "Public data downloads".

3rd May

N. gonorrhoeae annotation fix

We identified 32 public genomes that didn't have their name fields set. These are now fixed and should correctly appear in searches and collections.

26th April

More Klebsiella genomes added from the Kpn-complex

We have added 1,025 new Klebsiella genomes from the Kpn-complex to Pathogenwatch. The new additions include 578 K. variicola, 435 K. quasipneumoniae, 8 K. quasivariicola, and 4 K. africana.

21st April

Two new collections of Salmonella Typhi added

Thanks to our collaborators in TyphiNET we have 749 new genomes with detailed annotation based on two recent surveillance studies. Follow the links below to view them.

"Genomic epidemiology and antimicrobial resistance transmission of Salmonella Typhi and Paratyphi A at three urban sites in Africa and Asia", Dyson et al 2023.

"The rapid emergence of Salmonella Typhi with decreased ciprofloxacin susceptibility following an increase in ciprofloxacin prescriptions in Blantyre, Malawi", Ashton et al 2023

7th March

Phylogenetic trees, population search and AMR released for Candida auris

We are pleased to announce official support for C. auris as the first fungal pathogen in Pathogenwatch. In collaboration with Matthew Fisher and Johanna Rhodes, we have created an in-house core scheme, allowing the generation of phylogenetic trees, along with a resistance genotyping scheme. We have also included 1018 public genomes with sample date and location metadata, as well as the five complete reference genomes. These genomes will be included in the population subtrees in C. auris collections.

10th February

4,738 V. cholerae genomes now available

The Vibriowatch consortium have added a further 4,272 manually selected and annotated Vibrio cholerae assembled genomes to the public collection. Thanks especially to Avril Coghlan from Nick Thompson's team.

8th February

Latitude-longitude coordinates corrected for public N. gonorrhoeae

We have corrected a large number of inferred coordinates in the new set of N. gonorrhoeae public genomes (released 23rd November). In all cases the location reported in the metadata was correct. However, because the location descriptions vary so widely in form, the geocoder we use to infer the lat-long coordinates was severely confused. The locations have been manually reviewed and fixed, and we now expect full agreement between the stated country and inferred coordinates.

31st January

7089 Salmonella Typhi genomes and 29 collections added to public dataset

This update with 7,089 genomes means that Pathogenwatch now holds the largest collection of curated S. Typhi genome assemblies to date (11,998 assembled genomes in all). Assemblies and linked metadata were provided by the Global Typhoid Genomics Consortium. Metadata provided includes date, country of isolation, country of origin inferred from travel information, isolation source (e.g. blood, sputum), and purpose of sampling (targeted or non-targeted), along with links to related records in the ENA. For more information please see https://www.medrxiv.org/content/10.1101/2022.12.28.22283969v.

30th January

Salmonella Typhi update news

We are expecting to remove subtrees and update the public Salmonella Typhi genome database tomorrow.

12th January

Salmonella Typhi subtrees to be removed

In order to allow the expansion of the public Typhi genome database we have to remove the "subtrees" generated for Typhi collections. Instead, we recommend using the clustering tool to identify genomic neighbours in the public and your uploaded genomes. We expect to remove this functionality early next week.

11th January

Campylobacter clusters not yet available.

We are having an issue with getting the Campylobacter clustering to run properly since the recent cgMLST update. We hope to have this resolved soon as no core changes have been made since the last time.

Note: This issue was resolved as of ~5pm UK time.

9th January 2023

Uploads temporarily suspended.

Unfortunately a release failed on Friday evening and we have to rerun it this morning. It will take a few hours as there are a lot of results to process, during which uploads, tree building and clustering will be suspended. We apologise for any inconvenience and any issues you may have experienced since Friday.

Announcements from 2022 below.

20th December

V. cholerae public genomes updated

We have deployed the first slice, 449 genomes, of a curated collection of Vibrio cholerae assemblies from the Vibriowatch project. As part of this update we have removed the old public genome set, most of which should be replaced as part of future updates. We have also added two new public collections based on the papers in which these genomes are described.

12th December

New S. aureus genomes available

We have added 36,141 new genomes to the S. aureus public data set. This has necessitated removing subtrees from the S. aureus collections (see below). If you need to access one of these trees from an old collection, please contact us as all data is still present in the database. All genomes are available for search via the cgMLST profile clustering.

S. aureus removal of duplicate genomes

While preparing an update of over 35,000 new genomes we identified 146 duplicate samples in the current public data set. None of the duplicates were included in any public collections, so they have simply been removed from the public data. The records still exist, so any private collections that contain them will still load correctly.

Notification of removal of support for S. aureus subtrees

In order to enable the expansion of the public genome database, unfortunately we will have to disable the generation of reference subtrees in S. aureus collections. Instead, in order to search the genomic neighbourhood of a query genome and create a collection, you can use the cluster search method combined with the "List Genomes" button in the Genome Reports.

6th December

Shigella sonnei/E. coli species assignment correction

We've released a new version of Speciator that correctly differentiates Shigella sonnei from E. coli. The previous version included mislabelled reference genomes, causing a disparate set of E. coli genomes to be identified as S. sonnei.

We are in the process of updating all user genomes to the correct species. We expect this to be complete by the 7th December. The update should have zero impact on any other assignments. Please note, as of the 2nd December update, all the public E. coli, S. sonnei and S. flexneri are correctly identified.

[Images: species assignments before and after the correction]

2nd December

New Shigella sonnei

The 556 previous S. sonnei records have been removed and replaced with 13,211 new genomes with location and sample date, sourced from the ENA. 424 of the previous records are included in the new set, while the rest have been removed due to lack of quality control assessment or species assignment.

E. coli/S. sonnei speciation issue

We are aware that a small percentage (<3%) of E. coli genomes are incorrectly assigned as S. sonnei genomes. We have a new version of Speciator that fixes this in final testing and expect to deploy it next week. Previously incorrectly identified genomes will also be corrected and updated.

28th November

New Shigella flexneri genomes available

The previously unrepresented species S. flexneri now has 9,435 genomes available in the public dataset. These can be searched using the E. coli cgMLST scheme clusters.

New Streptococcus pneumoniae genomes available

14,285 new S. pneumoniae samples have been added to the public data set. We identified 3,671 duplicate samples within the public data, which have been removed. We also removed 701 records that we were unable to verify met our quality control standards. We also replaced the representative assemblies for 1,332 of the previous public data set. In total, there are now 35,604 S. pneumoniae samples represented.

23rd November

New Helicobacter pylori genomes available

There are now 1,038 H. pylori genomes in the public data set - a species that was previously completely unrepresented.

New Enterobacter genomes available

Pathogenwatch now contains 5,699 Enterobacter genomes. 5,667 new genomes were added, replacing 306 of the previous genomes with improved assemblies, while 6 were removed for failing to pass QC.

New Neisseria gonorrhoeae genomes available

The latest data update contains 23,063 new N. gonorrhoeae genomes, bringing the total to 38,367.

17th November

New Campylobacter genomes available

Today we have added 21,379 C. coli genomes, and 51,187 C. jejuni genomes. The former's previous 47 public genomes, and the latter's 330 have been removed from the public database. All bar 12 C. jejuni and 2 C. coli genomes are represented in the new public data set. These 14 failed our updated QC thresholds.

14th November

New Haemophilus influenzae genomes available.

The latest update contains 3715 new H. influenzae genomes. The 32 previous genomes were removed.

New Enterococcus faecium genomes available

The latest update contains 15,553 new E. faecium genomes sourced from the ENA. Previously no genomes were available to search via cgMLST.

New Pseudomonas aeruginosa genomes available

As part of the ongoing update to the Pathogenwatch public database, we have assembled and added 13,400 new P. aeruginosa genomes from the ENA. Previously there was one genome, which has been removed.

11th November

New Acinetobacter baumannii genomes available

We have assembled 9,180 A. baumannii genomes, sourced from the ENA, with sample location and/or date attributes, and other metadata. These have been included in the public data set, and can be searched using the cgMLST clustering. The previous 20 have been removed since it was not possible to identify which run they were built from. They are still represented in the new data set.

New Klebsiella pneumoniae genomes available

We have also assembled an extra 16,556 K. pneumoniae genomes, to make it over 32,000 in the public data set. These have been included in the public data set, and can be searched using the cgMLST clustering. Thanks also to our KlebNet collaborators for their curation work.

14th October

Database slowdown during update

We are planning to deploy an update today that may have some impact on the database performance, and specifically browsing the complete genome list, while it is ongoing. It should take less than an hour, and won't affect processing genomes or viewing collections.

3rd October

Release complete

Apologies for the disruption on the 29th-31st. This was due to two reasons:

  1. The update process was much slower than anticipated.

  2. The initial fix for cgMLST clustering speeds caused an issue with other complex queries to the database, primarily breaking the Genome List view.

Everything should now be working at least as well as before, and cgMLST clustering is a lot quicker and more robust. Please let us know if there's anything we've missed.

29th September

Release day!

The update scheduled for today is a bit ahead of schedule and is going to begin this morning (UK time). The update should be complete by the end of the day, and we expect the site to remain browsable, except for a couple of minutes. Please watch the announcements page and release notes for more information.

27th September

Downtime for update expected 29th September

We are currently preparing an update of all genomes to the latest MLST and cgMLST assignments. We expect the update to be ready for this Thursday afternoon (UK time). While we don't expect the website to be inaccessible for more than a minute or two, it is likely that updating the genomes will take a couple of hours and no user tasks will be run during this time - so genome uploads, tree building and clustering will be unavailable.

This update will also include a potential fix for the performance issues currently being observed for cgMLST clustering. If there are unexpected issues due to this change, there may also be some more brief disruption as it is reverted.

13th September

Server downtime scheduled for 5pm (UK)

Unfortunately we're still having some issues affecting genome clustering, and we haven't yet identified the root cause. As a result the service will be going down for an estimated 30 minutes at 5pm UK time today. We apologise in advance for any inconvenience.

7th September

Ongoing task processing issues

For an unknown reason, clustering tasks are freezing and causing the task queues to freeze as well. This does not appear to be related to any changes we have made, and could be the result of a 3rd party service provider, but the investigation is ongoing. We apologise for any inconvenience during this time.

26th July

Reduced support in August

Due to staff absences for August holidays, there is likely to be reduced support for answering questions and fixing problems. The site itself is pretty robust and "self-healing" so we don't anticipate any downtime during this period. Please do still get in touch if you find an issue or want to know something; it may just take a few days before there is a response.

15th July

Public Salmonella Typhi metadata updated

The metadata for the public collection of S. Typhi has been cleaned up and extended with extra fields from the ENA.

17th June

Klebsiella quasipneumoniae and K. variicola trees added

With the latest update, collections of K. quasipneumoniae and K. variicola will have a neighbour-joining tree built using a core genome developed in collaboration with KlebNet. All current collections of those species have been updated.

11th May

EuroGASP 2018 Neisseria gonorrhoeae genomes added

In conjunction with the release of "Europe-wide expansion and eradication of multidrug-resistant Neisseria gonorrhoeae lineages: a genomic surveillance study" by Leonor Sánchez-Busó et al. (The Lancet 2022), the 2,375 genomes of the EuroGASP 2018 structural survey have been added to the Pathogenwatch public genomes. You can also view them as an individual collection here: https://pathogen.watch/collection/eurogasp2018. The previous EuroGASP 2013 study can be found here: https://pathogen.watch/collection/eurogasp2013. Congratulations to all involved on an important piece of work.

5th May

Website issue

Resolved: We have noticed that switching to the metadata tab in some collections is causing the collection view to crash. We hope to have this fixed shortly.

28th April

New feature - DOI references

You can now assign literature references to genomes and collections using identifiers from the Digital Object Identifier (DOI) system, on top of the previous support for PubMed identifiers. For more information, see the documentation on genome uploads and creating collections.

21st April

Processing Delays

Yesterday we had the convergence of a lot of uploads combined with a configuration error that prevented the scaling up of our compute systems. Apologies for any delays and issues with running tasks. Everything should now be processed.

13th-20th April

Reduced support

Due to holidays in the UK, there will be minimal engineering support for Pathogenwatch. While we don't expect any issues, and the system is pretty good at recovering itself, please be aware that in the event of downtime there may be some delay in bringing it back.

5th-6th April

Upload and processing delays

We are preparing a large data set for future inclusion in Pathogenwatch. Due to time constraints this is likely to have some impact on the speed of processing external tasks over the next 24 hours. We apologise for any disruption and hope to keep it to a minimum.

30th March

About yesterday's downtime and upgrade

Apologies for the extended period that Pathogenwatch was unavailable. There were a couple of mistakes made with regard to the robustness of the upgrade process, which we will be correcting for future infrastructure updates. However, the major issue was beyond our control: the developer doing the upgrade (me) lost power and internet (including mobile) for 5 hours, along with a significant chunk of east London, at a critical moment. We hope no one's work was too disrupted.

The good news is the database server has been significantly upgraded and the genome list view should now load considerably more reliably.
