▶️A "Getting Started" Tutorial
Pathogenwatch provides a platform for comparing pathogen genome assemblies from around the world, integrating diverse data sets with rich representations. To get started, try following the tutorial below, which explores a pre-existing data set. It's also possible to submit your own assemblies; your data is kept private unless you request otherwise. Instructions for this are provided in the Uploading New Genomes section.
While you are free to browse Pathogenwatch anonymously, there is a limit on the number of assemblies you can include within a collection, and you will not be able to upload assemblies for processing. We provide a variety of login methods in the "Hamburger Menu" on the top left, and you will also be prompted to log in if you wish to upload assemblies.
Your account will be created the first time you log in. That account is linked to that login method.
First visit https://pathogen.watch/genomes/all, or click the "Genomes" tab at the top. The bar on the left contains filters that allow you to select a subset of assemblies.
In any view, you can click on the genome's name to pop up a detailed report about it. From here, you can also access cgMLST-based context search, if it's available for that species. This can be used to identify closely related strains.
Select Genus / Salmonella and then select Salmonella enterica as shown below.
The "Genomes" page lists all the publicly available genomes in Pathogenwatch and provides tools for browsing, filtering and selecting them for further investigation.
Now select the Map tab. This will bring up a map-based display of all Salmonella enterica assemblies with geographic data. Use the map selection tool (see image below) to select the European assemblies.
You can view basic statistics about the assemblies, such as N50 or GC ratio, using the Stats tab.
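For readers unfamiliar with these statistics, the sketch below shows one common way to compute N50 and GC ratio from a set of contig sequences. It is an illustration only: the function names are hypothetical, and Pathogenwatch's own calculation may differ in details such as how ambiguous bases are treated.

```python
def n50(contig_lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def gc_fraction(sequences):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C.
    Assumption: ambiguous bases such as N are ignored."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    return gc / acgt if acgt else 0.0


# Toy contigs, not real data.
contigs = ["ATGCGC", "ATAT", "GG"]
print(n50([len(c) for c in contigs]))
print(gc_fraction(contigs))
```

A higher N50 generally indicates a less fragmented assembly, which is why it is a useful first check before building a collection.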
Click the Create Collection button on the top right of the screen, give the collection a title and click Submit.
Collections are the easiest way of sharing data with colleagues and the general public.
This will bring up the Interactive Collection View of the assemblies you have selected. It includes a core distance-based neighbour-joining tree embedded in a PhyloCanvas tree browser (top left), the locations of the assemblies displayed in the map viewer (top right), the metadata in a table at the bottom, and a Filter Bar at the top centre.
Collection successfully created! In the Collection View you can select assemblies individually or in groups by clicking on nodes in the tree, points on the map, rows in the table, or by creating a filter in the Filter Bar. Selected assemblies are highlighted on the map and tree (as above) and the metadata is filtered to that set.
The collection URL is permanent, though the title text is optional. You can bookmark it and return later. It will also be saved into your account if you are logged in.
From here you can explore several aspects of a collection:
Geographic distribution - assemblies are displayed on the map, and a multi-select tool is included, allowing the analysis of their distribution in the tree.
The tree can be manipulated in various ways, including viewing subtrees, changing the layout mode, label and node size. Try increasing the node size to make selected genomes clearer.
Metadata can be selected for display next to the leaves (assemblies) of the tree by clicking on the column header of interest. Increasing or shrinking the text size in the tree viewer can help with readability.
User-uploaded metadata is displayed in the main tab.
Typing Assignments are provided for each species as applicable. MLST is available for all supported species, and species-specific schemes are also provided. These include Genotyphi for Salmonella Typhi (as in this collection) and NG-MAST (Neisseria gonorrhoeae-specific).
Antimicrobial Resistance (AMR) predictions are displayed in the Antibiotics/SNPs/Genes tables, accessed via the selector in the corner of the table region. Resistance is given as "Unknown/Intermediate/Resistant -> White/Yellow/Red". The same colours are used to visualise the resistance profiles in the tree and map when selected. Click on different antibiotics and resistance genes to see this in practice.
Searching for assemblies of interest can be done with the Filter Bar at the top: select a metadata column, then type the value you are looking for.
Try selecting "Genotyphi Genotype" in the Typing table and typing '4.3.1'. The map view will then be restricted to members of that genotype, and they will be highlighted on the tree.
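The same kind of filter can be reproduced offline on a downloaded typing table. The sketch below is a minimal illustration, not a description of how the Filter Bar is implemented: the column headers, the sample names and the case-insensitive substring matching rule are all assumptions made for the example.

```python
import csv
import io


def filter_rows(rows, column, query):
    """Keep rows whose value in `column` contains `query`.
    Assumption: matching is a case-insensitive substring test."""
    needle = query.strip().lower()
    return [row for row in rows if needle in (row.get(column) or "").lower()]


# Hypothetical excerpt of a typing table; headers and names are made up.
typing_csv = """Genome Name,Genotyphi Genotype
sample_a,4.3.1
sample_b,2.0.1
sample_c,4.3.1.1
"""

rows = list(csv.DictReader(io.StringIO(typing_csv)))
matches = filter_rows(rows, "Genotyphi Genotype", "4.3.1")
print([row["Genome Name"] for row in matches])
```

Under the substring assumption, a query of '4.3.1' also matches sub-lineages such as '4.3.1.1'; switch to an equality test if you need exact matches only.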
For species where suitable references are available, a population tree is provided. This shows a tree of the species' references and the number of assemblies that have been clustered with them. Selecting one of these nodes will allow you to view that subset of the assemblies in the same collection view.
The first number next to the reference strain's name is the number of assemblies in the collection that cluster with it. The second is the number of publicly available assemblies that cluster with it.
Select H12ESR00755-001A and wait for the subtree to load. There are five assemblies from the original collection in this subset and 25 from other sources (i.e. outside of Europe).
Look at the group of assemblies that includes TY5359 (use the search bar to find it in the tree if you don't see it near the top of the page), and select that clade by clicking on the parent node. Currently this clade contains four of the original collection and one from outside of Europe. The new assembly is from Thailand, whereas the two nearest to it are marked in the Country metadata field as having travelled recently from Thailand to Norway and the Netherlands. The next nearest are from the UK and have no clear link.
You can use the reference cluster comparisons to try to find international transmission and other links between your collection and assemblies from other sources.
In the top right corner of both the Genome Browser and the Collection Views, a Downloads button is available. From here you can download typing, AMR, core similarity and other results for local exploration.
FASTAs of the assemblies are also available, along with an accompanying GFF-format file that includes Pathogenwatch features.
The tree can be downloaded as a Newick file by right-clicking and selecting the option.
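Once downloaded, a Newick file can be inspected with a few lines of code. The sketch below lists the leaf (assembly) names in a Newick string using only the Python standard library. It is a simplified illustration that assumes unquoted labels; the tree string and names are hypothetical, and a dedicated phylogenetics library is a better choice for real analyses.

```python
import re


def leaf_names(newick):
    """Return leaf labels from a Newick string, in order of appearance.
    A leaf label directly follows '(' or ','; internal node labels
    follow ')' and are therefore skipped.
    Assumption: labels are unquoted and contain no '(),:;' characters."""
    return [name.strip() for name in re.findall(r"[(,]\s*([^(),:;]+)", newick)]


# Hypothetical tree; in practice, read the text of the downloaded file.
tree = "((genome_a:0.1,genome_b:0.2):0.05,genome_c:0.3);"
print(leaf_names(tree))
```

The resulting list can be matched against the names in the downloaded typing or AMR tables to join tree positions with metadata.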
You can upload your own assemblies, with accompanying metadata, for analysis and comparison with the Pathogenwatch public assemblies via the Uploads tab. For a complete explanation and examples see the Uploading New Genomes section.
Detailed descriptions and guides to each component are provided in the "How To Use Pathogenwatch" section.
Full descriptions of the underlying methods, such as for computing the core distance matrix, are provided in the Technical Descriptions Section.
We also recommend checking the FAQ and contacting us.
Email: pathogenwatch@cgps.group