A "Getting Started" Tutorial
Pathogenwatch provides a platform for comparing pathogen genome assemblies from around the world, integrating diverse data sets with rich representations. To get started, try following the tutorial below, exploring a pre-existing data set. It's also possible to submit your own assemblies - data is kept private unless requested. Instructions for this are provided in the section
While you are free to browse Pathogenwatch anonymously, there is a limit on the number of assemblies you can include within a collection, and you will not be able to upload assemblies for processing. We provide a variety of login methods in the "Hamburger Menu" on the top left, and you will also be prompted to log in if you wish to upload assemblies.
Select Genus / Salmonella and then select Salmonella enterica as shown below.
Now select the Map tab. This will bring up a map-based display of all Salmonella enterica assemblies with geographic data. Use the map selection tool (see image below) to select the European assemblies.
Click the Create Collection button on the top right of the screen, give the collection a title and click Submit.
From here you can explore several aspects of a collection:
Geographic distribution - assemblies are displayed on the map, and a multi-select tool is included, allowing the analysis of their distribution in the tree.
The tree can be manipulated in various ways, including viewing subtrees, changing the layout mode, label and node size. Try increasing the node size to make selected genomes clearer.
Metadata can be displayed next to the leaves (assemblies) of the tree by clicking on the column header of interest. Increasing or shrinking the text size in the tree viewer can help with readability.
User-uploaded metadata is displayed in the main tab.
For species where suitable references are available, a population tree is provided. This shows a tree of the species' references and the number of assemblies that have been clustered with them. Selecting one of these nodes will allow you to view that subset of the assemblies in the same collection view.
Select H12ESR00755-001A and wait for the subtree to load. There are five assemblies from the original collection in this subset and 25 from other sources (i.e. outside of Europe).
Look at the group of assemblies that includes TY5359 (use the search bar to find it in the tree if you don't see it near the top of the page), and select that clade by clicking on the parent node. Currently this clade contains four of the original collection and one from outside of Europe. The new assembly is from Thailand, whereas the two nearest to it are marked as having travelled from Thailand recently to Norway and the Netherlands in the Country metadata field. The next nearest are from the UK and have no clear link.
You can use the reference cluster comparisons to try to find international transmission and other links between your collection and assemblies from other sources.
FASTAs of the assembly, and an accompanying GFF-format file with Pathogenwatch features included.
The tree can be downloaded as a Newick file by right-clicking and selecting the option.
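Once a tree has been saved as a Newick file, it can be inspected locally. The snippet below is a minimal sketch, not part of Pathogenwatch, that lists the leaf labels in a Newick string using only the Python standard library. The function name `newick_leaf_names` and the example tree (with made-up sample names and branch lengths) are assumptions for illustration. It handles unquoted labels only, and for real analyses a dedicated phylogenetics library is likely a better choice.

```python
import re


def newick_leaf_names(newick: str) -> list[str]:
    """Return the leaf labels of a Newick tree, in order of appearance.

    A leaf label is the text that directly follows "(" or ",".
    Labels on internal nodes follow ")" and are therefore skipped.
    Quoted labels are not supported in this sketch.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


# Hypothetical tree: three leaves, one labelled internal node ("n1").
EXAMPLE_TREE = "((sampleA:0.1,sampleB:0.2)n1:0.05,sampleC:0.3);"

if __name__ == "__main__":
    for name in newick_leaf_names(EXAMPLE_TREE):
        print(name)
```

To use it on a downloaded file, read the file contents into a string and pass that string to `newick_leaf_names`.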
Detailed descriptions and guides to each component are provided in the "How To Use Pathogenwatch" section.
Full descriptions of the underlying methods, such as for computing the core distance matrix, are provided in the Technical Descriptions Section.
We also recommend checking the FAQ and contacting us.
First visit, or click the "Genomes" tab at the top. The bar on the left contains filters that allow you to select a subset of assemblies.
In any view, you can click on the genome's name to pop up a about it. From here, you can also access , if it's available for that species. This can be used to identify closely related strains.
This will bring up the of the assemblies you have selected. It will include a core distance-based neighbour-joining tree, embedded in an (top left), the location of the assemblies displayed in the (top right), a at the bottom, and a at the top centre.
Collection successfully created! In the you can select assemblies individually or in groups by clicking on nodes in the tree, points on the map, rows in the table, or by creating a filter in the Filter Bar. Selected assemblies are highlighted on the map and tree (as above) and the metadata filtered to that set.
are provided for each species as applicable. MLST is available for all supported species while species-specific schemes are also provided. These include for Salmonella Typhi (as in this collection) and (Neisseria gonorrhoeae-specific).
predictions from are displayed in the Antibiotics/SNPs/Genes tables accessed via the selector in the corner of the table region. Resistance is given as "Unknown/Intermediate/Resistant -> White/Yellow/Red". The same colours are used to visualise the resistance profiles in the tree and map when selected. Click on different antibiotics and resistance genes to see this in practice.
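The colour scheme described above can be written down as a simple lookup, which may be handy if you want to reproduce the same colouring in your own plots. This is a sketch only: the dictionary, the helper `colour_for`, and the example profile are assumptions for illustration and do not reflect how Pathogenwatch stores or exports its predictions.

```python
from collections import Counter

# Mapping taken from the text: Unknown/Intermediate/Resistant -> White/Yellow/Red.
RESISTANCE_COLOURS = {
    "Unknown": "White",
    "Intermediate": "Yellow",
    "Resistant": "Red",
}


def colour_for(state: str) -> str:
    """Return the display colour for a resistance state named in the text."""
    return RESISTANCE_COLOURS[state]


# Hypothetical per-antibiotic states for a single assembly (labels are made up).
EXAMPLE_PROFILE = {
    "antibiotic_1": "Resistant",
    "antibiotic_2": "Unknown",
    "antibiotic_3": "Intermediate",
    "antibiotic_4": "Resistant",
}

if __name__ == "__main__":
    # Count how many antibiotics would be shown in each colour.
    print(Counter(colour_for(s) for s in EXAMPLE_PROFILE.values()))
```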
Searching for assemblies of interest can be done with The at the top, using the selected metadata column, simply by typing in the value you are looking for.
In the top right corner of both the and the , a Downloads button is available. From here you can download typing, AMR, core similarity and others for local exploration.
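As an example of local exploration, the sketch below finds the closest assemblies to a query in a pairwise matrix. The file layout is an assumption: a CSV whose header row and first column both hold assembly names, with a numeric distance in each cell. The actual Pathogenwatch download may be laid out differently, so check the file before adapting this. The function name `nearest_neighbours` and the sample values are hypothetical.

```python
import csv
import io


def nearest_neighbours(matrix_csv: str, query: str, k: int = 2) -> list[tuple[str, float]]:
    """Return the k assemblies with the smallest distance to `query`.

    Assumes a square CSV matrix: the header row and the first column
    both contain assembly names (an assumed layout, see the text above).
    """
    rows = list(csv.reader(io.StringIO(matrix_csv)))
    names = rows[0][1:]  # column labels, skipping the corner cell
    for row in rows[1:]:
        if row[0] == query:
            pairs = [
                (name, float(value))
                for name, value in zip(names, row[1:])
                if name != query  # ignore the self-comparison
            ]
            return sorted(pairs, key=lambda pair: pair[1])[:k]
    raise KeyError(f"{query!r} not found in matrix")


# Made-up distances between three hypothetical assemblies.
SAMPLE_MATRIX = """name,sampleA,sampleB,sampleC
sampleA,0,3,10
sampleB,3,0,7
sampleC,10,7,0
"""

if __name__ == "__main__":
    print(nearest_neighbours(SAMPLE_MATRIX, "sampleA"))
```

If the downloaded file holds similarity scores rather than distances, the sort order would need to be reversed so that the highest values come first.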
You can upload your own assemblies, with accompanying metadata, for analysis and comparison with the Pathogenwatch public assemblies via the Uploads tab. For a complete explanation and examples see the section.