Genomic Context Search
Using the cgMLST clustering facility to identify related strains
For species with an available cgMLST scheme, we provide a single-linkage clustering service for finding strains within a specified allele-distance threshold of a query genome. The cgMLST search tool can be found in the pop-up Genome Report. For technical details about how cgMLST allele and ST codes are assigned, follow the links to the cgMLST and clustering documentation.
Click on the genome name in any view to open the Genome Report. If a cgMLST scheme is available for the species, the clustering panel will be visible in the report.
The first time a genome is uploaded, it is often necessary to run the clustering by clicking the "Cluster Now" button. This can take a few minutes, depending on server usage and the size of the data set involved. If you are looking at an older record and have uploaded new genomes since you last ran the clustering, you can add them by pressing the "Re-cluster" button.
The dark purple node includes the query sequence. The lighter purple nodes are directly connected to the query. Grey nodes are linked to the query via at least one other node.
The resulting network shows genomes linked to the query genome at no more than the selected distance threshold (the number of differing alleles). The threshold can be adjusted using the slider under the chart showing the number of genomes at each value. Each node in the network represents one or more genomes that either connect directly to the query genome at a distance less than or equal to the threshold, or can be linked to it via a chain of connected genomes.
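The idea of single-linkage clustering around a query can be illustrated with a small Python sketch. This is not the service's actual implementation: the function names, the toy profiles, and the handling of the threshold as inclusive are assumptions made for illustration, and missing loci are ignored for simplicity. Each genome is represented as a list of allele codes, the distance between two genomes is the number of loci at which they differ, and a breadth-first search collects every genome reachable from the query through links at or below the threshold.

```python
from collections import deque


def allele_distance(a, b):
    """Count the loci at which two allele profiles differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def cluster_around(query, profiles, threshold):
    """Return {genome: number of links from the query} for the cluster.

    A genome joins the cluster if it is within `threshold` differences of
    any genome already in the cluster (single linkage).
    """
    hops = {query: 0}
    queue = deque([query])
    while queue:
        current = queue.popleft()
        for name, profile in profiles.items():
            if name in hops:
                continue
            if allele_distance(profiles[current], profile) <= threshold:
                hops[name] = hops[current] + 1
                queue.append(name)
    return hops


# Hypothetical five-locus profiles, for illustration only.
profiles = {
    "query": [1, 1, 1, 1, 1],
    "A": [1, 1, 1, 1, 2],  # 1 difference from the query
    "B": [1, 1, 1, 3, 2],  # 2 differences from the query, 1 from A
    "C": [4, 4, 4, 4, 4],  # 5 differences from every other genome
}

hops = cluster_around("query", profiles, threshold=1)
for name, n in hops.items():
    kind = "query" if n == 0 else "direct" if n == 1 else "indirect"
    print(name, kind)
```

In this toy example, a threshold of 1 links A directly to the query and B only through A, which corresponds to the lighter purple and grey nodes described above. Raising the threshold to 2 makes B a direct neighbour, and C stays outside the cluster at either value.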
This network can be opened in a collection view, where the related genomes can be investigated ("View Cluster"). Alternatively, the list of linked genomes can be opened in the Genome List view, where detailed analysis downloads and tree-based collections are available ("List Genomes"). Note that you will need to click the "Clear Filters" button to reset the Genome List view afterwards.