SARS-CoV-2 Genome Tree
How SARS-CoV-2 trees are constructed in Pathogenwatch.
Pathogenwatch automatically generates a tree of SARS-CoV-2 genomes when a collection is created from the Genome Browser. When a genome is uploaded, Pathogenwatch aligns it against the Wuhan-Hu-1 reference genome and stores the alignment. When a collection is created, the stored alignments of the selected genomes are combined into a multiple sequence alignment, and a dendrogram is produced from it using FastTree. This tree is then displayed in the interactive collection viewer.
SARS-CoV-2 tree built using the Pathogenwatch pipeline.
- Each genome is mapped against the Wuhan-Hu-1 reference genome (NCBI Reference Sequence: NC_045512.2) using minimap2.
- The aligned FASTA output is stored in Pathogenwatch.
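- The FASTA files of the selected genomes are concatenated into a multiple sequence alignment along with the Wuhan-Hu-1 reference.
- A tree is built from the multiple sequence alignment using FastTree.
- The resulting tree is rooted to the reference.
- The reference is removed from the tree.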
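The concatenation step can be illustrated with a minimal Python sketch. It is not the Pathogenwatch implementation: the function names `read_single_fasta` and `build_msa` are hypothetical, and the sketch assumes that each stored alignment is a single FASTA record already in reference coordinates, so that every sequence has the same length as the reference.

```python
from typing import Iterable, Tuple


def read_single_fasta(text: str) -> Tuple[str, str]:
    """Parse a FASTA string holding exactly one record; return (name, sequence)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("expected exactly one record")
    header_fields = lines[0][1:].split()
    if not header_fields:
        raise ValueError("FASTA header has no sequence name")
    # Keep only the first whitespace-delimited token of the header as the name.
    return header_fields[0], "".join(lines[1:])


def build_msa(reference_fasta: str, aligned_fastas: Iterable[str]) -> str:
    """Concatenate reference-aligned records into one multi-FASTA alignment.

    Assumption (not confirmed by the source): every stored alignment is in
    reference coordinates, so all sequences must match the reference length.
    """
    ref_name, ref_seq = read_single_fasta(reference_fasta)
    records = [(ref_name, ref_seq)]  # the reference is included so the tree can be rooted on it
    for text in aligned_fastas:
        name, seq = read_single_fasta(text)
        if len(seq) != len(ref_seq):
            raise ValueError(
                f"{name}: length {len(seq)} differs from reference length {len(ref_seq)}"
            )
        records.append((name, seq))
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

The length check matters because a plain concatenation only yields a valid multiple sequence alignment when every record shares the same columns. The returned text could then be written to a file and passed to a tree builder such as FastTree.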
- FastTree: Price, M.N., Dehal, P.S., and Arkin, A.P. (2010). FastTree 2 -- Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE, 5(3):e9490. doi:10.1371/journal.pone.0009490.
- minimap2: Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34(18):3094-3100. doi:10.1093/bioinformatics/bty191.