SARS-CoV-2 Genome Tree
How SARS-CoV-2 trees are constructed in Pathogenwatch.
Pathogenwatch automatically generates a tree of SARS-CoV-2 genomes when a collection is created. When each genome is uploaded to Pathogenwatch, an alignment against the Wuhan-Hu-1 reference genome is stored. When a collection is created, the stored alignments of the selected genomes are combined into a multiple sequence alignment and a dendrogram is produced using FastTree. This tree is then displayed in the interactive collection viewer.
When a genome is uploaded:
Each genome is mapped against the Wuhan-Hu-1 reference genome using minimap2.
The resulting SAM file from each genome is converted into FASTA format using goFASTA.
The aligned FASTA output is stored in Pathogenwatch.
When a collection tree is built:
The stored FASTA files are concatenated into a multiple sequence alignment along with the Wuhan-Hu-1 reference.
FastTree is run on the alignment with the options -gtr -nosupport -nt.
The resulting tree is rooted to the reference.
The reference is removed from the tree.
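The concatenation and tree-building steps can be illustrated with a minimal Python sketch. It is not Pathogenwatch's actual code: the function names, the executable name FastTree, and the assumption that every stored per-genome FASTA already has the same length as the reference are all illustrative. Only the options -gtr -nosupport -nt come from the description above.

```python
import subprocess


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def build_msa(reference_fasta, genome_fastas):
    """Concatenate the reference and the per-genome alignments into one MSA.

    Assumption: each per-genome FASTA is already in reference coordinates,
    so plain concatenation yields a valid multiple sequence alignment.
    """
    records = read_fasta(reference_fasta)
    for text in genome_fastas:
        records.extend(read_fasta(text))
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


def fasttree_command(alignment_path):
    """Build the FastTree call with the options quoted in the text.

    The executable name "FastTree" is an assumption about the local install.
    """
    return ["FastTree", "-gtr", "-nosupport", "-nt", str(alignment_path)]


def run_fasttree(alignment_path):
    """Run FastTree and return the Newick tree it writes to standard output."""
    result = subprocess.run(
        fasttree_command(alignment_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Rooting the tree to the reference and removing the reference afterwards are left out of the sketch, since the text does not say which tool performs them.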
FastTree: Price, M.N., Dehal, P.S., and Arkin, A.P. (2010) FastTree 2 -- Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE, 5(3):e9490.
goFASTA:
minimap2: Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34:3094-3100.