How SARS-CoV-2 trees are constructed in Pathogenwatch.
About
Pathogenwatch automatically generates a tree of SARS-CoV-2 genomes when a collection is created from the Genome Browser. When a genome is uploaded to Pathogenwatch, it is aligned against the Wuhan-Hu-1 reference genome and the alignment is stored. When a collection is created, the stored alignments of the selected genomes are combined into a multiple sequence alignment, and a dendrogram is produced from it using FastTree. This tree is then displayed in the interactive collection viewer.
SARS-CoV-2 tree built using the Pathogenwatch pipeline.
The SAM file produced by aligning each genome against the reference is converted into FASTA format using goFASTA.
The aligned FASTA output is stored in Pathogenwatch.
Tree Building
The aligned FASTA files of the selected genomes are concatenated into a multiple sequence alignment, together with the Wuhan-Hu-1 reference sequence.
Run FastTree with the options -gtr -nosupport -nt.
Root the resulting tree on the reference.
Remove the reference from the tree.
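The first two tree-building steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Pathogenwatch implementation: the helper names (`parse_fasta`, `build_alignment`, `fasttree_command`, `run_fasttree`) are hypothetical, the executable is assumed to be available on the PATH as `FastTree`, and each stored FASTA is assumed to already be aligned to the reference so that all sequences have the same length. Only the FastTree options `-gtr -nosupport -nt` come from the steps above.

```python
import subprocess


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def build_alignment(reference_fasta, genome_fastas):
    """Concatenate the reference and the per-genome aligned FASTA texts.

    Assumption: every sequence was aligned to the same reference, so all
    sequences must share one length; otherwise the input is rejected.
    """
    records = parse_fasta(reference_fasta)
    for text in genome_fastas:
        records.extend(parse_fasta(text))
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


def fasttree_command(alignment_path, executable="FastTree"):
    """Build the FastTree command line with the options listed above."""
    return [executable, "-gtr", "-nosupport", "-nt", str(alignment_path)]


def run_fasttree(alignment_path, tree_path):
    """Run FastTree; it writes the Newick tree to standard output."""
    with open(tree_path, "w") as out:
        subprocess.run(fasttree_command(alignment_path), stdout=out, check=True)
```

Rooting the tree on the reference and then pruning the reference tip are not shown here; they would typically be done with a tree-manipulation library on the Newick file written by `run_fasttree`.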
References
FastTree: Price, M.N., Dehal, P.S., and Arkin, A.P. (2010) FastTree 2 -- Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE, 5(3):e9490. doi:10.1371/journal.pone.0009490.