Nov. 12, 2012, 9:40 p.m. by Rosalind Team

## Introduction to Distance-Based Phylogeny

A number of different approaches are used to build phylogenies, each with its own computational strengths and weaknesses. One of these approaches is distance-based phylogeny, which constructs a tree from evolutionary distances calculated between pairs of taxa.

Many different measures exist for quantifying this evolutionary distance. Once we have selected a distance function and used it to calculate the distance between every pair of taxa, we place these pairwise distances into a table.

In this problem, we will consider an evolutionary distance function based on Hamming distance. Recall from “Counting Point Mutations” that this function compares two homologous strands of DNA by counting the minimum possible number of point mutations that could have occurred on the evolutionary path between the two strands.
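As a minimal sketch of the Hamming distance recalled above, the Python function below counts the positions at which two equal-length strings differ. The function name `hamming_distance` is an illustrative choice, not something defined by the problem.

```python
def hamming_distance(s: str, t: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance requires strings of equal length")
    # zip pairs up corresponding symbols; each mismatch contributes 1 to the sum.
    return sum(a != b for a, b in zip(s, t))


# The first two strings of the sample dataset differ at 4 of their 10 positions.
print(hamming_distance("TTTCCATTTA", "GATTCATTTC"))  # prints 4
```

Dividing this count by the string length gives the proportion of differing positions, which is the scale on which the sample output is reported.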

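For two strings $s_1$ and $s_2$ of equal length, the p-distance between them, denoted $d_p(s_1, s_2)$, is the proportion of corresponding symbols that differ between $s_1$ and $s_2$.

For a general distance function $d$ on $n$ taxa $s_1, s_2, \ldots, s_n$, we may encode the distances between pairs of taxa via a distance matrix $D$ in which $D_{i,j} = d(s_i, s_j)$.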

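Given: A collection of $n$ DNA strings $s_1, \ldots, s_n$ of equal length, in FASTA format.

Return: The matrix $D$ in which $D_{i,j}$ is the p-distance between $s_i$ and $s_j$, that is, the proportion of positions at which the two strings differ.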

Sample Dataset:

    >Rosalind_9499
    TTTCCATTTA
    >Rosalind_0942
    GATTCATTTC
    >Rosalind_6568
    TTTCCATTTT
    >Rosalind_1833
    GTTCCATTTA

Sample Output:

    0.00000 0.40000 0.10000 0.10000
    0.40000 0.00000 0.40000 0.30000
    0.10000 0.40000 0.00000 0.20000
    0.10000 0.30000 0.20000 0.00000
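One possible way to reproduce the sample output is sketched below in Python. It assumes that each distance is the proportion of positions at which two strings differ (often called the p-distance), which is consistent with the sample values. The helper names `parse_fasta`, `p_distance`, `distance_matrix` and `format_matrix` are hypothetical choices made for this illustration.

```python
def parse_fasta(text: str) -> list[str]:
    """Return the sequences of a FASTA-formatted string, in input order."""
    sequences: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            sequences.append("")   # new record; the label itself is not needed here
        elif sequences:
            sequences[-1] += line  # a sequence may span several lines
    return sequences


def p_distance(s: str, t: str) -> float:
    """Proportion of positions at which two equal-length strings differ."""
    if len(s) != len(t) or not s:
        raise ValueError("strings must be non-empty and of equal length")
    return sum(a != b for a, b in zip(s, t)) / len(s)


def distance_matrix(seqs: list[str]) -> list[list[float]]:
    """Entry [i][j] holds the p-distance between seqs[i] and seqs[j]."""
    return [[p_distance(s, t) for t in seqs] for s in seqs]


def format_matrix(matrix: list[list[float]]) -> str:
    """One row per line, entries separated by spaces, five decimal places."""
    return "\n".join(" ".join(f"{x:.5f}" for x in row) for row in matrix)


SAMPLE = """>Rosalind_9499
TTTCCATTTA
>Rosalind_0942
GATTCATTTC
>Rosalind_6568
TTTCCATTTT
>Rosalind_1833
GTTCCATTTA
"""

print(format_matrix(distance_matrix(parse_fasta(SAMPLE))))
```

The sketch computes every entry directly for clarity. Because the matrix is symmetric with a zero diagonal, only the entries above the diagonal strictly need to be computed.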