Quartet Distance solved by 106

Dec. 21, 2012, 5:35 p.m. by Rosalind Team

Topics: Phylogeny

Another Tree Distance

In “Phylogeny Comparison with Split Distance”, we examined the split distance for comparison of different phylogenies on the same collection of taxa.

Yet quartet-based phylogeny offers another way in which two phylogenies can be compared (see “Quartets” and “Counting Quartets”). Specifically, we wonder how many quartets can be inferred from one tree but not inferred from the other.

Problem

In “Counting Quartets”, we found an expression for $q(T)$, the number of quartets that can be inferred from an unrooted binary tree containing $n$ taxa.

If $T_1$ and $T_2$ are both unrooted binary trees on the same $n$ taxa, then we now let $q(T_1, T_2)$ denote the number of inferred quartets that are common to both trees. The quartet distance between $T_1$ and $T_2$, $d_{\textrm{q}}(T_1, T_2)$ is the number of quartets that are only inferred from one of the trees. More precisely, $d_\textrm{q}(T_1, T_2) = q(T_1) + q(T_2) - 2q(T_1, T_2)$.

Given: A list containing $n$ taxa ($n \leq 2000$) and two unrooted binary trees $T_1$ and $T_2$ on the given taxa. Both $T_1$ and $T_2$ are given in Newick format.

Return: The quartet distance $d_\textrm{q}(T_1, T_2)$.

Sample Dataset

A B C D E
(A,C,((B,D),E));
(C,(B,D),(A,E));

Sample Output

4

Please login to solve this problem.