Character-Based Phylogeny solved by 231

July 2, 2012, midnight by Rosalind Team

Topics: Phylogeny

Introduction to Character-Based Phylogenyclick to expand

In “Creating a Character Table”, we discussed the construction of a character table from a collection of characters represented by subsets of our taxa. However, the ultimate goal is to be able to construct a phylogeny from this character table.

The issues at hand are that we want to ensure that we have enough characters to actually construct a phylogeny, and that our characters do not conflict with each other.

Problem

Because a tree having n nodes has n1 edges (see “Completing a Tree”), removing a single edge from a tree will produce two smaller, disjoint trees. Recall from “Creating a Character Table” that for this reason, each edge of an unrooted binary tree corresponds to a split SSc, where S is a subset of the taxa.

A consistent character table is one whose characters' splits do not conflict with the edge splits of some unrooted binary tree T on the n taxa. More precisely, S1Sc1 conflicts with S2Sc2 if all four intersections S1S2, S1Sc2, Sc1S2, and Sc1Sc2 are nonempty. As a simple example, consider the conflicting splits {a,b}{c,d} and {a,c}{b,d}.

More generally, given a consistent character table C, an unrooted binary tree T "models" C if the edge splits of T agree with the splits induced from the characters of C.

Given: A list of n species (n80) and an n-column character table C in which the jth column denotes the jth species.

Return: An unrooted binary tree in Newick format that models C.

Sample Dataset

cat dog elephant mouse rabbit rat
011101
001101
001100

Sample Output

(dog,(cat,rabbit),(rat,(elephant,mouse)));

Please login to solve this problem.