Glossary

Consistent characters

A collection of characters is consistent if a binary tree $T$ can be constructed whose edge splits are compatible with the splits given by the split notations of the characters.

The split notation encodes a character as a split $S_1 \mid S_1^c$, where $S_1$ and $S_1^c$ are the two disjoint sets of taxa that the character separates. On the other hand, removing an edge $e$ from a binary tree creates two subtrees, each containing a subset of the taxa, which also induces a split $S_2 \mid S_2^c$. Thus, we can label $e$ with the split $S_2 \mid S_2^c$, and so we have two ways of generating a split: from a character and from an edge of a binary tree.
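As a minimal sketch of the first route (from a character to a split), the Python snippet below assumes a two-state character stored as a mapping from taxon name to 0 or 1. The function name `character_to_split` and this encoding are illustrative assumptions, not part of the glossary.

```python
def character_to_split(states):
    """Return the split (S, S^c) induced by a two-state character.

    `states` maps each taxon name to 0 or 1 (an assumed encoding).
    S holds the taxa in state 1, S^c the taxa in state 0.
    """
    on = frozenset(taxon for taxon, state in states.items() if state == 1)
    off = frozenset(states) - on
    return on, off


# Hypothetical character over four taxa: a and b share one state, c and d the other.
split = character_to_split({"a": 1, "b": 1, "c": 0, "d": 0})
```

Here `split` is the pair of taxon sets {a, b} and {c, d}, which together cover every taxon exactly once.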

A collection of characters is called consistent if a binary tree $T$ can be found whose edge splits do not conflict with the splits inferred from characters. Such a conflict occurs precisely when we have one split $S_1 \mid S_1^c$ inferred from a character, another split $S_2 \mid S_2^c$ inferred from an edge of $T$, and all four intersections $S_1 \cap S_2$, $S_1 \cap S_2^c$, $S_1^c \cap S_2$, $S_1^c \cap S_2^c$ are nonempty.
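The four-intersection test can be sketched in a few lines of Python. The function name `splits_conflict` and the representation of a split as a pair of taxon sets are assumptions made for illustration only.

```python
def splits_conflict(split1, split2):
    """Return True when two splits conflict.

    Each split is a pair (S, S^c) of disjoint taxon sets (assumed representation).
    The splits conflict exactly when all four pairwise intersections are nonempty.
    """
    s1, s1c = map(frozenset, split1)
    s2, s2c = map(frozenset, split2)
    # An empty intersection is falsy, so all() fails as soon as one is empty.
    return all(x & y for x in (s1, s1c) for y in (s2, s2c))


# Hypothetical splits on five taxa a..e.
character_split = ({"a", "b"}, {"c", "d", "e"})
nested_edge = ({"a", "b", "c"}, {"d", "e"})      # {a,b} and {d,e} do not meet
crossing_edge = ({"a", "c"}, {"b", "d", "e"})    # every intersection is nonempty

compatible = not splits_conflict(character_split, nested_edge)
conflicting = splits_conflict(character_split, crossing_edge)
```

In this sketch the first pair is compatible because one intersection is empty, while the second pair conflicts because all four intersections contain at least one taxon.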

For a simple example, consider the conflicting quartets $\{a,b\} \mid \{c,d\}$ and $\{a,c\} \mid \{b,d\}$, which must correspond to two distinct trees on the four taxa $a$, $b$, $c$, and $d$.
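To see why these two quartets conflict, the four intersections from the definition can be written out explicitly; each one contains exactly one taxon, so none is empty:

```latex
\begin{aligned}
\{a,b\} \cap \{a,c\} &= \{a\}, & \{a,b\} \cap \{b,d\} &= \{b\}, \\
\{c,d\} \cap \{a,c\} &= \{c\}, & \{c,d\} \cap \{b,d\} &= \{d\}.
\end{aligned}
```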