Glossary

PHYLIP format

PHYLIP format derives from Joe Felsenstein's PHYLogeny Inference Package and is now used by several phylogenetics programs. PHYLIP file names typically have a .phy or .ph extension.

There are two different ways of representing a collection of genetic strings in PHYLIP format:

The "sequential" type, which is perhaps easier to work with, alternates between string labels and the strings themselves. Here's an example of a small "sequential" file.

3 30
Taxon1     ACCGTTTCCACAGCATTATGG
GCTCGATGA
Taxon2     CACTTCACAAATCAATATTGA
GCTAGTGCA
Taxon3     TAAGGTATTGGGCTTGGTTCG
CAGGGGACT
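
As a rough illustration of how the sequential layout can be read, here is a minimal Python sketch. The function name parse_sequential_phylip is hypothetical, and the sketch assumes that each label is a single whitespace-delimited token (strict PHYLIP instead reserves a fixed 10-character name field) and that a string continues on the following lines until the length declared in the header is reached.

```python
def parse_sequential_phylip(text):
    """Parse sequential PHYLIP text into a {label: string} dict (hypothetical helper)."""
    # Drop blank lines so spacing between records does not matter.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_seqs, seq_len = (int(tok) for tok in lines[0].split())

    records = {}
    i = 1
    for _ in range(n_seqs):
        # Assumption: the label is the first whitespace-delimited token.
        parts = lines[i].split(None, 1)
        label = parts[0]
        seq = "".join(parts[1].split()) if len(parts) > 1 else ""
        i += 1
        # Keep reading continuation lines until the declared length is reached.
        while len(seq) < seq_len:
            if i >= len(lines):
                raise ValueError(f"{label}: file ended before {seq_len} characters")
            seq += "".join(lines[i].split())
            i += 1
        if len(seq) != seq_len:
            raise ValueError(f"{label}: expected {seq_len} characters, got {len(seq)}")
        records[label] = seq
    return records


sample = """3 30
Taxon1     ACCGTTTCCACAGCATTATGG
GCTCGATGA
Taxon2     CACTTCACAAATCAATATTGA
GCTAGTGCA
Taxon3     TAAGGTATTGGGCTTGGTTCG
CAGGGGACT
"""
records = parse_sequential_phylip(sample)
print(records["Taxon1"])
```

Reading until the declared length is reached, rather than assuming a fixed number of lines per record, lets the same loop handle strings that are wrapped at different widths.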

The "interleaved" type shows all the first lines of the strings, then all the second lines, and so on. Here's an example of a small "interleaved" file.

3 30
Taxon1     ACCGTTTCCACAGCATTATGG
Taxon2     CACTTCACAAATCAATATTGA
Taxon3     TAAGGTATTGGGCTTGGTTCG
GCTCGATGA
GCTAGTGCA
CAGGGGACT
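
The interleaved layout can be read in a similar way. The sketch below is only one possible approach: parse_interleaved_phylip is a hypothetical name, labels are again assumed to be single whitespace-delimited tokens, and every line after the first block is assumed to be an unlabeled continuation that cycles through the strings in their original order.

```python
def parse_interleaved_phylip(text):
    """Parse interleaved PHYLIP text into a {label: string} dict (hypothetical helper)."""
    # Blank lines between blocks are optional, so ignore them.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_seqs, seq_len = (int(tok) for tok in lines[0].split())
    body = lines[1:]

    labels, chunks = [], []
    # The first block carries the labels plus the first piece of each string.
    for ln in body[:n_seqs]:
        parts = ln.split(None, 1)
        labels.append(parts[0])
        chunks.append(["".join(parts[1].split())] if len(parts) > 1 else [])

    # Later lines are unlabeled; line k belongs to string number k mod n_seqs.
    for k, ln in enumerate(body[n_seqs:]):
        chunks[k % n_seqs].append("".join(ln.split()))

    records = {label: "".join(pieces) for label, pieces in zip(labels, chunks)}
    for label, seq in records.items():
        if len(seq) != seq_len:
            raise ValueError(f"{label}: expected {seq_len} characters, got {len(seq)}")
    return records


sample = """3 30
Taxon1     ACCGTTTCCACAGCATTATGG
Taxon2     CACTTCACAAATCAATATTGA
Taxon3     TAAGGTATTGGGCTTGGTTCG
GCTCGATGA
GCTAGTGCA
CAGGGGACT
"""
records = parse_interleaved_phylip(sample)
print(records["Taxon2"])
```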

Note that PHYLIP is an alignment format, meaning that all sequences must have the same length; shorter sequences can be brought to that length by using 'N' as an unknown character or "-" as a gap symbol. This explains the two integers that begin a PHYLIP file: the first gives the number of strings, and the second gives the length of each string.
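
To make the role of the two header integers concrete, the following Python sketch builds a sequential PHYLIP string from a set of labeled strings and refuses input whose lengths differ. The function name format_sequential_phylip and the name_width parameter are hypothetical; the default width of 10 mirrors the strict PHYLIP name field, and labels longer than that are not truncated here.

```python
def format_sequential_phylip(records, name_width=10):
    """Return sequential PHYLIP text for a {label: string} dict (hypothetical helper)."""
    if not records:
        raise ValueError("no strings to write")
    # An alignment requires a single shared length across all strings.
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("strings differ in length; pad with '-' or 'N' first")
    (seq_len,) = lengths

    # Header: number of strings, then the length of each string.
    lines = [f"{len(records)} {seq_len}"]
    for label, seq in records.items():
        # Left-justify the label in the name field, then one separating space.
        lines.append(f"{label:<{name_width}} {seq}")
    return "\n".join(lines) + "\n"


aligned = {"Taxon1": "ACGT", "Taxon2": "A-GN"}
phylip_text = format_sequential_phylip(aligned)
print(phylip_text)
```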
