March 18, 2014, 8:26 p.m. by Rosalind Team
In “Find the Longest Repeat in a String”, we saw that suffix trees can be too memory intensive to apply in practice.
In 1993, Udi Manber and Gene Myers introduced suffix arrays as a memory-efficient alternative to suffix trees. To construct SuffixArray(Text), we first sort all suffixes of Text lexicographically, assuming that "$" comes first in the alphabet. The suffix array is the list of starting positions of these sorted suffixes. For example,
Construct the suffix array of a string.
Given: A string Text.
Return: SuffixArray(Text).
AACGATAGCGGTAGA$
15, 14, 0, 1, 12, 6, 4, 2, 8, 13, 3, 7, 9, 10, 11, 5