Feb. 19, 2014, 6:27 p.m. by Rosalind Team
Given a string Text and Trie(Patterns), we can quickly check whether any string from Patterns matches a prefix of Text. To do so, we start reading symbols from the beginning of Text and see what string these symbols “spell” as we proceed along the path downward from the root of the trie, as illustrated in the figure below. For each new symbol in Text, if we encounter this symbol along an edge leading down from the present node, then we continue along this edge; otherwise, we stop and conclude that no string in Patterns matches a prefix of Text. If we make it all the way to a leaf, then the pattern spelled out by this path matches a prefix of Text.
This algorithm is called PREFIXTRIEMATCHING.
PREFIXTRIEMATCHING finds whether any strings in Patterns match a prefix of Text. To find whether any strings in Patterns match a substring of Text starting at position k, we chop off the first k − 1 symbols from Text and run PREFIXTRIEMATCHING on the shortened string. As a result, to solve the Multiple Pattern Matching Problem (introduced in “Construct a Trie from a Collection of Patterns”, we simply iterate PREFIXTRIEMATCHING |Text| times, chopping the first symbol off of Text before each new iteration.
Given: A string Text and a collection of strings Patterns.
Return: All starting positions in Text where a string from Patterns appears as a substring.
AATCGGGTTCAATCGGGGT ATCG GGGT
1 4 11 15