July 29, 2015, 1:03 a.m. by Rosalind Team
In this chapter, we use the terms prefix and suffix to refer to the first k − 1 nucleotides and last k − 1 nucleotides of a k-mer, respectively.
Given an arbitrary collection of k-mers Patterns, we form a graph having a node for each k-mer in Patterns and connect k-mers Pattern and Pattern' by a directed edge if Suffix(Pattern) is equal to Prefix(Pattern'). The resulting graph is called the overlap graph on these k-mers, denoted Overlap(Patterns).
Construct the overlap graph of a collection of k-mers.
Given: A collection Patterns of k-mers.
Return: The overlap graph Overlap(Patterns), in the form of an adjacency list.
ATGCG GCATG CATGC AGGCA GGCAT
AGGCA -> GGCAT CATGC -> ATGCG GCATG -> CATGC GGCAT -> GCATG