Construct the De Bruijn Graph of a Collection of k-mers solved by 415

July 29, 2015, 1:11 a.m. by Rosalind Team

Given an arbitrary collection of k-mers Patterns (where some k-mers may appear multiple times), we define CompositionGraph(Patterns) as a graph with |Patterns| isolated edges. Every edge is labeled by a k-mer from Patterns, and the starting and ending nodes of an edge are labeled by the prefix and suffix of the k-mer labeling that edge. We then define the de Bruijn graph of Patterns, denoted DeBruijn(Patterns), by gluing identically labeled nodes in CompositionGraph(Patterns), which yields the following algorithm.

DEBRUIJN(Patterns)
represent every k-mer in Patterns as an isolated edge between its prefix and suffix
glue all nodes with identical labels, yielding the graph DeBruijn(Patterns)
return DeBruijn(Patterns)

De Bruijn Graph from k-mers Problem

Construct the de Bruijn graph from a collection of k-mers.

Given: A collection of k-mers Patterns.

Return: The de Bruijn graph DeBruijn(Patterns), in the form of an adjacency list.

Sample Dataset

GAGG
CAGG
GGGG
GGGA
CAGG
AGGG
GGAG


Sample Output

AGG -> GGG
CAG -> AGG,AGG
GAG -> AGG
GGA -> GAG
GGG -> GGA,GGG