Enumerating k-mers Lexicographically solved by 3427

July 2, 2012, midnight by Rosalind Team

Topics: String Algorithms

Organizing Strings

When cataloguing a collection of genetic strings, we should have an established system by which to organize them. The standard method is to organize strings as they would appear in a dictionary, so that "APPLE" precedes "APRON", which in turn comes before "ARMOR".

Problem

Assume that an alphabet $\mathscr{A}$ has a predetermined order; that is, we write the alphabet as a permutation $\mathscr{A} = (a_1, a_2, \ldots, a_k)$, where $a_1 < a_2 < \cdots < a_k$. For instance, the English alphabet is organized as $(\textrm{A}, \textrm{B}, \ldots, \textrm{Z})$.

Given two strings $s$ and $t$ having the same length $n$, we say that $s$ precedes $t$ in the lexicographic order (and write $s <_{\textrm{Lex}} t$) if the first symbol $s[j]$ that doesn't match $t[j]$ satisfies $s_j < t_j$ in $\mathscr{A}$.

Given: A collection of at most 10 symbols defining an ordered alphabet, and a positive integer $n$ ($n \leq 10$).

Return: All strings of length $n$ that can be formed from the alphabet, ordered lexicographically.

Sample Dataset

T A G C
2

Sample Output

AA
AC
AG
AT
CA
CC
CG
CT
GA
GC
GG
GT
TA
TC
TG
TT

Note

As illustrated in the sample, the alphabet order in this problem is defined by the order in which symbols are provided in the dataset, which is not necessarily the traditional order of the English alphabet.

Please login to solve this problem.