Organizing Strings of Different Lengths
In “Enumerating k-mers Lexicographically”, we introduced the lexicographic order for strings of the same length constructed from some ordered underlying alphabet. However, our experience with dictionaries suggests that we should be able to order strings of different lengths just as easily. That is, we already have an intuitive sense that "APPLE" comes before "APPLET", which comes before "ARTS," and so we should be able to apply this intuition toward cataloguing genetic strings of varying lengths.
Say that we have strings
Given: A permutation of at most 12 symbols defining an ordered alphabet
Return: All strings of length at most
D N A 3
D DD DDD DDN DDA DN DND DNN DNA DA DAD DAN DAA N ND NDD NDN NDA NN NND NNN NNA NA NAD NAN NAA A AD ADD ADN ADA AN AND ANN ANA AA AAD AAN AAA
We can combine conditions (1) and (2) above into a single condition by adding a blank character
$\emptyset$to the beginning of our ordered alphabet. Then, if $s$is shorter than $t$, we simply add as many instances of $\emptyset$as necessary to make $s$and $t$the same length.