The k-mer composition of a string s encodes the number of times that each possible k-mer occurs in s.
To represent the k-mer composition of a string concisely, all possible k-mers (for DNA
strings, there are 4^k of them) are ordered lexicographically,
and then an array A is created in which A[i] is the number of
times that the ith of these ordered k-mers appears in s.
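
The construction above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `kmer_composition` and its `alphabet` parameter are hypothetical, the DNA alphabet "ACGT" is assumed by default, overlapping occurrences are counted, and the returned list is 0-indexed (so the first k-mer in lexicographic order sits at position 0).

```python
from itertools import product


def kmer_composition(s, k, alphabet="ACGT"):
    """Return the k-mer composition of s as a list of counts.

    Hypothetical helper for illustration. Entry i of the result is the
    number of (possibly overlapping) occurrences in s of the i-th k-mer
    in lexicographic order, counting from 0.
    """
    # Enumerate all len(alphabet)**k k-mers; product over a sorted
    # alphabet yields them in lexicographic order.
    kmers = ["".join(p) for p in product(sorted(alphabet), repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    counts = [0] * len(kmers)
    # Slide a window of width k across s and tally each k-mer seen.
    for start in range(len(s) - k + 1):
        window = s[start:start + k]
        # Assumption: windows containing symbols outside the alphabet
        # (for example "N") are skipped rather than raising an error.
        if window in index:
            counts[index[window]] += 1
    return counts
```

For a DNA string and k = 2 the result has 4^2 = 16 entries, ordered AA, AC, AG, AT, CA, and so on through TT, and the entries sum to len(s) - 1 when s contains only A, C, G, and T.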
The k-mer composition is a generalization of GC-content to the case of substrings.
In the figure below, we show the array giving the 2-mer composition of "TTGATTACCTTATTTGATCATTACACATTGTACGCTTGTGTCAAAATATCACATGTGCCT".