July 29, 2015, 1:03 a.m. by Rosalind Team

We defined a mismatch in “Compute the Hamming Distance Between Two Strings”. We now generalize “Find the Most Frequent Words in a String” to incorporate mismatches as well.

Given strings *Text* and *Pattern* as well as an integer *d*, we define *Count*_{d}(*Text*, *Pattern*) as the total number of occurrences of *Pattern* in *Text* with at most *d* mismatches. For example, *Count*_{1}(**AACAA**GCTG**ATAAACA**TTT**AAAGA**G, **AAAAA**) = 4 because **AAAAA** appears four times in this string with at most one mismatch: **AACAA**, **ATAAA**, **AAACA**, and **AAAGA**. Note that two of these occurrences overlap.
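The definition above can be checked directly with a sliding window. The following is a minimal sketch, not a reference implementation; the function names `hamming_distance` and `approximate_count` are illustrative choices and do not come from the problem statement.

```python
def hamming_distance(p, q):
    """Number of positions at which two equal-length strings differ."""
    if len(p) != len(q):
        raise ValueError("strings must have equal length")
    return sum(a != b for a, b in zip(p, q))


def approximate_count(text, pattern, d):
    """Count the windows of text within Hamming distance d of pattern."""
    k = len(pattern)
    # Every start position is checked, so overlapping occurrences are counted.
    return sum(
        hamming_distance(text[i:i + k], pattern) <= d
        for i in range(len(text) - k + 1)
    )


print(approximate_count("AACAAGCTGATAAACATTTAAAGAG", "AAAAA", 1))  # 4
```

The four windows counted here are the ones listed in the example: AACAA, ATAAA, AAACA, and AAAGA.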

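A **most frequent k-mer with up to d mismatches** in *Text* is a string *Pattern* that maximizes *Count*_{d}(*Text*, *Pattern*) among all *k*-mers. Note that *Pattern* does not need to appear as an exact substring of *Text*; in the example above, **AAAAA** never occurs exactly in the string.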

*Find the most frequent k-mers with mismatches in a string.*

Given: A string *Text* as well as integers *k* and *d*.

Return: All most frequent *k*-mers with up to *d* mismatches in *Text*.

## Sample Dataset

ACGTTGCATGTCGCATGATGCATGAGAGCT 4 1

## Sample Output

GATG ATGC ATGT
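One possible way to solve the problem is to expand every *k*-mer window of *Text* into its set of neighbors (all *k*-mers within Hamming distance *d*) and tally how often each neighbor is produced. The sketch below assumes the DNA alphabet ACGT; the names `neighbors` and `frequent_words_with_mismatches` are illustrative, and this is one approach among several rather than the prescribed solution.

```python
from collections import Counter

ALPHABET = "ACGT"  # assumption: input strings are DNA


def hamming_distance(p, q):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(p, q))


def neighbors(pattern, d):
    """Set of all strings within Hamming distance d of pattern."""
    if d == 0 or not pattern:
        return {pattern}
    result = set()
    for suffix in neighbors(pattern[1:], d):
        if hamming_distance(pattern[1:], suffix) < d:
            # One mismatch is still available, so any first symbol is allowed.
            result.update(base + suffix for base in ALPHABET)
        else:
            # The mismatch budget is spent; the first symbol must match.
            result.add(pattern[0] + suffix)
    return result


def frequent_words_with_mismatches(text, k, d):
    """All k-mers that maximize the approximate count in text."""
    counts = Counter()
    for i in range(len(text) - k + 1):
        # Each window adds one occurrence to every k-mer in its neighborhood.
        for candidate in neighbors(text[i:i + k], d):
            counts[candidate] += 1
    if not counts:
        return []
    best = max(counts.values())
    return [kmer for kmer, count in counts.items() if count == best]


result = frequent_words_with_mismatches("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4, 1)
print(" ".join(sorted(result)))  # ATGC ATGT GATG
```

The output is sorted here only to make it deterministic; it contains the same three *k*-mers as the sample output, in a different order.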

## Note

The algorithm for solving the Frequent Words with Mismatches Problem becomes rather slow as *k* and *d* increase. In practice, your solution should work for *k* ≤ 12 and *d* ≤ 3.
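To get a rough feel for this growth, assume a solution that expands each *k*-mer into every string within *d* mismatches of it (an assumption; the problem does not prescribe an algorithm). Over a four-letter alphabet, the number of such strings is the sum of C(*k*, *i*) · 3^*i* for *i* from 0 to *d*. The helper name `neighborhood_size` below is hypothetical.

```python
from math import comb


def neighborhood_size(k, d, alphabet_size=4):
    """Number of k-mers within Hamming distance d of a fixed k-mer."""
    # Choose i mismatch positions, then one of the other symbols at each.
    return sum(comb(k, i) * (alphabet_size - 1) ** i for i in range(d + 1))


print(neighborhood_size(4, 1))   # 13
print(neighborhood_size(12, 3))  # 6571
```

At the stated limits, each window of *Text* therefore contributes thousands of candidate *k*-mers, which is consistent with the slowdown described above.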