Many proteins have already been identified for a wide variety of organisms. Accordingly,
there are a large number of protein databases available, and so the first step after
creating a mass spectrum for an unidentified protein is to search through these
databases for a known protein with a highly similar spectrum. In this manner,
many similar proteins found in different species have been identified, which aids
researchers in determining protein function.
In “Comparing Spectra with the Spectral Convolution”, we introduced the spectral convolution and used it to measure the
similarity of simplified spectra. In this problem, we would like to extend this
idea to find the most similar protein in a database to a spectrum taken from an unknown protein. Our plan
is to use the spectral convolution to find the largest possible number of masses that each database protein
shares with our candidate protein after shifting, and then select the database protein having the largest
such number of shared masses.
Given: A positive integer n followed by a collection of nprotein stringss1, s2, ..., sn
and a multiset R of positive numbers (corresponding to the complete spectrum of some unknown protein string).
Return: The maximum multiplicity of R⊖S[sk] taken over all strings sk,
followed by the string sk for which this maximum multiplicity occurs
(you may output any such value if multiple solutions exist).