Matching a Spectrum to a Protein solved by 324

July 31, 2012, midnight by Sonya Alexandrova

Topics: Computational Mass Spectrometry

Searching the Protein Database

Many proteins have already been identified for a wide variety of organisms. Accordingly, there are a large number of protein databases available, and so the first step after creating a mass spectrum for an unidentified protein is to search through these databases for a known protein with a highly similar spectrum. In this manner, many similar proteins found in different species have been identified, which aids researchers in determining protein function.

In “Comparing Spectra with the Spectral Convolution”, we introduced the spectral convolution and used it to measure the similarity of simplified spectra. In this problem, we would like to extend this idea to find the most similar protein in a database to a spectrum taken from an unknown protein. Our plan is to use the spectral convolution to find the largest possible number of masses that each database protein shares with our candidate protein after shifting, and then select the database protein having the largest such number of shared masses.


The complete spectrum of a weighted string $s$ is the multiset $S[s]$ containing the weights of every prefix and suffix of $s$.

Given: A positive integer $n$ followed by a collection of $n$ protein strings $s_1$, $s_2$, ..., $s_n$ and a multiset $R$ of positive numbers (corresponding to the complete spectrum of some unknown protein string).

Return: The maximum multiplicity of $R \ominus S[s_k]$ taken over all strings $s_k$, followed by the string $s_k$ for which this maximum multiplicity occurs (you may output any such value if multiple solutions exist).

Sample Dataset


Sample Output


Please login to solve this problem.