Suggested problems

Chromosome walking (Old-school sequencing)

Oct. 10, 2012, 5:39 a.m. by Mike Rayko

Biological Motivation

Before the era of next-generation sequencing, the most common way to determine a nucleotide sequence was the Sanger method. Sanger-based capillary sequencers were the workhorses of Celera Genomics and the Human Genome Project. Today the technique is still widely used for tasks that concern individual genes, where whole-genome analyzers would be impractical or too expensive and time-consuming.

Usually every DNA fragment is sequenced in two opposite directions to increase the reading accuracy. As output, each run gives us two DNA sequences: one read from 5' to 3' on the plus strand ("sequence from the forward primer"), and the other read from 5' to 3' on the opposite strand ("sequence from the reverse primer"). So we take the second one and do a familiar thing: build its reverse complement. Now we have two strings in which the end of the first overlaps the beginning of the second, and we just have to concatenate them correctly!

Unfortunately, we can obtain only relatively short DNA fragments (300-1000 nucleotides long) this way. To read larger fragments, a method called chromosome walking was developed. The idea is to use the end of the strand that has just been sequenced as a primer for the next part of the long DNA sequence. For the record, the human genome is 3*10^9 bp long. It was a really long walk.
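To make the reverse-complement step concrete, here is a minimal Python sketch. The names `reverse_complement`, `fragment`, `forward_read` and `reverse_read` are illustrative only, the toy fragment is made up and far shorter than a real read, and the code assumes upper-case strings containing just A, C, G and T.

```python
# Translation table mapping each base to its complement
# (assumption: input contains only upper-case A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Hypothetical toy fragment standing in for a real 300-1000 nt read.
fragment = "ATGGCGTACGTTAGC"

# The forward read covers the start of the plus strand.
forward_read = fragment[:10]
# The reverse read covers the end of the fragment, but is reported
# 5' to 3' on the opposite strand.
reverse_read = reverse_complement(fragment[5:])

# Flipping the reverse read brings it back onto the plus strand,
# so its beginning now overlaps the end of the forward read.
flipped = reverse_complement(reverse_read)
print(forward_read)
print(flipped)
```

Once both reads are on the same strand, the remaining work is to find where the end of the first matches the beginning of the second.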

Problem

The results of sequencing two neighbouring fragments (A and B) in both directions (forward and reverse) have been obtained. Restore the original sequence (A+B).

Hint: Use the reverse complement, and don't forget the inevitable overlap.

Given: Two pairs of DNA strings (Afow, Arev, Bfow, Brev) 400-700 bp long

Return: The restored AB sequence
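One possible way to approach the problem is sketched below in Python. It is not an official solution and rests on assumptions the statement does not spell out: the reads are error-free, each forward read overlaps the flipped reverse read exactly, fragments A and B overlap each other as well, and the longest exact suffix/prefix match is the true overlap (on repetitive sequences this may be wrong). The names `merge_by_overlap`, `restore` and the `min_overlap` parameter are hypothetical, and the toy input is made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def merge_by_overlap(left: str, right: str, min_overlap: int = 1) -> str:
    """Join two strings on the longest suffix of `left` that is a prefix of `right`.

    Assumption: the overlap is exact. Real reads contain errors and would
    need an alignment-based comparison instead.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap of the required length found")


def restore(a_fow: str, a_rev: str, b_fow: str, b_rev: str,
            min_overlap: int = 1) -> str:
    """Rebuild fragment A, then fragment B, then join them into AB."""
    a = merge_by_overlap(a_fow, reverse_complement(a_rev), min_overlap)
    b = merge_by_overlap(b_fow, reverse_complement(b_rev), min_overlap)
    return merge_by_overlap(a, b, min_overlap)


# Hypothetical toy data, much shorter than the 400-700 bp reads in the task.
sequence = "ATGGCGTACGTTAGCCTAGGATCCA"
frag_a, frag_b = sequence[:15], sequence[10:]
reads = (frag_a[:10], reverse_complement(frag_a[5:]),
         frag_b[:10], reverse_complement(frag_b[5:]))
print(restore(*reads) == sequence)
```

For real reads, raising `min_overlap` (for example to a few dozen bases) reduces the chance that a short accidental match is taken for the overlap.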