Genome sequencing

Genome sequencing is the process of determining the order of nucleotides on every chromosome of an organism's genome.

Researchers cannot presently read the sequence of nucleotides across an entire genome directly, the way we would read a book; instead, they must apply more indirect methods. As a result, genome sequencing is typically divided into three steps. The first step requires chemically breaking (multiple copies of) the genome into a number of small pieces, called reads.

These minuscule reads are then fed through a sequencer, which applies advanced chemical methods to determine the order of bases on each read; each sequencer can only identify reads up to a certain length.
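The first two steps can be imitated in software to get a feel for what a sequencer's output looks like. The following is a minimal sketch, not a model of any real sequencing chemistry: the function name `sample_reads`, the toy genome string, and the fixed read length are all assumptions made for illustration, and the sketch ignores sequencing errors and the fact that real reads may come from either DNA strand.

```python
import random


def sample_reads(genome, read_length, num_reads, seed=0):
    """Return `num_reads` substrings of `genome`, each `read_length` long.

    Start positions are chosen uniformly at random, which mimics breaking
    many copies of the same genome at arbitrary points.
    """
    if read_length > len(genome):
        raise ValueError("read_length exceeds genome length")
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    last_start = len(genome) - read_length
    starts = [rng.randint(0, last_start) for _ in range(num_reads)]
    return [genome[s:s + read_length] for s in starts]


# Hypothetical toy genome; real genomes are millions to billions of bases long.
genome = "ATGCGTACGTTAGC"
reads = sample_reads(genome, read_length=5, num_reads=8)
print(reads)
```

Because the start positions are random, many reads are needed before every position of the genome is likely to be covered, which is one reason multiple copies of the genome are fragmented.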

Finally, researchers must use efficient algorithms to reassemble chromosomes from the sequenced reads in a process known as fragment assembly; the general principle of fragment assembly is to rely on overlaps between reads to reconstruct longer pieces.
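The overlap principle can be illustrated with a small greedy merging procedure. This is only a sketch under strong assumptions (error-free reads, a single strand, no repeated regions); the function names `overlap` and `greedy_assemble` are hypothetical, and greedy merging is one simple heuristic rather than the method used by production assemblers. It repeatedly merges the pair of reads with the longest suffix-prefix overlap.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads):
    """Merge reads greedily by largest overlap until none overlap.

    Returns the list of remaining sequences (ideally a single one).
    """
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no two sequences overlap; stop merging
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Four overlapping reads taken from the toy sequence ATGCGTACGTT.
toy_reads = ["ATGCG", "GCGTA", "GTACG", "ACGTT"]
print(greedy_assemble(toy_reads))  # → ['ATGCGTACGTT']
```

On real data this heuristic can go wrong: repeated regions create misleading overlaps, and sequencing errors prevent exact matches, which is why practical assemblers use more elaborate graph-based methods.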

The first genome to be sequenced was that of phage $\Phi \textrm{-X174}$, by Fred Sanger in 1977. Research progressed until the 3-billion-dollar project to sequence the human genome culminated in 2001 with the publication of a draft genome representing an average-case genome from 12 different individuals.

Over the last decade, genome sequencing has become a multibillion-dollar-per-year industry, as the cost of sequencing a complete genome has plummeted from the billions to the tens of thousands of dollars. Most researchers project that the 21st century will see cheap complete genome sequencing become a routine medical procedure, which would create an era of personalized medicine: drugs and treatments could be tailored specifically to a patient's precise genetic makeup. Accordingly, research into genome sequencing is still very competitive.