Packed inside every cell in your body
is a set of genetic instructions,
3.2 billion base pairs long.
Deciphering these directions
would be a monumental task
but could offer unprecedented insight
into the human body.
In 1990, a consortium
of 20 international research centers
embarked on the world's largest
biological collaboration
to accomplish this mission.
The Human Genome Project proposed
to sequence the entire human genome
over 15 years
with $3 billion of public funds.
Then, seven years
before its scheduled completion,
a private company called Celera announced
that it could accomplish the same goal
in just three years
and at a fraction of the cost.
The two camps discussed a joint venture,
but talks quickly fell apart
as disagreements arose over legal
and ethical issues of genetic property.
And so the race began.
Though both teams used the same technology
to sequence the entire human genome,
it was their strategies
that made all the difference.
Their paths diverged
in the most critical of steps:
the first one.
In the Human Genome Project's approach,
the genome was first divided into smaller,
more manageable chunks
about 150,000 base pairs long
that overlapped each other
a little bit on both ends.
Each of these fragments of DNA
was inserted into a bacterial
artificial chromosome,
where it was cloned and fingerprinted.
The fingerprints showed scientists
where the fragments overlapped
without their having to know the actual sequence.
Using the overlapping bits as a guide,
the researchers marked
each fragment's place in the genome
to create a contiguous map,
a process that took about six years.
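
The mapping step just described can be sketched in a few lines of Python. This is only a toy illustration, not the consortium's actual software: it assumes each clone's fingerprint is a list of restriction-fragment sizes, that two clones overlap when they share at least `min_shared` sizes, and that the clones form one simple chain. The clone names, the sizes, and the threshold are all invented for the example.

```python
from itertools import combinations


def shared_bands(fp_a, fp_b):
    """Count the fragment sizes that two fingerprints have in common."""
    return len(set(fp_a) & set(fp_b))


def order_clones(fingerprints, min_shared=2):
    """Order clones into a chain using fingerprint overlap alone.

    Assumes the clones form a single simple path (no branches, no gaps).
    """
    # Link every pair of clones that shares enough bands.
    neighbours = {name: set() for name in fingerprints}
    for a, b in combinations(fingerprints, 2):
        if shared_bands(fingerprints[a], fingerprints[b]) >= min_shared:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # A chain end is a clone with exactly one overlapping neighbour.
    ends = sorted(n for n, adj in neighbours.items() if len(adj) == 1)
    if not ends:
        raise ValueError("clones do not form a simple chain")

    # Walk from one end to the other, never revisiting a clone.
    path, visited = [ends[0]], {ends[0]}
    while True:
        candidates = sorted(neighbours[path[-1]] - visited)
        if not candidates:
            break
        path.append(candidates[0])
        visited.add(candidates[0])
    return path


# Hypothetical fingerprints, listed out of order on purpose.
fingerprints = {
    "clone_C": [18, 22, 9, 14],
    "clone_A": [12, 7, 30, 5],
    "clone_D": [9, 14, 27, 11],
    "clone_B": [30, 5, 18, 22],
}
clone_order = order_clones(fingerprints)
print(clone_order)
```

Note that no DNA letters appear anywhere in the sketch: the order comes from the fingerprints alone, which is the point of the mapping phase.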
The cloned fragments were sequenced
in labs around the world
following one of the project's
two major principles:
that collaboration on our shared heritage
was open to all nations.
In each case, the fragments
were arbitrarily broken up
into small, overlapping pieces
about 1,000 base pairs long.
Then, using a technology
called the Sanger method,
each piece was sequenced letter by letter.
This rigorous map-based approach,
called hierarchical shotgun sequencing,
minimized the risk of misassembly,
a huge hazard of sequencing genomes
with many repetitive portions,
like the human genome.
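
Why repeats are such a hazard can be shown with a small Python sketch. It rests on idealised assumptions that are not in the narration: reads are error-free, every possible read of a given length is collected, and the "genomes" are short invented strings built around one repeated stretch. Under those assumptions, two different genomes yield exactly the same reads whenever the reads are too short to span the repeat.

```python
from collections import Counter


def reads_of(genome, read_len):
    """Collect every substring of length read_len, with multiplicity."""
    return Counter(
        genome[i:i + read_len] for i in range(len(genome) - read_len + 1)
    )


REPEAT = "GATTACA"  # invented 7-letter repeat

# Same building blocks, but the two middle segments are swapped.
genome_1 = "AC" + REPEAT + "TTG" + REPEAT + "CCA" + REPEAT + "GT"
genome_2 = "AC" + REPEAT + "CCA" + REPEAT + "TTG" + REPEAT + "GT"

# Reads shorter than the repeat cannot tell the two genomes apart.
short_reads_identical = reads_of(genome_1, 6) == reads_of(genome_2, 6)

# Reads long enough to span the repeat and both flanks can.
long_reads_identical = reads_of(genome_1, 9) == reads_of(genome_2, 9)

print(short_reads_identical, long_reads_identical)
```

A map of where each large fragment belongs narrows the problem to one small region at a time, which is how the map-based approach reduces this kind of ambiguity.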
The consortium's
"better safe than sorry" approach
contrasted starkly with Celera's strategy
called whole genome shotgun sequencing.
It hinged on skipping
the mapping phase entirely,
a faster, though foolhardy, approach
according to some.
The entire genome was directly chopped up
into a giant heap
of small, overlapping bits.
Once these bits were sequenced
via the Sanger method,
Celera would take the formidable risk
of reconstructing the genome
using just the overlaps.
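
Reconstructing a sequence from overlaps alone can be sketched as a greedy merge in Python. This is a textbook toy, not Celera's assembler: it assumes error-free reads, ignores the reverse strand, and repeatedly joins the pair of reads with the longest suffix-to-prefix overlap. The reads and the `min_len` threshold are invented for the example.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing left to join; leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)
        ] + [merged]
    return contigs


# Hypothetical reads from a 15-letter sequence, in scrambled order.
reads = ["TACGTTA", "ATGGCGT", "GTTAGC", "GCGTACG"]
assembled = greedy_assemble(reads)
print(assembled)
```

With no map to fall back on, a greedy merge like this one has nothing to stop it from joining reads through a repeated stretch in the wrong order, which is the risk the narration describes.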
But perhaps their decision
wasn't such a gamble
because guess whose freshly completed map
was available online for free?
The Human Genome Consortium's,
in accordance with
the project's second major principle
which held that all of the project's data
would be shared publicly
within 24 hours of collection.
So in 1998, scientists around the world
were furiously sequencing
lines of genetic code
using the tried and true, yet laborious,
Sanger method.
Finally, after three exhausting years
of continuous sequencing and assembling,
the verdict was in.
In February 2001, both groups
simultaneously published
working drafts of more than 90%
of the human genome,
several years ahead
of the consortium's schedule.
The race ended in a tie.
The Human Genome Project's practice
of immediately sharing its data
was an unusual one.
It is more typical for scientists
to closely guard their data
until they are able to analyze it
and publish their conclusions.
Instead, the Human Genome Project
accelerated the pace of research
and created an international
collaboration on an unprecedented scale.
Since then, robust investment in both
the public and private sector
has led to the identification
of many disease-related genes
and remarkable advances
in sequencing technology.
Today, a person's genome can be sequenced
in just a few days.
However, reading the genome
is only the first step.
We're a long way from understanding
what most of our genes do
and how they are controlled.
Those are some of the challenges
for the next generation
of ambitious research initiatives.