Authors: Adam Rutherford
Imagine that every sentence in this book were a single gene: the human genome would be a book forty times longer than this one, filled with random text but with my sentences distributed randomly throughout it. How would you identify which of them were the relevant sentences? Language is studded with punctuation to complement the letters and add composite meaning above the words themselves. Indeeditisprettydifficultforustoextractmeaningfroma sentencethathasnospacesorpunctuationinit.
Fortunately for science, and necessarily for the cell, DNA is no different. Before scientists could work out what the words and sentences of DNA mean, the first challenge was to work out where the spaces were (where a gene begins and ends), and this meant working out its punctuation.
By the 1960s, scientists knew that life was built of or by proteins, that proteins were built from amino acids, and that DNA was the hereditary matter that coded the proteins. The big gap was getting from one to the other, from DNA code to protein. By experimentally inserting a molecule in between two letters of functioning DNA, Crick and another future Nobel Prize laureate named Sydney Brenner intentionally disrupted its code and prevented a protein from being produced. These disruptions are known as frameshift mutations, like a film projector whose shutter speed is wrong, so you see half of one frame and half of another. By interrupting DNA with inserts equivalent to one, two, or four bases, the same effect was observed: a disrupted protein. But with an insertion of three, a protein product was still viable. Crick and Brenner deduced from this that the code of DNA works in sequences of three bases. This pattern is known as a reading frame: it places spaces in a length of DNA, exposing the meaningful triplets of letters.
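The logic of that deduction can be sketched in a few lines of Python. This is only an illustration, not a model of the actual experiment: the sequence is hypothetical, the inserted bases are arbitrary, and the insertion is assumed to fall on a codon boundary. It shows that an insertion of three bases leaves every downstream triplet intact, while one, two, or four shift the frame.

```python
def codons(seq):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def frame_preserved(seq, position, inserted):
    """Return True if the codons downstream of `position` survive the insertion.

    Assumes `position` is a codon boundary and at least one codon follows it.
    """
    mutated = seq[:position] + inserted + seq[position:]
    downstream = codons(seq[position:])
    return codons(mutated)[-len(downstream):] == downstream


original = "gcaccttgcgaaacgtta"  # hypothetical sequence, six codons long

for inserted in ("a", "aa", "aaa", "aaaa"):
    status = "intact" if frame_preserved(original, 6, inserted) else "scrambled"
    print(len(inserted), "base(s) inserted: downstream codons", status)
```

Only the three-base insertion reports the downstream codons as intact, which is the pattern that points to a triplet code.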
Starting from there, an American and a German scientist cracked the first coded message in DNA in 1961 using a process of elimination. Instead of trying to work out how a naturally occurring sequence of DNA translated into a protein, Marshall Nirenberg and J. Heinrich Matthaei strung together a length of genetic code that consisted only of the base thymine (the letter T). They inserted this into the mechanics of a working cell and provided it with a ready supply of amino acids from which a protein might be built. They knew that there are only twenty amino acids from which all life is built, so in each of twenty different test tubes containing all of them, a single different amino acid within the mixture was tagged with a radioactive label. This meant that if a protein that resulted from their doctored DNA buzzed with radiation, they would be able to identify which amino acid was coded by a triplet of thymine bases. When they extracted the proteins from their various test tubes, the result was nineteen duds and one that set the Geiger counters buzzing. They discovered that a genetic code consisting exclusively of the letter T resulted in a protein consisting exclusively of the amino acid phenylalanine.
And so the nature of the irrepressible code was known. Over the next few years, by varying the template, the unique triplets encoding the other nineteen amino acids were discovered, until by the end of the 1960s we had a complete readout of how DNA encodes proteins.
Here is a small section of DNA, part of a gene:
cctgggaccaacttcgcgaagcgggaagcccggcgg
Here is the same sequence broken into the triplet reading frame, as the cell reads it:
cct  ggg  acc  aac  ttc  gcg  aag  cgg  gaa  gcc  cgg  cgg
And here it is again with each amino acid (here written in their abbreviated form) alongside each codon:
cct  ggg  acc  aac  ttc  gcg  aag  cgg  gaa  gcc  cgg  cgg
Pro  Gly  Thr  Asn  Phe  Ala  Lys  Arg  Glu  Ala  Arg  Arg
That string of amino acids forms part of a protein.
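The same lookup can be written as a short Python sketch. The table below is deliberately partial: it holds only the ten distinct codons that occur in the example sequence, taken from the standard genetic code, and the function name is chosen here purely for illustration.

```python
# Partial codon table: only the codons that appear in the example sequence.
CODON_TABLE = {
    "cct": "Pro", "ggg": "Gly", "acc": "Thr", "aac": "Asn", "ttc": "Phe",
    "gcg": "Ala", "aag": "Lys", "cgg": "Arg", "gaa": "Glu", "gcc": "Ala",
}


def translate(dna):
    """Read the sequence three letters at a time and look up each codon."""
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    triplets = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE[codon] for codon in triplets]


sequence = "cctgggaccaacttcgcgaagcgggaagcccggcgg"
print(" ".join(translate(sequence)))
# Pro Gly Thr Asn Phe Ala Lys Arg Glu Ala Arg Arg
```

A full table would need all sixty-four possible triplets; a codon missing from this partial table raises a `KeyError` instead of being guessed.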
So how could scientists tell where a gene begins and ends? There is also punctuation in the language of DNA. In the continuous run of As, Ts, Gs, and Cs, the cell knows where a gene begins because without exception, they all start with the letters ATG, the so-called start codon, like a capital letter at the beginning of a sentence. Similarly, all genes end with a period, a stop codon, of which there are three: TGA, TAG, and TAA. A reading frame for an entire gene always begins with ATG and ends with one of these three stop codons.
Proteins are, then, long strings of amino acids as decreed by the DNA that encodes them. They perform their functions by folding up into three-dimensional shapes; the grooves, holes, clamps, and pockets in their folded shapes give them all manner of abilities.
Proteins also team up to gain new purposes. For example, hemoglobin, which carries oxygen around your body in red blood cells, is made up of four proteins, each carrying a single atom of iron. The astonishing properties of spider silk are the result of the sophisticated complex of different proteins of which it is made, some of them neatly folded, others overlapping to create a high-tensile strength comparable with that of steel. Some proteins are enzymes, which catalyze bodily reactions, the metabolism in cells that keeps us alive. Others are sensory, like the ones embedded in the rods and cones of your retina, so specialized that they can detect a single photon of light and trigger the process of vision. All these properties are the result of proteins folding, connecting, and interacting with others in highly precise ways. Diverse though the actions of proteins may be, the code underlying them is the same in all.
This is the bedrock of biology: the translation of code into action. But what is the process by which this translation takes place? If the genome is a sort of central office containing the plans to the factory, the plans never actually leave the office, so the relevant pages have to be photocopied. In other words, there is an intermediary between the DNA and the site of the manufacture of the protein, and this envoy comes in the form of DNA's cousin, RNA.
As previously explained, DNA is a helix twisted from two struts of a ladder, the rungs linking each together. But RNA is a single strand, the rungs exposed. When a particular protein is required, the double helix of DNA splits in two to expose the relevant gene. A single strand of RNA is laid down on top of the exposed gene, and each letter of that gene is copied into the RNA in mirror form. This envoy is called, appropriately enough, messenger RNA, for it carries the message from the genome to the site of protein construction.
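The "mirror form" copying can be sketched as a simple letter-for-letter substitution in Python. Two details here go beyond the text and should be treated as added assumptions: RNA uses the base uracil (U) where DNA would use thymine (T), and the sketch ignores the direction in which each strand is read. The function name and input are hypothetical.

```python
# Each DNA base on the exposed strand pairs with one RNA base.
# Assumption beyond the text: RNA carries U (uracil) in place of T.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Build the messenger RNA letter by letter from the exposed DNA strand."""
    return "".join(PAIRING[base] for base in template_strand.upper())


print(transcribe("TACCGC"))  # hypothetical stretch of template DNA
# AUGGCG
```

The output is the complement of the input, not a copy of it, which is why the message reads as a mirror of the strand it was laid down on.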
Later on we will see that RNA plays a far more central role in the origin of life than its mere messenger status suggests. However, we have now arrived at the first and biggest clue to the question posed by the phrase “few forms or into one.” The system described above is truly universal. There are no life-forms we know of that do not employ and entirely depend upon it: DNA, made of four letters, translates into proteins, made of twenty amino acids. It is known as the central dogma: DNA makes RNA makes protein. The fact that all known life is utterly dependent on this system makes it seem almost inconceivable that it is not related by a single, common origin. Certainly, working out how such a system might come about will be essential if we're to understand how life as we know it came to be.
The Right Hand of Life
If the shared code and tools of life are not enough to point unequivocally to a single origin of life, here's an oddity of biology that emphatically seals the deal. Hold your hands up with your palms toward you. They are reflections of each other. If you slide one in front of the other, palms still facing toward you, one cannot obscure the other as your thumbs stick out. This, of course, is the basis of the glove industry: a left glove won't fit on a right hand. And so it is with the molecules of life. In the same way that there are mirror forms of hand, so, too, are there mirror forms of certain molecules.
An atom is made up of a central nucleus, which is positively charged, surrounded by negatively charged electrons. These electrons form the bonds that hold individual atoms together in molecules and, like the poles on a magnet, these electrons repel one another. So they attempt to maximize the space between them. When atoms bond with other atoms to make a molecule, they will space those bonds as equally apart as they can, and so tend to adopt, wherever possible, symmetrical forms. An atom that has three available bonds, such as nitrogen, will tend to form a triangular molecule, with a bond at each of its corners. Carbon dioxide, meanwhile, is a straight molecule: the carbon atom has four available bonds, oxygen has two, and so the carbon atom is flanked by two oxygen atoms, each occupying a double bond. But if an atom naturally seeks to make four bonds, as carbon does, the farthest spacing of bonds is not in two dimensions but in three: a pyramid with a triangular base. The carbon atom sits in the center of this shape, with the four points equally spaced from one another. That's fine if all of the atoms on the corners are the same, or even if two or three of them are different from one another. But as soon as all four corners are occupied by a different atom, handedness becomes a possibility. In other words, exactly the same atoms in the same conformation can be arranged as mirror forms. In chemistry, this is called chirality, from the ancient Greek word meaning “hand.”
We sometimes describe life on Earth as being carbon based, by which we mean that all DNA and proteins are built from frames featuring carbon atoms. And carbon, as we have seen, can form “handed” molecules, mirror images of one another, so it might be reasonable to suppose that we would find a mixture of right- and left-handed carbon-based molecules in life. What is remarkable is that, wherever life is concerned, the amino acids that make up proteins are left-handed.
It was the killer of that persistently unfruitful idea of spontaneous generation, Louis Pasteur, who first identified chemical handedness. One of the ways we can spot the chiral nature of a molecule is by simply shining light through it.
Imagine tying a Slinky to a wall and then waving it in all directions. The waves would be up and down as well as side to side. If you then fed the Slinky through a vertical slit and did the same, only vertical waves would pass through. Certain molecules have exactly this filtering effect on light. They can polarize light into a single plane despite its natural tendency to wave in all directions. Louis Pasteur noticed that shining a beam through a solution of a simple molecule called tartaric acid that had been purified from wine finings caused the light to be polarized.
Yet he also noticed that tartaric acid that had been made in the lab did not have this property, despite being chemically identical. The reason is this: the molecules in tartaric acid are based on carbon, and when the molecule is synthesized in the lab, the left- and right-handed versions are made in equal measure, with the random chance of a coin flip. This means that any polarizing effect of one version is canceled out by the opposite action of the mirror molecule. But the tartaric acid found in finings was naturally synthesized in yeast cells as part of the wine-making process. As a result, its molecules are all right-handed. These partisan molecules would have allowed only one plane of light through.
This minor diversion into the realms of wine, chemical bonds, and spatial thinking is absolutely essential, because for reasons we don't fully understand, proteins only use left-handed amino acids. It is a mystery why all twenty amino acids that all proteins are made of are left-handed. Around four-fifths of humans are right-handed, so the equivalent with people would be as if southpaws simply did not exist.
Pasteur's revelation that the nonbiological production of handed molecules resulted in left- and right-handed versions in equal measure foreshadowed one of the most dreadful medical tragedies in history. The drug thalidomide is a mild but effective tranquilizer and painkiller. Its efficacy across a range of conditions from insomnia to headaches earned it the status of a “wonder drug.” But it was also an effective treatment for preventing vomiting, and so was offered to pregnant women suffering from morning sickness. Between 1957, when it was introduced, and 1962, when it was globally withdrawn, more than ten thousand babies were born with severe birth defects including stunted limbs and other abnormalities as a result of being exposed to thalidomide in the womb. Thalidomide is a chiral molecule, and it was determined that, of the two mirror versions, only one had the mutating effect. As in Pasteur's tartaric acid, its production in pharmaceutical factories did not account for the two mirror versions, and the drug was sold with both hands in equal measure. We now know that either version of thalidomide has the confounding ability to flip to its reflection once in the body, so even in its harmless version, it would potentially be harmful to the developing embryo. It's still prescribed in some countries as an effective treatment for leprosy and other illnesses. But thalidomide's use by pregnant women is necessarily and strictly forbidden.
Handedness is a biological phenomenon, with proteins being almost exclusively on the left. The origin of this preference is not known. It might be as simple as a chance event, but one that stuck. The uniformity points toward the development of a system with a single point of origin. If life had evolved twice, the likelihood would be that we would see both left- and right-handed proteins.
DNA has a similar but opposite steadfast bias: it is always right-handed. Point the index finger on your right hand and draw an imaginary clockwise circle in the air. At the same time move your hand away from your body. That is the turn of the double helix in its most common form, like a typical wood screw.
Even though that mirror-image double helix could perfectly well exist, it just doesn't. Life, yet again betraying its unique origin, only uses right-handed DNA.