Most everyone has heard of the A’s, G’s, C’s, and T’s of DNA. These four letters form the alphabet for the instructions for all life on the planet.
Now a group of scientists at Scripps in San Diego has taken the first steps toward adding two more letters, d5SICS and dNaM, to this universal genetic code. There is no catchy single-letter code for these unnatural bases yet, though. Maybe S and N?
The big deal here isn’t that they have found some unnatural new bases they can add to DNA. These have been around for a decade or so. No, what makes this astonishing is that a bacterium didn’t much mind their being there.
With a tweak that allowed the bacteria to take up the new bases, the researchers found that the bacteria happily copied the DNA containing these bases and passed them on to the next generation. And they did a pretty good job of it too. Despite billions of years of optimizing everything for the original four letters, the bacteria shrugged off the new ones and just kept going.
Well, maybe shrugged off is a bit strong. The bacteria ran into problems if there were too many new letters in a row. But still, the mind boggles at the flexibility of the cellular machinery.
The next step will be to get the cell to read these new letters. Right now, they are copied but not understood. It's akin to a medieval monk carefully copying Arabic text he doesn’t understand. Teaching a cell to read them will not be easy, but it is doable.
There is complicated machinery in a cell that allows it to read what's written in DNA. Some of this will have to be redesigned so new words can be added to the cell’s dictionary, but this is a technical, not a theoretical, hurdle. With enough tinkering, it will get done.
So once we have finally achieved E.T.’s biology (does anyone else remember he had six bases in his DNA?), the next question is whether it is worth it. Is this the best approach to rejiggering the genetic code? And can these types of changes improve on the current code?
New Words vs. New Definitions
All life on Earth uses the same genetic code for its instructions. It's a very simple language: 64 three-letter words built from a four-letter alphabet. And it isn’t even as complicated as that makes it sound.
The 64 words of the language have only about 21 definitions: the 20 standard amino acids plus a stop signal. What this means is that a lot of the words have the exact same definition. For example, TAG, TAA, and TGA all mean the same thing: stop.
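To make the "64 words, about 21 definitions" point concrete, here is a small Python sketch that builds the standard genetic code and counts its words and meanings. The one-letter amino acid codes and the T, C, A, G ordering are the usual textbook convention rather than anything from the work discussed here, and the names CODON_TABLE and codons_for are made up for this illustration.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order.
# Each character is a one-letter amino acid code; '*' means "stop".
BASES = "TCAG"
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every three-letter word with its meaning.
CODON_TABLE = {
    "".join(letters): meaning
    for letters, meaning in zip(product(BASES, repeat=3), MEANINGS)
}

def codons_for(meaning):
    """Return every three-letter word that carries the given meaning."""
    return sorted(codon for codon, m in CODON_TABLE.items() if m == meaning)

print(len(CODON_TABLE))                # 64 words
print(len(set(CODON_TABLE.values())))  # 21 distinct meanings
print(codons_for("*"))                 # ['TAA', 'TAG', 'TGA']
```

The last line shows the redundancy the article describes: three different words all say "stop."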
The Scripps work aims to expand the code's vocabulary by adding new letters. All the words will still be three letters long (that is way too hard to change), but with six letters instead of four we could have up to 216 words (6 x 6 x 6) instead of the usual 64 (4 x 4 x 4).
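The arithmetic behind 64 and 216 is easy to check by brute force. The sketch below simply lists every possible three-letter word for each alphabet; S and N are only this article's informal nicknames for d5SICS and dNaM, not official codes, and count_words is a name invented for the example.

```python
from itertools import product

def count_words(alphabet, word_length=3):
    """Count every possible word of the given length over the alphabet."""
    return sum(1 for _ in product(alphabet, repeat=word_length))

natural = "ACGT"
expanded = natural + "SN"  # S and N: informal stand-ins for d5SICS and dNaM

print(count_words(natural))   # 64
print(count_words(expanded))  # 216
```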
Of course, since life isn’t getting its full bang for the buck with the 64 words it already has, we may not need such a huge expansion of life’s dictionary. Maybe a better approach is to simply give some of the old words new definitions.
As an example, maybe all the TAG’s could be changed into TAA’s and then TAG could be given a new definition. Now we can add something new without changing the alphabet. George Church’s lab is already doing this sort of thing in bacteria and is making real progress.
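As a toy illustration of the "new definition" idea, the Python sketch below swaps every TAG word in a single gene for TAA. It is only a sketch under simple assumptions: the gene is a made-up sequence, recode_codon is a hypothetical helper, and real genome-wide recoding involves far more than a text replacement. The one subtlety it does capture is that words are read three letters at a time, so a TAG that straddles two words must be left alone.

```python
def recode_codon(coding_sequence, old="TAG", new="TAA"):
    """Replace in-frame occurrences of one three-letter word with another."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Split the sequence into three-letter words, starting at the first letter.
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    return "".join(new if codon == old else codon for codon in codons)

# Made-up gene: ATG CTA GCA TAG. The letters T-A-G also appear across the
# boundary between CTA and GCA, but that is not a word and stays untouched.
gene = "ATGCTAGCATAG"
print(recode_codon(gene))  # ATGCTAGCATAA
```

Once no gene uses TAG any more, that word is free to be given a new meaning.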
So it is an open question which is the better approach. If your goal is to genetically engineer some protein with never-before-seen parts, the new letter approach might be easier. Since life doesn’t already have that word, you don’t need to change it in all of the bacterium’s genes. You would just need to do it in the piece of DNA you are working on.
But if you want to make a larger scale change, then it might be better to change the meaning of an old word. With enough changes, this approach gives the added bonus of being resistant to the viruses that use that old, natural genetic code.
And of course the new letters aren’t just for forming words. They make fundamentally new DNA that scientists can use to detect viral infections, to make molecules that better speed up reactions, to serve as drugs, and probably to do lots and lots of other cool things.
All of these applications will be incredibly useful but probably aren’t the only driving force behind these experiments. No, the real reasons go much deeper.
One of these is undoubtedly a thirst to understand in detail how the current genetic code works. In the process of creating a new language, we will need to completely understand how the old one works.
Along the way we may even be able to improve on what Mother Nature has managed to cobble together with billions of years of evolution. After all, while DNA is a marvel, it is far from perfect.
Good vs. Good Enough
DNA is a great little molecule. It is very stable, which is important for storing information for the long haul. And because of its double-stranded structure, it is very easily copied and passed on from generation to generation.
The key to this last point is base pairing. Every A pairs up with a T and vice versa. The same thing is true for G and C. This arrangement makes it easy to separate the two strands and make a copy by matching up these letters. (This was a key finding from Watson and Crick’s original work.)
This is why these researchers had to add two new letters. The S and the N pair up with each other like the G and the C or the A and the T do. But the unnatural bases pair up in a different way.
Natural bases use a fairly weak force called hydrogen bonds to line up with one another. Turns out this may not be ideal since water pretty easily disrupts these bonds and cells are filled with water.
The new bases use hydrophobic forces which are actually strengthened in the presence of water. This should stabilize the pairing of these bases.
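Whatever force holds a pair together, the copying rule is the same: each letter has exactly one partner. Here is a minimal Python sketch of that rule with the new pair added. It assumes the article's informal S and N nicknames (note that N here is dNaM, not the "any base" code used in sequencing software), and it says nothing about the chemistry itself.

```python
# Each letter's partner on the opposite strand. S and N are this article's
# informal nicknames for d5SICS and dNaM, not official codes.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "S": "N", "N": "S"}

def complement(strand):
    """Build the partner strand by matching each letter to its pair."""
    return "".join(PAIRS[base] for base in strand)

print(complement("GATSCA"))  # CTANGT
```

Applying the rule twice gets you back the original strand, which is what makes copying by matching work.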
Once scientists swap out a lot of a cell’s natural bases for unnatural ones, then maybe we can learn if the weak hydrogen bonds were actually a good idea or if they were the first thing that worked well all those eons ago and life has simply stuck with it. And this won’t be the only thing we might be able to improve upon.
DNA has a few other properties that at first blush look like they could be improved upon. Scientists may be able to intelligently redesign life so that it has a sturdier and more reliable genetic code. Or it may be that none of our tweaks improves anything much at all. We’ll have to wait and see.