PC Magazine wrote:
Researchers Turn DNA Into Digital Storage Medium
By Damon Poeter
The next great digital storage medium may be us—or our DNA, to be precise.
Deoxyribonucleic acid stores the code that makes us humans and not, say, flatworms. In other words, DNA is a remarkably evolved storage medium that can pack all the variety and complexity of organic life into a small amount of biological matter.
But turning DNA into storage for digital rather than biological information is tough, because it has proven difficult to write data to DNA efficiently and reliably by artificial means, say researchers at the EMBL-European Bioinformatics Institute (EMBL-EBI).
Scientists have already figured out how to read the data stored in long-dormant DNA, but now those researchers say they've worked out a way to write it that overcomes earlier hurdles.
In the latest issue of Nature, EMBL-EBI researchers Nick Goldman and Ewan Birney explain that their breakthrough could make it possible to "store at least 100 million hours of high-definition video in about a cup of DNA."
"We already know that DNA is a robust way to store information because we can extract it from wooly mammoth bones, which date back tens of thousands of years, and make sense of it. It's also incredibly small, dense and does not need any power for storage, so shipping and keeping it is easy," Goldman said in a statement.
The scientists cited two main challenges to storing digital data as DNA: until now it has only been possible to create short strings of DNA, and runs of repeated DNA letters in a string make it difficult to both write and read.
Goldman and Birney said they enlisted bio-analytics instrument maker Agilent Technologies, a Hewlett-Packard spin-off, to synthesize DNA from encoded digital information. In this case, the files were an MP3 of Martin Luther King's "I Have a Dream" speech, a .txt file of Shakespeare's sonnets, a .pdf file containing James Watson and Francis Crick's original paper describing the structure of DNA, and a final file describing the encoding itself.
"We knew we needed to make a code using only short strings of DNA, and to do it in such a way that creating a run of the same letter would be impossible," Goldman explained.
"So we figured, let's break up the code into lots of overlapping fragments going in both directions, with indexing information showing where each fragment belongs in the overall code, and make a coding scheme that doesn't allow repeats. That way, you would have to have the same error on four different fragments for it to fail—and that would be very rare."
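The article does not show the researchers' code, but the idea Goldman describes can be roughly sketched in Python. Everything here is an illustrative assumption rather than the published scheme: the function names, the fixed six base-3 digits per byte, the toy fragment length of 20 with a step of 5, keeping each fragment's index as a plain integer instead of writing it into the DNA, and the majority vote used to reassemble. The key trick is that each base-3 digit selects one of the three letters that differ from the previous letter, so a run of the same letter cannot occur.

```python
from collections import Counter

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits (3**6 = 729 covers 0..255)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits

def trits_to_bytes(trits: list[int]) -> bytes:
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)

def trits_to_dna(trits: list[int], prev: str = "A") -> str:
    """Each digit picks one of the three letters that differ from the last one."""
    letters = []
    for t in trits:
        prev = [b for b in BASES if b != prev][t]
        letters.append(prev)
    return "".join(letters)

def dna_to_trits(dna: str, prev: str = "A") -> list[int]:
    trits = []
    for base in dna:
        trits.append([b for b in BASES if b != prev].index(base))
        prev = base
    return trits

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def fragment(dna: str, length: int = 20, step: int = 5) -> list[tuple[int, str]]:
    """Cut overlapping pieces; odd-numbered pieces are stored reverse-complemented."""
    pieces, index, start = [], 0, 0
    while True:
        piece = dna[start:start + length]
        pieces.append((index, reverse_complement(piece) if index % 2 else piece))
        if start + length >= len(dna):
            return pieces
        index, start = index + 1, start + step

def reassemble(pieces: list[tuple[int, str]], total_len: int, step: int = 5) -> str:
    """Put pieces back by index and take a majority vote at every position."""
    votes = [Counter() for _ in range(total_len)]
    for index, piece in pieces:
        if index % 2:
            piece = reverse_complement(piece)
        for offset, base in enumerate(piece):
            votes[index * step + offset][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)

message = b"I have a dream"
dna = trits_to_dna(bytes_to_trits(message))
fragments = fragment(dna, length=20, step=5)
restored = trits_to_bytes(dna_to_trits(reassemble(fragments, len(dna), step=5)))
print(dna[:24], len(fragments), restored == message)
```

With the step set to a quarter of the fragment length, every letter away from the two ends of the sequence appears in four fragments, which is the redundancy Goldman refers to: a single misread fragment is outvoted by the other three.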
The result, according to Agilent's Emily Leproust, who helped synthesize the data into DNA, was "hundreds of thousands of pieces of DNA" that looked "like a tiny piece of dust." Agilent sent the synthesized sample back to the researchers at EMBL-EBI, where they sequenced it and said they decoded the files without errors.
"We've created a code that's error tolerant using a molecular form we know will last in the right conditions for 10,000 years, or possibly longer. As long as someone knows what the code is, you will be able to read it back if you have a machine that can read DNA," Goldman said.
So can we toss out our hard disk drives and SSDs? Probably not too soon, but the researchers said they believed working out several practical matters could result in a "commercially viable DNA storage model."