Using Electricity and CRISPR, Scientists Enable Inheritable Data Storage on Bacterial DNA
DNA is the blueprint that contains all the information a cell or organism needs to replicate, which makes it one of the most efficient ways of storing data. For example, the human genome, which codes for all the biomolecules the body needs for survival and reproduction, is over 3 billion base pairs long. Yet all of that genomic DNA fits in a volume in the microliter range.
DNA as a Data Storage Medium
In theory, if we can encode information in the letters A, T, G, and C that make up the genetic code, we can store hundreds of gigabytes of data in DNA that takes up less space than a speck of salt.
The idea of using DNA as a storage device has been around for a while. The basic principle is to convert the binary code of computer data into combinations of A, T, G, and C and to use genetic tools such as DNA synthesis to write that code into DNA. The code is usually broken into segments 200-300 bases long to reduce errors in the writing process, with each segment carrying a unique ID.
When the written DNA is sequenced, the unique IDs make it possible to piece the segments back together and reassemble the original data. However, de novo synthesis of DNA without a template is very expensive. Additionally, stored DNA degrades easily over time.
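As a rough illustration of the principle described above, the following Python sketch maps two bits to each base, cuts the resulting sequence into 200-base segments, and prefixes each segment with an index that serves as its unique ID. The bit-to-base table, the 8-base ID length, and all function names are assumptions made for this example, not a published encoding scheme. Error correction is left out entirely.

```python
# Hypothetical mapping of two bits per base (an assumption for illustration).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

ID_BASES = 8         # assumed ID length: 4**8 = 65,536 possible segments
PAYLOAD_BASES = 192  # assumed payload, giving 200-base segments in total


def bits_to_bases(bits: str) -> str:
    """Translate a string of 0s and 1s (even length) into A/C/G/T."""
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def bases_to_bits(bases: str) -> str:
    """Translate A/C/G/T back into a string of 0s and 1s."""
    return "".join(BASE_TO_BITS[base] for base in bases)


def bytes_to_bases(data: bytes) -> str:
    """Encode raw bytes as bases, four bases per byte."""
    return bits_to_bases("".join(f"{byte:08b}" for byte in data))


def bases_to_bytes(bases: str) -> bytes:
    """Decode bases back into raw bytes."""
    bits = bases_to_bits(bases)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def encode(data: bytes) -> list[str]:
    """Split the data into ID-prefixed segments of at most 200 bases."""
    bases = bytes_to_bases(data)
    segments = []
    for index, start in enumerate(range(0, len(bases), PAYLOAD_BASES)):
        if index >= 4 ** ID_BASES:
            raise ValueError("data too large for the assumed ID length")
        segment_id = bits_to_bases(f"{index:0{2 * ID_BASES}b}")
        segments.append(segment_id + bases[start:start + PAYLOAD_BASES])
    return segments


def decode(segments: list[str]) -> bytes:
    """Reorder segments by their ID and reassemble the original bytes."""
    ordered = sorted(segments, key=lambda s: int(bases_to_bits(s[:ID_BASES]), 2))
    return bases_to_bytes("".join(s[ID_BASES:] for s in ordered))
```

Because every segment carries its own ID, `decode` recovers the data even when the segments arrive in arbitrary order, which is the situation after sequencing a mixed pool of DNA.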
Currently, it costs about $3,500 to write 1 MB of data into DNA. The appeal of developing DNA as a storage device comes from the fact that DNA is at the core of all biological research, and as biomedical technology advances, DNA writers and readers are expected to become more affordable.
We already see this in the case of DNA readers, where incremental developments in DNA sequencing technology have brought the cost of sequencing a human genome down from a whopping $100 million in 2001 to less than $1,000 in 2020.
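To put these figures in perspective, here is a small back-of-the-envelope calculation in Python. It uses only the numbers quoted above and assumes 1 GB = 1,000 MB; the variable names are chosen for this example.

```python
# Figures quoted in the article.
WRITE_COST_PER_MB_USD = 3_500
SEQUENCING_COST_2001_USD = 100_000_000
SEQUENCING_COST_2020_USD = 1_000  # upper bound: "less than $1,000"

MB_PER_GB = 1_000  # decimal gigabyte (assumption)

# Cost of writing a single gigabyte at the current per-megabyte price.
cost_per_gb_usd = WRITE_COST_PER_MB_USD * MB_PER_GB

# Minimum factor by which sequencing costs fell between 2001 and 2020.
min_fold_reduction = SEQUENCING_COST_2001_USD // SEQUENCING_COST_2020_USD

print(f"Writing 1 GB at today's price: ${cost_per_gb_usd:,}")
print(f"Sequencing became at least {min_fold_reduction:,}x cheaper")
```

At the current price, writing a single gigabyte would cost $3.5 million, while reading costs have already fallen at least 100,000-fold.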
Writing on DNA Using CRISPR and Electricity
To address the problem of writing code into DNA, a team of researchers from Columbia University has developed a CRISPR-based method that uses electrical signals to write on the DNA of living bacteria. This also addresses the DNA degradation problem, since the code is kept intact as the bacterium divides and passes its DNA on to daughter cells.
The same team, led by Dr. Harris Wang, had previously used an inducible system to introduce and increase extrachromosomal plasmid gene expression in the presence of fructose. In the absence of the fructose signal, the bacteria incorporated the inserted plasmid DNA into their genome using CRISPR gene-editing technology. The inserted DNA could then be read by sequencing. The limitation of this system was the amount of data that could be incorporated into the genome.
As an improvement, the authors replaced the fructose-inducible system with an electronic input that could insert longer strings of code. With the engineered redox-responsive CRISPR adaptation system, they could induce increased gene expression in response to an increase in electric voltage, encoding digital data into the bacterial genome without the need to synthesize the DNA in vitro.
Using a 3-bit binary data stream, the investigators could input multiple combinations over three sequential rounds to be stored in the cell as CRISPR arrays, which are then incorporated into the bacterial DNA with unique IDs. These barcodes increase the data storage capacity, and with them the team was able to write a 72-bit message reading ‘Hello world!’ into E. coli cells.
The encoded message was stably passed on to replicating cells with over 90% accuracy across 80 generations, suggesting that this is a robust strategy to store, amplify, and recover data in living cells, eliminating the need for elaborate DNA storage systems. Bacterial cells are far more resistant to damage than raw DNA, as the researchers demonstrated by introducing the engineered microbes into soil and later recovering them to check for the encoded data.
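The arithmetic behind the 72-bit message can be sketched in Python. ‘Hello world!’ has 12 characters, so 72 bits corresponds to 6 bits per character, and 72 bits split into 3-bit words gives 24 words. The 64-symbol alphabet below, the assumption that each barcode holds exactly one 3-bit word, and the function names are all hypothetical; the article does not describe the authors' actual character table or barcode layout.

```python
import string

# Hypothetical 64-symbol alphabet so that each character fits in 6 bits.
# This is NOT the authors' table, only an assumption for illustration.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + " !"
assert len(ALPHABET) == 64

BITS_PER_CHAR = 6
BITS_PER_WORD = 3  # one 3-bit word per barcode (assumption)


def text_to_bits(text: str) -> str:
    """Encode text as a bit string using the 6-bit alphabet."""
    return "".join(f"{ALPHABET.index(ch):0{BITS_PER_CHAR}b}" for ch in text)


def bits_to_text(bits: str) -> str:
    """Decode a bit string back into text."""
    return "".join(
        ALPHABET[int(bits[i:i + BITS_PER_CHAR], 2)]
        for i in range(0, len(bits), BITS_PER_CHAR)
    )


def split_into_barcoded_words(bits: str) -> dict[int, str]:
    """Assign each 3-bit word to a numbered barcode."""
    return {
        barcode: bits[start:start + BITS_PER_WORD]
        for barcode, start in enumerate(range(0, len(bits), BITS_PER_WORD))
    }


def reassemble(words: dict[int, str]) -> str:
    """Concatenate the words in barcode order to recover the bit string."""
    return "".join(words[barcode] for barcode in sorted(words))


message_bits = text_to_bits("Hello world!")
barcoded_words = split_into_barcoded_words(message_bits)
recovered = bits_to_text(reassemble(barcoded_words))
```

Under these assumptions, the barcodes play the same role as segment IDs in synthesized DNA: they record where each small piece of data belongs in the full message.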
Future Implications
Data-encoded cells could be enriched in natural environments by supplementing the medium with antibiotics that the coded cells can resist, allowing them to grow. Naked DNA stored in similar conditions was degraded, while the data-encoded cells protected the information.
While there is a long way to go before DNA becomes the default mode of information storage, this study explores the possibility of a direct digital-to-biological data recorder. We still need to develop methods to write ever larger amounts of data into DNA with high levels of recovery, minimize the effects of mutations, and reassemble the data from its constituent fragments in a robust and cost-effective manner.
By Sahana Shankar, Ph.D. Candidate
References
- https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data
- https://www.nature.com/articles/s41589-020-00711-4
©www.geneonline.com All rights reserved.