Life on Earth originated billions of years ago. Over that time, the simplest of organisms have evolved into complex ones. EVOLUTION is a phenomenon that underpins the continuity and survival of organisms. Another important factor in the continuity of a species is the genetic material present in every organism, irrespective of its kind. Humans came into existence comparatively recently: modern humans appeared only a few hundred thousand years ago. Since then, they have not only survived in the wild but also built settlements out of nothing. These settlements grew into urbanized structures, establishing the world we now know.
In human cells, the genetic material is DNA – DEOXYRIBONUCLEIC ACID. It is enclosed in the NUCLEUS of the cell, where it is packaged into CHROMOSOMES. DNA carries a code written in NITROGENOUS BASES – ADENINE (A), GUANINE (G), THYMINE (T), and CYTOSINE (C). Human DNA consists of about 3 billion nitrogenous bases, and more than 99% of these bases are the same in all people. The ‘instructions’ for building and maintaining an organism are contained within the DNA in the form of the sequence of the four bases.
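To get a feel for how much raw information such a sequence can hold, here is a rough back-of-the-envelope sketch. It assumes the idealized case in which each of the four bases carries log2(4) = 2 bits, and it uses the approximate figure of 3 billion bases quoted above; the function name `genome_megabytes` is purely illustrative.

```python
import math

BASES = "ACGT"                   # the four nitrogenous bases
GENOME_LENGTH = 3_000_000_000    # roughly 3 billion bases (approximate figure from the text)

# With four equally likely symbols, one base can carry at most log2(4) = 2 bits.
bits_per_base = math.log2(len(BASES))


def genome_megabytes(n_bases: int) -> float:
    """Idealized information content of a sequence of n_bases, in megabytes."""
    total_bits = n_bases * bits_per_base
    return total_bits / 8 / 1_000_000


print(f"{genome_megabytes(GENOME_LENGTH):.0f} MB")
```

Under these assumptions, one copy of the human sequence comes to a few hundred megabytes, all packed inside a cell nucleus.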

Organisms carry a great deal of biological information, which determines their morphology and anatomy as well as the processes that take place in their bodies. The information behind each characteristic of the organism is held in a single kind of molecule: DNA. Hard disks available on the market have a very small storage capacity for their size in comparison to the DNA present in an organism’s body.
The amount of digital data in the world is estimated at three ZETTABYTES (3000 billion billion bytes), and archivists face a real challenge in storing this information as well as the new influx. More data was created in the past few years than in all preceding history. This torrent of information may soon outstrip the ability of hard drives to capture it. One solution is to use DNA, a compact and robust molecule, as a storage medium.
DNA, an ORGANIC molecule, is remarkably stable and, under the right conditions, can last longer than any storage medium in use today. A recent study concluded that all of the world’s data could be stored in approximately ONE KILOGRAM OF DNA. While different institutes report different values for the storage capacity of a single gram of DNA, the widely cited figure is about 215 PETABYTES (215 million gigabytes).
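As a quick sanity check on the scale involved, the sketch below simply divides the two figures quoted in this article (three zettabytes of data, 215 petabytes per gram). It is only an order-of-magnitude estimate: the result depends entirely on which data total and which density one assumes, so it will not match every published claim.

```python
ZETTABYTE = 10**21   # bytes
PETABYTE = 10**15    # bytes

world_data_bytes = 3 * ZETTABYTE          # estimate quoted in this article
density_bytes_per_gram = 215 * PETABYTE   # figure quoted in this article

# Mass of DNA needed if every gram really held 215 petabytes.
grams_needed = world_data_bytes / density_bytes_per_gram

print(f"{grams_needed / 1000:.1f} kg")
```

With these particular inputs the answer is on the order of kilograms, which is the point of the comparison: the world's data would fit in something a person could carry, not in warehouses full of drives.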

Harvard University geneticists and colleagues encoded a 52,000-word book in thousands of snippets of DNA, using strands of DNA’s four-letter alphabet (A, G, T, and C) to encode the 0s and 1s of a digitized file. Their encoding scheme was rather inefficient, however, and could store only about 1.52 petabytes per gram of DNA. Other approaches have done better, but none has been able to store more than half of what researchers think DNA can handle, about 1.8 BITS of data per nucleotide of DNA.
One of the data-storage experiments that has come close to the 215-petabyte capacity claim encoded five files in DNA, including 154 of Shakespeare’s sonnets, a snippet of Martin Luther King’s ‘I Have a Dream’ speech and, fittingly, a copy of Watson and Crick’s paper describing the structure of DNA.
How was information stored in DNA?
A group devised an algorithm that maps BINARY CODE (0s and 1s) onto the GENETIC CODE (A, T, G, and C). Using this mapping as a cheat sheet, the algorithm converts a file’s binary sequence into a sequence of bases. The document must be broken into many pieces, because each DNA fragment can hold only about two hundred bases. Each piece must also carry a record of its position, so that the pieces can be put back in the exact order when they are read.
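The idea can be illustrated with a minimal sketch. The mapping below (two bits per base) and the fragment layout (a 16-base index followed by up to 184 bases of data, 200 bases in total) are assumptions made for this example, not the scheme any particular research group used. Real schemes also add error correction and avoid long runs of the same base, which this sketch ignores.

```python
# Hypothetical cheat sheet: every pair of bits maps to one base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def bytes_to_bases(data: bytes) -> str:
    """Turn raw bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def encode(data: bytes, payload_bases: int = 184, index_bytes: int = 4) -> list[str]:
    """Split the encoded file into fragments, each prefixed with its position."""
    sequence = bytes_to_bases(data)
    fragments = []
    for number, start in enumerate(range(0, len(sequence), payload_bases)):
        # The fragment number itself is written in bases (4 bytes -> 16 bases).
        index = bytes_to_bases(number.to_bytes(index_bytes, "big"))
        fragments.append(index + sequence[start:start + payload_bases])
    return fragments


fragments = encode(b"To be, or not to be, that is the question." * 3)
for fragment in fragments:
    print(len(fragment), fragment[:24] + "...")
```

Because every fragment carries its own index, the fragments can be synthesized, stored, and read in any order and still be reassembled correctly.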

This coded information can be fed into DNA synthesis machines, which transform it into physical matter, much like a printer laying down ink on paper. The product is a faint speck of dust-like material, which contains many DNA copies of the encoded files. The procedure is expensive and tedious. Although it has been hypothesized that the world’s information could be stored in a kilogram of DNA, at today’s synthesis costs there are not enough funds in the world to accomplish this mammoth task.
How was information read from DNA?
A leap in technology came with devices that read the information in DNA automatically, without manual decoding. Such sequencing machines have been in use since the HUMAN GENOME PROJECT. A more recent example is the MinION, used for reading the sequences of RNA and DNA. Each consumable flow cell can generate 5 to 10 gigabytes of DNA sequence data. Hundreds of kilobases can be read in a short time.
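Once a sequencer has returned the base sequences, turning them back into a file is the encoding step in reverse. The sketch below is an assumption-laden illustration, not the software that ships with any sequencer: it assumes a two-bits-per-base mapping and reads that begin with a 16-base position index, and it assumes the reads are error-free.

```python
# Hypothetical mapping, the inverse of a two-bits-per-base cheat sheet.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def bases_to_bytes(sequence: str) -> bytes:
    """Turn a string of bases back into raw bytes, two bits per base."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def decode(reads: list[str], index_bases: int = 16) -> bytes:
    """Reassemble a file from reads that arrive in any order, with duplicates."""
    payloads = {}
    for read in reads:
        # The first index_bases bases hold the fragment's position in the file.
        position = int.from_bytes(bases_to_bytes(read[:index_bases]), "big")
        payloads[position] = read[index_bases:]   # duplicate copies collapse here
    ordered = "".join(payloads[position] for position in sorted(payloads))
    return bases_to_bytes(ordered)


# Two fragments, delivered out of order and with one duplicate read.
first = "A" * 16 + "CAGA"           # position 0
second = "A" * 15 + "C" + "CGGC"    # position 1
print(decode([second, first, second]))
```

Sequencing returns many copies of each fragment in no particular order, which is why the sketch keys the fragments by position before joining them.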

Research and experiments continue with the aim of making DNA an efficient storage medium. These experiments require a lot of money, and with many companies and universities joining forces to study the storage capacity of DNA, these ventures should secure the funding, equipment, and expertise required. It may therefore be safe to assume that a future in which storing data in DNA molecules is feasible is not far off.