Date: 2012-02-24

Innovation in DNA sequencing methods has advanced so rapidly since the end of the Human Genome Project in 2003 that, last month, Illumina Inc. and Life Technologies Inc. (the leading companies in the field) each announced new products that can sequence a genome for $1,000 in a single day. That’s 3 million times cheaper than it was during the Human Genome Project.

The announcements from these companies were followed by healthy gains in their stock prices. Yet the economics point the other way: the more productive you make these machines, the fewer machines customers need. The more productive you make the reagents that drive these machines, the fewer times customers reorder. The only guarantee in this story is that the customer will happily enjoy plummeting prices while sequencing companies quietly compete themselves out of business.

The industry has heralded the promise of the genome for years, but if the bellwether companies are commoditizing themselves into oblivion, where is the value?

The human genome is big: just about 3 billion of what are called bases. When companies sequence a genome, they try to read all those bases at least 30 times, if not 45, or even 100. That’s to be sure they’ve read each base enough times to correct for errors in sequencing. Once you have this raw data, you need to identify the places where this genome is different from a so-called “normal” genome. That’s a massive “big data” problem that combines parallel cloud computing and storage with biology. And when you’re done with that, you’re left with somewhere around 3 million genetic “abnormalities” per genome: places where you see variation as compared to the canonical human genome that was sequenced in the 1990s.
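To make the idea of reading each base many times concrete, here is a toy sketch in Python. It is an illustration under loud assumptions, not how any sequencing company's software works: the "genome" is 1,000 random bases, every read covers the whole genome and is already lined up with the reference, errors are simple random substitutions at a 1% rate, and all function names (`simulate_reads`, `consensus`, `call_variants`) are hypothetical. Real pipelines work with short reads that must first be aligned, and use statistical models rather than a plain majority vote.

```python
import random
from collections import Counter

BASES = "ACGT"

def simulate_reads(genome, coverage, error_rate, rng):
    """Return `coverage` noisy full-length copies of `genome` (toy model)."""
    reads = []
    for _ in range(coverage):
        read = []
        for base in genome:
            if rng.random() < error_rate:
                # Substitute a different base to mimic a sequencing error.
                read.append(rng.choice([b for b in BASES if b != base]))
            else:
                read.append(base)
        reads.append("".join(read))
    return reads

def consensus(reads):
    """Majority vote at each position across all reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

def call_variants(reference, sample_consensus):
    """List (position, reference_base, sample_base) wherever the two differ."""
    return [
        (i, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference, sample_consensus))
        if ref != alt
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
reference = "".join(rng.choice(BASES) for _ in range(1000))

# Build a sample genome that differs from the reference at three known positions.
sample = list(reference)
for pos in (100, 500, 900):
    sample[pos] = rng.choice([b for b in BASES if b != reference[pos]])
sample = "".join(sample)

# Read every base 30 times, vote, then compare against the reference.
reads = simulate_reads(sample, coverage=30, error_rate=0.01, rng=rng)
calls = call_variants(reference, consensus(reads))
print(calls)

# Scale of the real problem: 3 billion bases read 30 times each.
print(3_000_000_000 * 30, "bases of raw data at 30x coverage")
```

Individual reads contain errors, but a mistake would have to recur in a large share of the 30 reads at one position to survive the vote, so the three planted differences are what remain. At full scale the same comparison runs over 90 billion raw bases, which is why the article calls it a big-data problem.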

So which of those aberrations matter? How does one go from 3 million genetic variants to a diagnosis that drives a single choice about patient treatment, and do so accurately, reliably, and reproducibly every time?

You may have heard that this problem, dubbed “genome interpretation,” is the bottleneck in genome sequencing technology: We can sequence the genes, but that doesn’t mean we know what they mean. The data sets are so large and unwieldy that the pace of computer technology can’t keep up with the pace of sequencing. The $1,000 genome requires a “$1 million interpretation.”

But that was last year’s problem. Enabled by a new breed of cloud-based, big-data software companies built for clinical diagnostics, molecular labs across the country are using whole and partial genome sequencing to automate and operationalize real diagnostics. Physicians now have access to accurate diagnoses twice as fast and at one-twentieth the cost. All of these changes are driven by advances from software companies, whose innovation has, in the last 18 months, begun to outpace that of the sequencing companies.