Computational biology—the application of coding, mathematical models, and large-scale data processing to biology—hasn't turned into a huge buzz term the way that "big data" and the "Internet of things" have. But it will, very soon. The unheralded effort by hospitals, universities, pharmaceutical companies, governments, NGOs, and tech firms to marry biology to heavy-duty server power will change the way medicine works. In one of the most intriguing use cases to come out lately, two famous institutions are using big data to tackle a particularly difficult variety of brain cancer.
Last week, the New York Genome Center (NYGC) announced a new partnership with IBM. The Genome Center, a consortium of tech-forward area institutions like Memorial Sloan Kettering, Mount Sinai, Columbia University, and New York University, is using IBM Watson to find alternative treatments for glioblastoma, an aggressive form of brain cancer that kills 13,000 Americans annually. It's a potentially heavyweight use for Watson, a $1 billion-plus investment for IBM, which has also been in the news for more fanciful pursuits like its time as a Jeopardy contestant and a customized recipe creator.
Given the size of its investment, an argument could be made that IBM is staking its future on Watson. That's a theme we'll return to later, but in the meantime, let's look at how Watson is actually used to find cancer treatments.
Between 20 and 25 glioblastoma patients at the Genome Center's member institutions will be selected for the Watson study; the patients are chosen by the member hospitals themselves. Each patient's tumors will be sequenced at the Genome Center on Illumina servers running algorithms in the IBM SoftLayer cloud.
This is where the computational biology part of things comes in. Biopsies are conducted on patients, and both normal and cancer cells are sequenced by the Genome Center's servers. The sequencing normally takes 10-12 days because of the intricacy of the task; regular cells have to be sequenced approximately 30 times and cancer cells 30-50 times. In the slow-but-groundbreaking process, algorithms developed through years of public and private sector research create perfect representations of the patients' cells in bits and bytes.
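To get a feel for what "sequenced approximately 30 times" implies, here is a rough back-of-the-envelope sketch of sequencing depth. It is not the Genome Center's pipeline. The genome size (about 3.1 billion bases) and the 150-base read length are assumptions chosen for illustration, and `reads_needed` is a hypothetical helper name.

```python
# Back-of-the-envelope sequencing depth arithmetic (illustrative only).
# Depth ("30 times") means each base is covered by about 30 reads on average,
# so total bases read = depth * genome size.

GENOME_SIZE_BASES = 3_100_000_000  # assumption: approximate human genome size
READ_LENGTH_BASES = 150            # assumption: a typical short-read length


def reads_needed(depth, genome_size=GENOME_SIZE_BASES,
                 read_length=READ_LENGTH_BASES):
    """Return the number of reads needed to reach the given average depth."""
    total_bases = depth * genome_size
    return -(-total_bases // read_length)  # ceiling division on integers


if __name__ == "__main__":
    for label, depth in [("normal cells, 30x", 30), ("cancer cells, 50x", 50)]:
        print(f"{label}: about {reads_needed(depth):,} reads")
```

Under these assumptions, a single patient's normal and tumor samples together call for well over a billion short reads, which helps explain why the step is measured in days.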
In the next stage, which takes a few weeks, the raw sequences for healthy and cancerous cells are extrapolated and put through heuristic algorithms to figure out what healthy and cancerous cells look like in each patient. This information is used to create variant call files—raw info files used by the Genome Center's software to store gene sequence variations. These files are what Watson uses to find novel cancer treatments.
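The idea of comparing a patient's healthy and cancerous cells can be sketched in a few lines. This is a toy illustration, not the Genome Center's software: real somatic variant callers use statistical models rather than a plain set difference. The sample data is invented, and `parse_variants` and `tumor_only` are hypothetical helper names. The only thing taken from the real world is the standard leading column order of a variant call (VCF) file: chromosome, position, ID, reference allele, alternate allele.

```python
# Toy illustration only: find variants present in the tumor sample but absent
# from the patient's normal sample. All data below is made up.


def parse_variants(vcf_text):
    """Return a set of (chrom, pos, ref, alt) tuples from VCF-style text."""
    variants = set()
    for line in vcf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # ALT may list several alleles
            variants.add((chrom, int(pos), ref, allele))
    return variants


def tumor_only(tumor_vcf, normal_vcf):
    """Return sorted variants seen in the tumor but not in the normal sample."""
    return sorted(parse_variants(tumor_vcf) - parse_variants(normal_vcf))


# Invented example data: one inherited variant shared by both samples,
# plus one variant that appears only in the tumor.
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n"
NORMAL = HEADER + "chr1\t100\t.\tA\tG\n"
TUMOR = HEADER + "chr1\t100\t.\tA\tG\nchr7\t5500\t.\tC\tT\n"

if __name__ == "__main__":
    for variant in tumor_only(TUMOR, NORMAL):
        print(variant)
```

The variants that survive this subtraction are the candidates that matter for the cancer; the shared ones are simply part of the patient's inherited genome.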
Steve Harvey, an IBM executive, told Fast Company in a phone conversation that each variant file can contain between 20,000 and 1 million potential mutations. Among them are a "driver mutation" that primarily fuels the cancer and "passenger mutations" that have much less effect. Watson combines findings from the Genome Center's programs with automated queries of a massive medical text database to attempt to identify the driver mutation.
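How Watson actually weighs the evidence is not described here, so the following is only a loose sketch of the general idea of checking candidate mutations against a body of text. It ranks candidate genes by how many documents mention them. The gene names (`GENE_A` and so on), the three "abstracts," and the function `rank_by_mentions` are all hypothetical placeholders, and simple mention counting is a stand-in technique, not IBM's method.

```python
import re

# Loose sketch only: rank candidate genes by how many documents mention them.
# Gene names and abstracts are invented placeholders.


def rank_by_mentions(candidate_genes, abstracts):
    """Return (gene, document_count) pairs, most-mentioned gene first."""
    # Tokenize each document once into a set of lowercase words.
    tokenized = [set(re.findall(r"[a-z0-9_]+", text.lower())) for text in abstracts]
    counts = {
        gene: sum(1 for words in tokenized if gene.lower() in words)
        for gene in candidate_genes
    }
    # Sort by descending count, breaking ties alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


CANDIDATES = ["GENE_A", "GENE_B", "GENE_C"]
ABSTRACTS = [
    "Overactive GENE_B was observed in the tumor samples.",
    "A compound targeting GENE_B reduced growth; GENE_A was unaffected.",
    "No association was found for the remaining markers.",
]

if __name__ == "__main__":
    for gene, count in rank_by_mentions(CANDIDATES, ABSTRACTS):
        print(f"{gene}: mentioned in {count} document(s)")
```

A real system would need to read for meaning rather than count words, but the sketch shows why a large text corpus helps narrow up to a million candidate mutations down to a handful worth a clinician's attention.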
"There are a couple of famous cases in the personalized medicine corpus where cancer shows up in the body, but researchers are finding today that the same mutation shows up in different parts of the body," Harvey said. "In certain cases, when they do whole genome sequencing, it leads to the use of a drug to treat the mutation that wasn't necessarily intended for the cancer a person has. In one case, a drug designed to treat kidney cancer successfully treated an overactive gene in a patient with leukemia—something we didn't know before."
Using the medical text corpus, Watson then proposes medications, including drugs that made it through various stages of the FDA process but never arrived on the market, that could potentially be used to treat the patient's cancer. Boards at the hospitals treating the patients then take Watson's recommendations under consideration and evaluate their feasibility for fighting the individual patient's glioblastoma.
Watson is just one of many software platforms being used to tackle extremely thorny health issues through computational biology. One NYGC member institution, Mount Sinai's Icahn School of Medicine, hired Cloudera cofounder and early Facebook data scientist Jeff Hammerbacher to oversee the hospital's big data efforts. Using software from companies like pharma/sports/intelligence outfit Ayasdi, Mount Sinai runs extremely complicated algorithmic exercises on data sets like a patient's diabetes history. In one project Fast Company's Co.Exist gained access to, biomedical informatics director Joel Dudley visualized the data of more than 300,000 patients to help pinpoint specific genes which might be influencing the course of one individual's Type 2 diabetes.
Then there are the many smaller companies building servers and software platforms designed for pure science efforts that could someday lead to the creation of better drugs and treatments. One of them, Silicon Valley-based Bina Technologies, manufactures integrated software/hardware appliances designed for gene sequencing. CEO Narges Bani Asadi told Fast Company that platforms such as her company's product drastically decrease the time it takes for researchers to decode cells; in one example, genetics researchers at Stanford University were able to shrink a 12-day genome decoding process into just two days. This does more than allow research to progress faster; it also saves organizations like Stanford and the NYGC money that can then be reinvested in more research.
Because computational biology is a relatively new field, it is wide open for both smaller and larger companies to turn themselves into market leaders. For larger companies like IBM that are widely believed to have declining market share and cash-flow problems, industries with well-funded customers like the National Institutes of Health, Stanford University, and the New York Genome Center are a godsend. Big Blue's hope is that Watson—and products which leverage Watson—will become a commonplace part of the computational biology arsenal, and lock customers into IBM's ecosystem with lucrative multi-year contracts. The company has already partnered with WellPoint to make a series of Watson-based tools.
[Image: Flickr user r.nial.bradshaw]