March 25, 2014

Computational biology (the application of coding, mathematical models, and large-scale data processing to biology) hasn’t turned into a huge buzz term the way that “big data” and the “Internet of things” have. But it will, very soon. The unheralded effort by hospitals, universities, pharmaceutical companies, governments, NGOs, and tech firms to marry biology to heavy-duty server power will change the way medicine works. In one of the most intriguing use cases to come out lately, two famous institutions are using big data to tackle a particularly difficult variety of brain cancer.

Last week, the New York Genome Center (NYGC) announced a new partnership with IBM. The Genome Center, a consortium of tech-forward area institutions like Memorial Sloan Kettering, Mount Sinai, Columbia University, and New York University, is using IBM Watson to find alternative treatments for glioblastoma, an aggressive form of brain cancer that kills 13,000 Americans annually. It’s a potentially heavyweight use for Watson, a $1 billion-plus investment for IBM, which has also been in the news for more fanciful pursuits like its time as a Jeopardy contestant and a customized recipe creator.

Given the size of its investment, an argument could be made that IBM is staking its future on Watson. That’s a theme we’ll return to later, but in the meantime, let’s look at how Watson is actually used to find cancer treatments.

Pathways: Work underway at the New York Genome Center will use IBM Watson cognitive technology to map genome sequencing results, retrieve insights from medical literature and drug information, and find possible treatment options physicians can recommend to their patients. In this image, a cancer mutation is shown on a cell protein pathway from genome sequencing. Infographic courtesy of IBM.

Between 20 and 25 glioblastoma patients at the Genome Center’s member institutions will be selected for the Watson study; the patients are chosen by the member hospitals themselves. Each patient’s tumors will be sequenced at the Genome Center on Illumina servers running algorithms in the IBM SoftLayer cloud.

This is where the computational biology part of things comes in. Biopsies are conducted on patients, and both normal and cancer cells are sequenced by the Genome Center’s servers. The sequencing normally takes 10-12 days because of the intricacy of the task; regular cells have to be sequenced approximately 30 times and cancer cells 30-50 times. In the slow-but-groundbreaking process, algorithms developed through years of public and private sector research create perfect representations of the patients’ cells in bits and bytes.

In the next stage, which takes a few weeks, the raw sequences for healthy and cancerous cells are extrapolated and put through heuristic algorithms to figure out what healthy and cancerous cells look like in each patient. This information is used to create variant call files, the raw data files the Genome Center’s software uses to store gene sequence variations. These files are what Watson uses to find novel cancer treatments.
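To make the healthy-versus-cancerous comparison concrete, here is a minimal Python sketch. It assumes the variant call files follow the common tab-separated VCF column layout (chromosome, position, ID, reference allele, alternate allele, and so on); the article does not say which format the Genome Center uses. The sample records, the `parse_vcf` and `tumor_only` helpers, and the plain set difference are illustrative assumptions, not the Genome Center's pipeline. Real somatic variant callers weigh read depth and allele frequencies statistically rather than simply subtracting one list from another.

```python
from typing import List, NamedTuple, Set


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf(text: str) -> Set[Variant]:
    """Read the first five columns of VCF-style text, skipping header lines."""
    variants: Set[Variant] = set()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or header/metadata line
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        # The ALT column may list several alleles separated by commas.
        for allele in alt.split(","):
            variants.add(Variant(chrom, int(pos), ref, allele))
    return variants


def tumor_only(normal: Set[Variant], tumor: Set[Variant]) -> List[Variant]:
    """Return variants seen in the tumor sample but not in the normal sample."""
    return sorted(tumor - normal)


# Made-up records for illustration only; not real patient data.
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
NORMAL_VCF = "\n".join([
    HEADER,
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr2\t2000\t.\tC\tT\t60\tPASS\t.",
])
TUMOR_VCF = "\n".join([
    HEADER,
    "chr1\t1000\t.\tA\tG\t55\tPASS\t.",
    "chr2\t2000\t.\tC\tT\t58\tPASS\t.",
    "chr7\t5500\t.\tG\tA\t70\tPASS\t.",
])

candidates = tumor_only(parse_vcf(NORMAL_VCF), parse_vcf(TUMOR_VCF))
print(candidates)
```

In this toy input, the two variants shared by both samples drop out and only the chr7 record remains as a candidate tumor-specific mutation.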

Steve Harvey, an IBM executive, told Fast Company in a phone conversation that each variant file can contain between 20,000 and 1 million potential mutations. Among them are a “driver mutation” that primarily fuels the cancer and “passenger mutations” that have much less effect. Watson combines findings from the Genome Center’s programs with automated queries of a massive medical text database to attempt to identify the driver mutation.
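The article does not describe how Watson actually weighs its evidence, so the following Python sketch is only a toy illustration of the general idea: take the genes that carry candidate mutations and order them by how often a literature index mentions them alongside the disease. The gene names, the hit counts, and the `rank_candidates` function are all hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def rank_candidates(
    mutated_genes: List[str],
    literature_hits: Dict[str, int],
) -> List[Tuple[str, int]]:
    """Order mutated genes by literature mentions, most-cited first."""
    # Genes missing from the index get a score of zero.
    scored = [(gene, literature_hits.get(gene, 0)) for gene in mutated_genes]
    # Sort by hit count (descending), then by name so ties are reproducible.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical inputs: placeholder gene names and invented hit counts.
mutated = ["GENE_C", "GENE_A", "GENE_B"]
hits = {"GENE_A": 12, "GENE_B": 340}

ranking = rank_candidates(mutated, hits)
print(ranking)
```

Here the gene with the most literature support rises to the top as the likeliest driver candidate, while genes with little or no support fall toward the passenger end of the list. A real system would need far richer evidence than a raw mention count.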