February 25, 2016

Modern medicine is incredibly data-intensive, especially now that doing a full genetic sequencing of every patient is becoming affordable and more prevalent. New efforts to develop tailored treatments for illnesses–known as precision medicine–will require collecting and sharing data on a hard-to-comprehend scale. That’s the biggest takeaway from the White House Precision Medicine Initiative (PMI) Summit.

It was big data meets genetics: the live-streamed presentations and panel discussions (one with President Obama) featured the data-science abbreviation API (application programming interface) as much as DNA. Aside from the inspirational speeches (and some self-congratulatory exchanges), the summit included a huge list of initiatives by dozens of government bodies, tech and biotech companies, and nonprofits to advance tailored medicine. The projects are dominated by data collection, storage, and, most importantly, sharing. Getting patients access to their own health records (currently a Kafkaesque bureaucratic process) is also one of the goals, as is making it easier for patients to donate their data to research.

The undertaking, which is connected with the government’s Cancer Moonshot, will have to overcome massive logistical barriers, as well as the challenges of big egos and bureaucratic inertia.

First, there’s the scale: The underlying concept of precision medicine is to refine the understanding of someone’s illness based on their specific genetic makeup and other personalized medical data. One person’s kidney cancer, for instance, may not be like that of another. In fact, researchers have already discovered 16 genetic variants of kidney cancer that, until recently, would have all been treated the same. “I couldn’t practice medicine really without what we now call precision medicine,” said Dr. Marston Linehan, chief of urologic oncology at the National Cancer Institute (NCI). “It helps us decide what operation to do, whether to do an operation or not, what drug to give.”

Knowing the exact mutations that cause a particular cancer, neurodegenerative disease, digestive ailment, or other illness promises to make it easier to pick the best medication or develop new ones targeted to an individual’s condition. But that requires a lot of data. Sequencing the genomes of just the people who will be diagnosed with cancer this year (about 1.65 million) will amount to four exabytes—four billion gigabytes—of data, or 400,000 times all the information in the Library of Congress, said Eric Dishman, general manager of Health and Life Sciences at Intel, and a cancer survivor. “This is one of the biggest of the big-data challenges that we’re ever going to have to solve to be able to share this data,” he said.
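For a rough sense of how those figures fit together, here is a back-of-the-envelope check in Python. It is only a sketch: it assumes decimal (SI) units, and the per-patient volume and the implied size of the Library of Congress are derived from the numbers Dishman quoted, not figures he stated himself.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: decimal (SI) units, i.e. 1 exabyte = 10**18 bytes.
EXABYTE = 10**18
TERABYTE = 10**12
GIGABYTE = 10**9

total_bytes = 4 * EXABYTE   # "four exabytes"
patients = 1_650_000        # "about 1.65 million" diagnoses this year
loc_multiple = 400_000      # "400,000 times" the Library of Congress

# Four exabytes expressed in gigabytes ("four billion gigabytes").
total_gigabytes = total_bytes // GIGABYTE

# Derived, not stated in the article: average data volume per patient.
per_patient_tb = total_bytes / patients / TERABYTE

# Derived, not stated in the article: the Library of Congress size
# that the "400,000 times" comparison implies.
implied_loc_tb = total_bytes / loc_multiple / TERABYTE

print(f"{total_gigabytes:,} GB in total")
print(f"about {per_patient_tb:.1f} TB per patient")
print(f"implied Library of Congress size: {implied_loc_tb:.0f} TB")
```

Under these assumptions the quoted numbers are consistent with one another: four exabytes is four billion gigabytes, which works out to a little over two terabytes per patient.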

The federal government is already gearing up to collect info at that scale. Today the National Institutes of Health (NIH) announced the PMI Cohort Program, which will enroll at least one million people for a longitudinal study—one that tracks people’s health over many years—in order to learn about a variety of diseases. Vanderbilt University and Verily, Google’s big-data health spin-off, are being tapped to pilot the project, which aims to recruit its first 79,000 participants by the end of the year.

This is in addition to the Million Veteran Program run by the Department of Veterans Affairs, with help from the Department of Defense (both big recipients of new proposed cancer research funding, too). The program has already signed up about 455,000 service personnel (vets and active duty) who have volunteered to share their medical data for research, including 400,000 genes.