Your genome, the strands of DNA in the cells of your body, contains more than 3 billion base pairs, chemical bonds that act like letters of code. Amazingly, new technologies can take a drop of blood or saliva and read someone's whole genome in about a day, for as little as $1,000 (down from close to $47,000 six years ago). Unsurprisingly, they don't get everything right. No one knows for sure how far off the results are, or how much the accuracy varies from one test and company to another.
That's one of the main challenges facing those who are at the forefront of a new wave of health care, called precision medicine, which depends on such whole-genome sequencing for everything from treating heart disease to developing patient-specific cancer treatments. The challenges of genetic testing accuracy have prompted the U.S. Food and Drug Administration (FDA) to get involved. It's not out to bust companies for poor practices or false claims (at least, not yet) but rather to get them and other experts working together to improve the science, through a program called precisionFDA.
Dr. Taha Kass-Hout, FDA's chief health informatics officer, compares the agency's new collaborative approach to Wikipedia's. "If you have a Wikipedia article, we're all interested in updating it and making sure it's always current and always great," he says. "And that's really what we've done with precisionFDA."
PrecisionFDA provides a cloud-computing platform where companies and other researchers can test and refine the software that reads the raw results of gene-sequencing machines and assembles them into a transcript of someone's genetic code. Wiki-style, participants can also share any of their data and experiences with others, in order to advance everyone's understanding.
The program has gained about 1,000 participants internationally since it was introduced in December 2015. That includes individuals, institutions like the American Heart Association and Emory University, pharma companies like Roche, genetics-testing companies like Illumina and 23andMe, and Silicon Valley firms like Intel. Government bodies such as the National Institutes of Health (NIH) are also involved.
Close to 6.5 feet long if stretched out, an entire genome can't be analyzed in one piece. Instead, it's chopped into little fragments. Reading technologies take advantage of a basic property of DNA: each of the four possible chemical bases (the letters of code) that join the two strands of the DNA molecule binds to only one of the others, forming what look like the steps of a ladder. Some reading methods, for example, add color-tagged bases to the DNA. By seeing where certain colors appear, analyzing machines can tell which bases in the DNA the tagged ones have glommed onto. Sequencing machines read hundreds or even thousands of copies of each fragment, to keep erroneous readings out of the raw readout of every letter in each of the DNA fragments (a data file called FASTQ).
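To make the two ideas in that description concrete, here is a minimal Python sketch of base pairing and of why reading many copies of a fragment helps. It is a toy illustration, not how any real sequencing machine works: the function names `complement` and `consensus` are hypothetical, and the simple per-position majority vote is an assumption standing in for the far more sophisticated error handling in actual instruments.

```python
from collections import Counter
from typing import List

# Each base binds to exactly one partner: A with T, and C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the bases that would pair with each base of the strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def consensus(reads: List[str]) -> str:
    """Toy majority vote: pick the most common base at each position.

    Assumes every read covers the same fragment and has the same length.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must all be the same length")
    # zip(*reads) walks the reads column by column, one position at a time.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Four hypothetical readings of one fragment; the third has a single error.
reads = ["GATTACA", "GATTACA", "GATCACA", "GATTACA"]
print(complement("GATTACA"))  # CTAATGT
print(consensus(reads))       # GATTACA
```

With four readings, the single erroneous "C" is outvoted by the three correct "T" readings at that position, which is the intuition behind reading each fragment many times.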
All of that is actually the easy part, thanks to the latest technologies.
The harder part now—the one the FDA is focusing on—is taking that readout of fragments and putting the data back together in the right order to produce a custom map of someone's DNA. "A lot of that is art. A lot of that is also science," says Kass-Hout. Various private firms and research institutions have developed their own algorithms, and the only thing known for sure is that none is perfect. They may all be close, some may be way off, or some may be good at identifying certain parts of the genome but not others.
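For a rough feel of what "putting the data back together" means, the sketch below stitches fragments together wherever the end of one matches the start of another. This is a textbook greedy overlap approach chosen only for illustration; it is not the method used by any precisionFDA participant, the names `overlap` and `greedy_assemble` are hypothetical, and production software generally relies on much more elaborate techniques, often aligning reads against a reference genome rather than assembling from scratch.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b.

    Overlaps shorter than min_len are ignored and reported as 0.
    """
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left that overlaps; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_size:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three hypothetical fragments, given out of order.
fragments = ["TACCTGA", "ATGGCGT", "GCGTACC"]
print(greedy_assemble(fragments))  # ['ATGGCGTACCTGA']
```

Even this toy version hints at why results can differ between algorithms: choices such as the minimum overlap length or how ties are broken change which pieces get joined, and repetitive stretches of DNA can make several joins look equally plausible.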
To find out, the FDA is sponsoring a friendly competition, running until April 25. It will provide participants with the raw data readout from an extremely well-studied genome with the unremarkable name NA12878, provided by a public-private group called the Genome in a Bottle Consortium. "It was characterized by different sequencing technologies, so there was a lot of knowledge about that sample," says Živana Težak, associate director at the FDA's Office of In Vitro Diagnostics.
Participants will try their hand at the NA12878 data file, running their algorithms on it to see how close they come to the expected sequence. They will also run the same raw file several times, to see how consistent their software is. It may turn out that some software works well at reading certain parts of a genome but not others. "If some tests are inaccurate in some parts, then you want to know that," says Težak. "You want to know that you maybe cannot diagnose a certain disease using certain algorithms or software."
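The three checks described here (closeness to the expected sequence, consistency across repeated runs, and strength in some regions but not others) can be sketched in a few lines of Python. Everything below is an illustrative assumption rather than the challenge's actual scoring: the function names, the fixed-size regions, and the position-by-position comparison are all hypothetical, and real evaluations likely compare lists of detected variants and must cope with inserted or deleted letters.

```python
def accuracy(called: str, expected: str) -> float:
    """Fraction of positions where the called base matches the expected one."""
    if len(called) != len(expected):
        raise ValueError("sequences must be the same length")
    if not expected:
        raise ValueError("expected sequence is empty")
    matches = sum(c == e for c, e in zip(called, expected))
    return matches / len(expected)


def is_reproducible(runs) -> bool:
    """True if every run on the same raw file produced an identical result."""
    return len(set(runs)) == 1


def accuracy_by_region(called: str, expected: str, region_size: int):
    """Accuracy computed separately for consecutive fixed-size regions."""
    if len(called) != len(expected):
        raise ValueError("sequences must be the same length")
    return [
        accuracy(called[start:start + region_size],
                 expected[start:start + region_size])
        for start in range(0, len(expected), region_size)
    ]


# Hypothetical stand-ins for the reference answer and one tool's output.
expected = "ACGTACGTAC"
called = "ACGTACGTTT"

print(accuracy(called, expected))               # 0.8
print(accuracy_by_region(called, expected, 5))  # [1.0, 0.6]
print(is_reproducible([called, called, called]))  # True
```

In this made-up case the tool is perfect on the first region and noticeably worse on the second, the kind of uneven performance that an overall score of 80 percent would hide.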
Such limitations are important for new cancer treatments that aim to direct the body's own immune system against tumors. For that to work, doctors need to discover the chemical makeup of the tumor cells by sequencing the mutations that created them. Genetically bespoke treatment is a major component of the Obama administration's Cancer Moonshot initiative to encourage research and collaboration around developing cancer treatments and cures.
Discovering mutations unique to a single patient is a big step forward from looking for a well-known mutation that causes a hereditary disease such as breast cancer or a specific type of blindness. "Cancer, like the Moonshot, might be a really hard problem to address as opposed to something like hereditary disease," says Kass-Hout.
The FDA will publish the names of companies and organizations that do best in the challenge, but precisionFDA isn't just a place to hold contests. And participants that don't do well or that pull out early won't be in trouble with regulators. Rather, the FDA is encouraging participants to keep at it with additional challenges. This will help each individual player get better, but the FDA is also encouraging them to share the lessons learned, so that everyone does a better job at the highly precise rapid gene sequencing that medicine will increasingly depend on.
Companies may be wary of sharing their secret sauce, but they also have some incentives to do just that. Smaller firms that cannot afford their own supercomputing resources might sign on for the free use of the precisionFDA cloud computing services. Silicon Valley firms that have ideas for better data crunching but don't know genetics well would get a chance to collaborate with biotech companies. And all firms might decide that it's worth giving up some of what they know in order to find out what everyone else knows.
Companies also get an early say in the process that will ultimately lead to regulations in their industry. "This is the first [time] where we're utilizing this technique to bring people together to inform regulatory science," says Elaine Johanson, deputy director of FDA's Office of Health Informatics.
It's also a chance for the FDA to get a jump on new technologies, rather than scrambling to react after they are developed. "All those [participants] have far deeper expertise in this area and have worked on it for decades," says Kass-Hout. "Why not bring them all together [and] provide them with the ingredients that they already work with for benchmarking or advancing the science?"
UPDATE: This article has been updated to clarify the titles of Taha Kass-Hout and Živana Težak.