Date: 2024-06-28
By Gary Smith and Jeffrey Funk

The scientific revolution has increased our understanding of the world immensely and improved our lives immeasurably. Now, many argue that science as we know it could be rendered passé by artificial intelligence. Way back in 2008, in an article titled, “The End of Theory: The data deluge makes the scientific method obsolete,” Chris Anderson, the then-editor-in-chief of Wired magazine, argued that,

Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

Since then the chorus has gotten louder. In 2023, for example, Eric Schmidt, a former Google CEO, wrote that,

AI can rewrite the scientific process. We can build a future where AI-powered tools will both save us from mindless and time-consuming labor and also lead us to creative inventions and discoveries, encouraging breakthroughs that would otherwise take decades.

Today, AI is being increasingly integrated into scientific discovery to accelerate research, helping scientists generate hypotheses, design experiments, gather and interpret large datasets, and write papers.

But the reality is that science and AI have little in common and AI is unlikely to make science obsolete. The core of science is theoretical models that anyone can use to make reliable descriptions and predictions.

Thus Paul Samuelson wrote that science is public knowledge, reproducible knowledge. When Robert Adams wrote an MIT thesis on the accuracy of different forecasting methods, he found that ‘being Sumner Slichter’ was apparently one of the best methods known at that time. This was a scientific fact, but a sad scientific fact. For Slichter could not and did not pass on his art to an assistant or to a new generation of economists. It died with him, if indeed it did not slightly predecease him. What we hope to get by scientific breakthrough is a way of substituting for men of genius men of talent and even just run-of-the-mill men. That is the sense in which science is public, reproducible knowledge.
The core of AI, in contrast, is, as Anderson noted, data mining, that is, ransacking large databases for statistical patterns: “correlation is enough.” If anything, public knowledge is viewed as hindering an unfettered search for statistical patterns.

However, without a causal explanation, we don’t know whether a discovered pattern is a meaningful reflection of an underlying causal relationship or meaningless serendipity. Tests with fresh data can expose a pattern as coincidental, but there is an essentially unlimited number of patterns that can be discovered, and many coincidental patterns and spurious correlations will survive repeated testing and retesting. For example, if we calculate the pairwise correlations among one million variables, each of which is nothing more than randomly generated numbers, we can expect nearly 8,000 correlations to be statistically significant in the initial tests and through five rounds of retesting. In practice, there are far more than one million variables and algorithms are not restricted to pairwise correlations. In addition, there are often not enough data for the multiple rounds of retesting needed to show just how many data-mined patterns are coincidental. We ultimately need expert opinion in order to discard obviously coincidental patterns a priori and identify plausible causal models that can be tested and retested, ideally with randomized controlled trials. Without this, as we are too often painfully reminded, all we have is correlation, which is often fleeting and useless.
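
The arithmetic behind the one-million-variable example can be sketched as follows. The text does not state a significance threshold, so this sketch assumes the conventional 5 percent level and assumes that each round of testing is independent of the others; under those assumptions the result lands close to the 8,000 figure. The function name and its parameters are illustrative only.

```python
from math import comb


def expected_spurious_survivors(n_variables, alpha=0.05, n_tests=6):
    """Expected number of pure-noise pairwise correlations that pass every test.

    Assumptions (not stated in the article): each test uses significance
    level `alpha`, and the rounds of testing are independent, so a
    coincidental correlation survives all `n_tests` rounds with
    probability alpha ** n_tests.
    """
    n_pairs = comb(n_variables, 2)  # number of distinct variable pairs
    return n_pairs * alpha ** n_tests


n_variables = 1_000_000
n_pairs = comb(n_variables, 2)

# Initial test plus five rounds of retesting = six tests in total.
survivors = expected_spurious_survivors(n_variables, alpha=0.05, n_tests=6)

print(f"{n_pairs:,} pairwise correlations")            # 499,999,500,000
print(f"about {survivors:,.0f} expected to survive")   # about 7,812
```

Because the number of pairs grows with the square of the number of variables, doubling the variables roughly quadruples the expected count of surviving coincidences under the same assumptions.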

Two of Schmidt’s examples of AI rewriting the scientific process involve large language models (LLMs). His first example:

Artificial intelligence is already transforming how some scientists conduct literature reviews. Tools like PaperQA and Elicit harness LLMs to scan databases of articles and produce succinct and accurate summaries of the existing literature—citations included.

We now know that LLM literature reviews are unreliable. In May of 2023, two months before Schmidt’s article was published, a credulous lawyer submitted to a Manhattan court a legal brief that had been largely written by ChatGPT. When asked to back up the fake citations it had included in the filing, ChatGPT obliged by generating fake details of fake cases. The judge was familiar with the relevant precedents and rebuked (and later fined) the lawyer for submitting a brief that was full of “bogus judicial decisions . . . bogus quotes and bogus internal citations.” That, in a nutshell, is the problem with relying on LLMs for literature reviews and other factual information. If you know the facts, you don’t need an LLM. If you don’t know the facts, you can’t trust an LLM.

Schmidt’s second example:

As with LLMs, what these AI systems do is amazing, but claims about the implications are exaggerated. In two Science op-eds, Derek Lowe, a researcher who has worked on several drug discovery projects, wrote that “it doesn’t make as much difference to drug discovery as many stories and press releases have had it” because “protein structure determination simply isn’t a rate-limiting step in drug discovery.” As Lowe argues:

It’s important to realize that the new protein computational tools do not make all these into solved problems. Not even close. They clear out a lot of obstacles so that we can get to these problems more easily and more productively, for sure, but they do not solve them once we get up to the actual rock faces in our particular gold mines.

The CEO of the AI-powered drug company Verseon was more blunt: “People are saying, AI will solve everything. They give you fancy words. We’ll ingest all of this longitudinal data and we’ll do latitudinal analysis. It’s all garbage. It’s just hype.”

The real test is whether new products and services are developed faster and more cheaply with AI than without it. In a 2024 Science op-ed, Lowe examined drugs that were purportedly designed by AI and concluded that none of them can be classified as “target discovered by AI.”