2023-09-21
By Jared Newman

“Dump the data. Destroy the models. Start again.”

To Matthew Butterick, that’s one desired outcome from the lawsuits that he and the Joseph Saveri Law Firm have filed against generative AI’s biggest names. They’re the attorneys representing the comedian Sarah Silverman and other authors in cases against OpenAI and Meta, which allege that the companies trained their models on copyrighted text from pirated sources. They’ve also filed similar lawsuits against Stability AI, Midjourney, and DeviantArt over AI image generation, and have accused Microsoft’s GitHub of violating open-source licensing.

The fundamental argument in each case is similar: AI companies allegedly trained their models on copyrighted material or open-source content, and have in turn reproduced that content without permission. Butterick and Saveri say the results will effectively compete with the humans on whose work AI relies. “It’s just ludicrous that their works are being taken without consent, without credit, without compensation, to train these AIs and put them out of business,” Butterick says.

Butterick wasn’t even actively practicing law when he started raising concerns about generative AI. He had been focused on typography, design, and engineering when Microsoft launched GitHub’s Copilot tool, which can generate blocks of code in response to text prompts. In June 2022, he wrote a blog post (entitled “This copilot is stupid and wants to kill me”), which floated the idea of litigation around the tool’s ability to skirt open-source licensing.

That post struck a nerve with Saveri, who’s been an antitrust and competition lawyer for more than 35 years. He was already a fan of Butterick’s work and had even used one of his fonts for his law firm’s output. Saveri suggested that they research the topic further, and they quickly figured that they had a case against Microsoft. Lawsuits involving AI artwork and text naturally followed.

Finding plaintiffs wasn’t difficult. While Butterick had reached out to Silverman proactively—they’d corresponded in the past, and he thought she’d be a great fit as both a striking writer and actress—in most cases the prospective plaintiffs came to them.

“We kind of tapped into a vein of concern here,” Saveri says. “A lot of these folks share this concern, became aware that we are the only people . . . who are doing anything about it, and they came to us with some urgency.”

Critics have argued that the lawsuits are a long shot, and AI enthusiasts say it’s too late to bottle up the models that companies have unleashed. Butterick and Saveri have clearly heard it all before. Isn’t AI just learning the way a human would? Maybe, but humans infringe copyright too, they say, and the bigger issue is that AI is allegedly copying that content without permission for training purposes. Isn’t that just fair use? No, they say, because that content competes with its source material.

“People say, ‘This must be fair use. It’s transformative. It doesn’t look like the training data.’ But that’s not what transformative means. It means economically transformative,” Butterick says. “If these text generators are just being used to make books that compete with the training data, we really don’t think it’s going to be fair use.”

Along with an algorithmic reset, Butterick and Saveri say they’re seeking a model that lets creators decide whether to participate in AI training and compensates them for doing so. This in turn would require more transparency from AI firms, which have become increasingly flaky about their training data.

Whether the courts agree with their position will take years to resolve. Butterick and Saveri don’t expect the GitHub case to reach trial for another year and a half, while the cases involving Stability and OpenAI will take even longer. AI may change the world—for better or worse—long before then.

But Butterick and Saveri believe the lawsuits are already having an impact. One of their clients, the artist Karla Ortiz, testified before the Senate Judiciary Committee in July, and the attorneys say they were encouraged by the responses from both sides of the aisle. Saveri notes that a growing skepticism has emerged around generative AI, and believes the lawsuits have contributed to that conversation.

“One of the reasons the litigation is so important is, we’re kind of first movers in the area,” Saveri says.

Not that the AI companies themselves are slowing down. Since filing the lawsuits, Butterick hasn’t been surprised by much—he expected threats and criticism, for instance—but says the brazenness of the companies involved has stood out. “They are saying less about their data sets. They are being more flagrant about harvesting more data, and using it, and basically saying ‘Come and get us,’” he says. “Okay, challenge accepted.”