Nine years ago, I sat at ground zero of the face-swapping revolution. I was inside the screening room of Ed Ulbrich, who was hot off building the technology that had transformed Brad Pitt’s visage in The Curious Case of Benjamin Button. Ulbrich had welcomed me to his VFX studio Digital Domain to preview something that could top even the technical magic of Benjamin Button. He dimmed the lights, and the opening notes of his latest opus, Tron Legacy, began to play. Soon, I was face to face with a digitally-reconstructed Jeff Bridges, who wasn’t 60 years old anymore, but a spry 30. To Ulbrich, face-swapping was the Holy Grail of special effects–and we were witnessing its realization.
“It was really hard, it was really slow, it was really tedious, it was really expensive,” Ulbrich said at the time. “And the next time we do it it’s going to be less difficult, and less slow and less expensive.” Digital Domain eventually pivoted to resurrecting Tupac as a hologram, and later declared bankruptcy. But the company left its mark: digital face-swapping has become a mainstay tool of Hollywood, putting Arnold Schwarzenegger in Terminator Genisys and Carrie Fisher in Star Wars: The Force Awakens.
The jaw-dropping effect is still fairly difficult and expensive, though–or it was until an anonymous Redditor named Deepfakes changed all that overnight and brought Ulbrich’s words back to me with perfect clarity. In early 2018, Deepfakes released a unique bit of code to the public that allows anyone to easily and convincingly map a face onto someone else’s head in full motion video. Then, another Redditor quickly created FakeApp, which gave the Deepfakes scripts a user-friendly front end.
The anonymous Redditor premiered their tech by releasing a series of videos featuring famous actresses cast in pornography. Other Redditors quickly followed suit; though users still needed to collect hundreds or even thousands of pictures of someone to feed the Deepfakes AI, all of the hard computational work of training the machine was automated. Suddenly, the entire internet had access to a technique for mapping anyone–acquaintances, minors, enemies–into sexually explicit videos. It was easy to imagine where it would go next.
The law was almost no help. So to cut the trend off at the pass, both Reddit and Pornhub quickly banned Deepfakes content on their own, classifying it similarly to revenge porn. For now, Deepfakes has been quarantined to some of the cruder message boards of the internet. But to people like Ulbrich who watched this very technology come into being, there is no stopping the revolution afoot.
“Porn is probably the least offensive part of this,” Ulbrich tells me now. “Pixel by pixel, it begins the war of what’s real.”
The Making Of Deepfakes
“It may be considered a fetish,” the user known as Deepfakes writes me via Reddit. “Some people like feet, some people like big breasts, some people like familiar and attractive faces in their porn. It can also explain why I’m obsessed with generating realistic face images.”
Deepfakes won’t tell me their name, their age, or their profession, but they are quite open in talking about what they believe, and what brought them to release their code to the masses. For the past twelve years, Deepfakes has been obsessed with facial mapping technologies, an interest first born of a stint playing the 2006 video game The Elder Scrolls IV: Oblivion “at a young age.” The game allowed players to customize their avatar’s face with various sliders and options.
That led Deepfakes down a rabbit hole, all the way to the classic 1999 paper A morphable model for the synthesis of 3D faces which inspired Oblivion. The paper offered a technique that used a single 2D image to create a full 3D face. “Although most of the time the model is really bad and looks like you pasted someone’s face on a potato,” says Deepfakes. A 2009 paper, Face Reconstruction in the Wild, offered better insights–with enough 2D photos at enough angles, you could actually create a full, continuous 3D model. “I experimented with this method on and off for a while, but didn’t make anything useful,” says Deepfakes.
Finally, 2016 brought the meteoric rise of automated machine learning–systems that could improve upon themselves–which Deepfakes refers to as “steroids” on all the science that came before.
“I experimented a little bit and found out that I can easily generate very realistic faces of the same person with enough training data. The next question was, ‘what should I do with these generated faces?'” Deepfakes asks. “We still can’t reliably reconstruct a high-quality 3D face model from random images–[we’re] close, but not yet. However, I could try to swap two faces…” After about a year and a half of work, Deepfakes had created their face-swapping neural net and released it to the world through the open source code and proof-of-concept pornography.
Why they released it is quite simple, Deepfakes argues. “One, the algorithm is very, very easy to understand, and I don’t see any reason to hide it. Someone else will find a similar solution sooner or later,” they say. “Two, it is better to let the public know these algorithms exist, let people know how these algorithms work, now. Hiding this information feels immoral to me.”
Deepfakes’ tech is based on a relatively typical Generative Adversarial Network (GAN)–the sort used by basically every AI researcher in the world. Don’t think of a GAN as one AI but two, each vying to be the teacher’s pet. One AI, called the generator, draws faces, while the other, called the discriminator, critiques them by trying to spot fakes. Repeat this process millions of times and both AIs improve, each making the other better in a battle that can be waged as long as your computer is plugged in. Deepfakes says the results become pretty convincing after about half a day of computing on a low-end PC.
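The adversarial loop is easier to see in miniature. Below is a toy, one-dimensional sketch of the idea–emphatically not the Deepfakes code, which operates on face images with deep networks. Here a linear “generator” learns to mimic samples drawn from a Gaussian while a logistic “discriminator” tries to tell real samples from generated ones; all parameter names, learning rates, and distributions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Generator g(z) = wg*z + bg tries to mimic samples from N(4, 0.5).
# Discriminator d(x) = sigmoid(wd*x + bd) scores how "real" x looks.
wg, bg = 1.0, 0.0
wd, bd = 0.0, 0.0
lr, batch, steps = 0.05, 64, 3000

for _ in range(steps):
    real = rng.normal(4.0, 0.5, batch)   # samples from the target distribution
    z = rng.normal(0.0, 1.0, batch)      # noise fed to the generator
    fake = wg * z + bg                   # generated samples

    # Discriminator step: ascend log d(real) + log(1 - d(fake)),
    # i.e. get better at telling real from fake.
    s_real = sigmoid(wd * real + bd)
    s_fake = sigmoid(wd * fake + bd)
    wd += lr * np.mean((1 - s_real) * real - s_fake * fake)
    bd += lr * np.mean((1 - s_real) - s_fake)

    # Generator step: ascend log d(fake) (the non-saturating GAN loss),
    # i.e. get better at fooling the discriminator.
    s_fake = sigmoid(wd * fake + bd)
    wg += lr * np.mean((1 - s_fake) * wd * z)
    bg += lr * np.mean((1 - s_fake) * wd)

# After training, the generator's output mean (bg, since z has mean 0)
# should hover near the real mean of 4.0. A linear discriminator only
# drives the means together, so the variance is not reliably matched.
```

The back-and-forth is the whole trick: each discriminator update makes the forgery test harder, and each generator update makes the forgeries better, which is exactly the dynamic that lets the loop run “as long as your computer is plugged in.”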
Similar neural networks have had many wonderful, humanity-improving impacts. They can spot skin cancer early, or discover new drug treatments for the diseases of today. GANs are also one main reason that voice recognition in Amazon Alexa and Google Home has suddenly gotten so good.
But fundamentally, there’s a problem with a machine training another machine to create media that’s indistinguishable from the real thing: the resulting content, by design, can withstand the highest level of professional scrutiny.
In fact, the imaging tools Deepfakes is using are already being studied by some of the biggest software companies in the world. Chief among them is Adobe, which has been working with AI image manipulation for over a decade.
When You Teach A Machine To Spot Fakes, You Help It Make Better Fiction
“The more techniques you develop to distinguish fact from fiction, the better the fiction becomes,” explains Jon Brandt, Director of the Media Intelligence Lab at Adobe Research, and the first AI researcher the company hired 15 years ago.
At Adobe, Brandt’s job isn’t to create future products but to spearhead and coordinate the bleeding-edge IP with which they’ll be infused. For more than a decade, Adobe has owed many of its improvements to AI. In 2004, it introduced its first AI feature with automatic red-eye removal. Then face tagging in 2005. Now AI powers dozens of features, from cropping and image searching to lip-syncing.
For the most part, these are handy updates to Adobe’s creative tools, gradually updating the features of its apps. But knowing where my questioning was leading him–that Adobe is the biggest, most profitable company in image manipulation, and single-handedly ushered in the reality distortion field of Photoshop that we all live in today–Brandt doesn’t mince words.
“The elephant in the room is Sensei,” he says. “It’s Adobe’s name around how we are approaching AI and machine learning for our business. It is not a product–which is frequently a source of confusion–but a set of technologies.”
Launched in 2016, in a coincidental parallel with the rise of fake news, Adobe’s Sensei projects make Photoshop’s clone stamp tool look like a janky old VHS tape. Thanks to machine learning–and a small army of research interns Adobe recruits every summer to publish work at the company–Adobe can create a doppelgänger of your own voice, making you say things you never said. It can automatically stitch together imaginary landscapes. It can make one photo look stylistically identical to another. And it can remove huge objects from videos as easily as you might drag and drop a file on your desktop.
These research experiments–which to be clear, haven’t been rolled out into Adobe products–aren’t just powerful media manipulators. They’re designed to be accessible, too. They don’t require years of expertise, honing your craft with artisanal tools. With the infusion of AI assisting the user, the apprentice becomes an instant master. Soon, Adobe plans to make image manipulation as simple as talking to your phone.
“It’s part of our mission to make our tools as easy, and enable our creatives to express themselves, as readily as possible,” says Brandt. “That’s developed a need for us to have more and more understanding of the context and content of what they’re working on, which requires a machine learning approach.”
When I ask if Brandt has come across Deepfakes, he says he has, and he wasn’t surprised to see it. After all, Deepfakes-based videos aren’t the first or only scarily false media produced by neural network techniques. This eerie video with Barack Obama giving a totally invented speech comes to mind. (The University of Washington researchers responsible for it declined to be interviewed for this piece.)
As Brandt points out, because the research community is so good at sharing their techniques, openly and at large, it’s almost inevitable for this power to become rapidly democratized.
“We can do the best to promote responsible use of our tools, [and] support law enforcement when these things are being used for illegal or nefarious purposes,” he says. “But there are people out there who are going to abuse it, that’s unfortunate. We can do [some things] from a tool perspective to limit that, but at the end of the day, you can’t stop people from doing it; they can do it in their living room.”
Whether it’s anonymous Redditors, big players like Adobe, or academia itself, progress is being made through all channels on advanced audio and visual manipulation, and no one is promising to hit the brakes. So how should we reckon with a dissolving reality? We simply don’t know yet.
The War On What’s Real
Did Deepfakes feel at all bad that celebrities were being placed in sexually explicit videos? Did Deepfakes worry that someone could use their machine learning technique to create pornography starring minors or high school classmates? When I ask Deepfakes about their own ethical code, they argue they’re not hurting anyone.
“I think it’s O.K. to create fakes privately, even of minors. Creating these fakes doesn’t really hurt anyone,” says Deepfakes. “Sharing these fake videos is a different story. I don’t think it’s O.K. to share fake porn of normal people. For the celebrity, how much do these fakes hurt their images is debatable. What if we create a realistic but non-exist person? For example someone between Natalie Portman and Keira Knightley, but it’s neither of them. The possibility is endless here, and I don’t have an answer.”
But Deepfakes admits that they didn’t consider these moral questions as they created videos, because to Deepfakes, they’re all obviously fake. “I’m more concerned that people might be fooled by someone else’s algorithm,” says Deepfakes. “Maybe some fake videos are already out there, and we don’t know it.”
Perhaps that sounds like scapegoatism–as if Einstein had shrugged off the Manhattan Project with reasons like, “Someone would have built the atom bomb eventually. Ours really isn’t that big! Plus, I’m worried about someone building a bigger, more destructive one!” Yet the argument that a better version of the Deepfakes tech could be far more dangerous rings true for some.
“I will tell you, it started after Benjamin Button, I had people from government coming to me in defense, asking, ‘Can this be used for evil?’” Ulbrich recounts. “One request we got was having Ronald Reagan appearing at the RNC–which we passed on.” Ulbrich’s own ethical code put strong lines in the sand that he could enforce, simply because he was one of the few people holding the keys to this technology in the mid-aughts.
“‘Can you make Obama say other things?'” he poses as a hypothetical question from a hypothetical client. “‘Yes, but we won’t.'”
To Ulbrich, of course Deepfakes technology doesn’t look as good as Benjamin Button did years ago. But that doesn’t matter. The quality is still good enough to go viral on social media. And just as importantly, anyone can use Deepfakes’s code, without advanced skills or multi-million dollar Hollywood budgets. As Ulbrich said nine years ago, it’s only going to get better, cheaper, and faster.
“To me, this is big. I think this changes everything,” says Ulbrich. “Talk about fake news. You really can’t believe your eyes, and you can’t believe your ears. . .it subverts media.”
It’s easy to say that regulators should just shut this whole thing down, and deem Deepfakes-style videos illegal. But that’s tricky. Contemporary parody laws may protect the videos as freedom of speech or expression. Keep in mind that some real art is being made with Deepfakes today, too. A parody account called Derpfakes–which uses Deepfakes’ code–has hilariously put Nicolas Cage into movies like Raiders of the Lost Ark, and duplicated ILM’s placement of Carrie Fisher into The Force Awakens.
Furthermore, legality is almost a moot point when the faking technology itself is so widely democratized, and its output increasingly difficult to disprove. If everyone in the world has their finger on the equivalent of a fake news nuke button, the law cannot prevent explosions from going off. It can only attempt to punish those responsible when they do.
In this regard, Ulbrich and Deepfakes are actually in complete agreement. “We are in an age when anyone can create believable videos, and anyone can claim videos are faked,” says Deepfakes. “I think it is inevitable and the society must adapt to this change. It may be painful.”