AI-powered face-swapping technology awed the internet when it debuted last year at Siggraph, the annual computer graphics conference. The videos, known as “deep fakes,” quickly became an even more controversial–and painful–phenomenon as people used the algorithm to create porn videos using the faces of unwilling celebrities or private citizens. Even while the occasional artifact revealed the fakery, it was clear that the technology had torn down the wall between fiction and reality.
Now, a new iteration of the tech, called Deep Video Portraits, is debuting at this year’s Siggraph conference in August. The seven-minute video accompanying the paper, which was uploaded to YouTube this week, shows how the past year of research has pushed such technology to a reality-shattering new level.
After you manage to close your mouth, consider for a few minutes what you just saw. Heck, play it again. Twice if you need to.
The Stanford team behind the paper says that the outstanding realism in their work is “achieved by careful adversarial training, and as a result, [they] can create modified target videos that mimic the behavior of the synthetically-created input.” To get complete control of the target video, the researchers render a synthetic person entirely–as well as the background. They claim that they can perform any kind of recombination of parameters, like changing what an original person does or swapping faces entirely, “without explicitly modeling hair, body or background.” In other words, the AI takes care of everything by itself.
Their Siggraph 2018 reel feels truly indistinguishable from reality: from the subtle and natural alteration of facial expressions, to the incredible feat of achieving natural 3D movement automatically, to the fact that the subjects’ torsos now perfectly match their head movements. The software can also automatically reconstruct the background of each video, and even project a realistic shadow against it, to fool the eye. It’s a technological feat that leaves me feeling that we’re witnessing a paradigm shift in humanity.
Indeed, as Mark Wilson wrote on Co.Design earlier this year, the end of reality now feels truly imminent. We’re on the verge of a complete reinvention of how we understand the moving image. Before, we took videotape as proof of fact–whether it showed someone committing a crime or a politician simply making a statement. Deep Video Portraits may be the penultimate step towards destroying the once-incontrovertible truth of video. Fixing this, whether with legal measures or with forensics, seems nigh on impossible. As with any other piece of software, its distribution and usage are unstoppable.
Deepfakes, the anonymous Redditor who created the code that led to the proliferation of videos last year, spoke to Co.Design in March about their predictions for the software. According to them, there’s no way to stop it. The only thing we can do is raise awareness that this technology exists: “One, the algorithm is very, very easy to understand, and I don’t see any reason to hide it,” Deepfakes said at the time. “Someone else will find a similar solution sooner or later. Two, it is better to let the public know these algorithms exist, let people know how these algorithms work, now. Hiding this information feels immoral to me.”
The only problem is that knowing about it doesn’t really change anything. If you can’t trust what you see, how can you form a worldview? Resistance, I’m afraid, is futile. Just make some popcorn and enjoy the ride to hell.