Today we will talk about deepfakes. We tend to trust the content of audio and video recordings. But with artificial intelligence (AI), anyone's face or voice can be recreated convincingly. The result is a fake, an impersonation that can be used for memes, misinformation or pornography.
One look at the Nicolas Cage deepfakes, or Jordan Peele's deepfake PSA, makes it clear that we are dealing with strange new technology. These examples, while relatively harmless, raise questions about the future.
Can we trust audio and video? Can we hold people accountable for their actions on screen? Are we ready for deepfakes?
Deepfakes are new, easy to make and growing rapidly
Deepfake technology is only a few years old, but it has already exploded into something that is both captivating and disturbing. The term "deepfake," coined in a Reddit thread in 2017, describes the recreation of a person's appearance or voice through artificial intelligence.
Surprisingly, almost anyone can create a deepfake with a mediocre PC, some software and a few hours of work.
As with any new technology, there is some confusion around deepfakes. The "drunk Pelosi" video is an excellent example of this confusion. Deepfakes are built by AI and made to impersonate people. The Pelosi video, often called a deepfake, is actually just a video of Nancy Pelosi that was slowed down and pitch-corrected to create a slurred-speech effect.
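That kind of "shallowfake" needs no AI at all. As a rough illustration of how little effort it takes, here is a toy Python sketch using only the standard library (the sample rate, tone and filenames are arbitrary choices for the demo, not from any real tool) that slows a WAV file by declaring a lower playback rate, which also lowers the pitch:

```python
# A toy "shallowfake": no AI involved, just slowed-down audio.
# We synthesize a one-second sine-wave WAV, then rewrite it at 75%
# of its original sample rate, so it plays back slower and lower,
# roughly the kind of edit behind the "drunk Pelosi" video.
import math
import struct
import wave

RATE = 16000  # samples per second (arbitrary for this sketch)

def write_tone(path, rate):
    """Write a 1-second 440 Hz sine wave as a 16-bit mono WAV."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        frames = b"".join(
            struct.pack("<h", int(20000 * math.sin(2 * math.pi * 440 * t / rate)))
            for t in range(rate)
        )
        w.writeframes(frames)

def slow_down(src, dst, factor=0.75):
    """Copy a WAV, declaring a lower sample rate so it plays slower."""
    with wave.open(src, "rb") as r:
        params = r.getparams()
        frames = r.readframes(params.nframes)
    with wave.open(dst, "wb") as w:
        w.setparams(params._replace(framerate=int(params.framerate * factor)))
        w.writeframes(frames)

write_tone("speech.wav", RATE)
slow_down("speech.wav", "slowed.wav")

with wave.open("slowed.wav", "rb") as w:
    duration = w.getnframes() / w.getframerate()
print(round(duration, 2))  # 1.33 seconds of playback instead of 1.0
```

No samples are added or removed; the file simply tells the player to run slower, which is why the voice drops in pitch as well.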
This is also what makes the technique different from, say, the CGI Carrie Fisher in Rogue One: A Star Wars Story. While Disney spent huge sums studying Carrie Fisher's face and recreating it by hand, a hobbyist with some deepfake software can do a similar job for free in a single day. AI makes the work incredibly simple, cheap and convincing.
How to make a deepfake
Like a student in a classroom, AI has to "learn" how to perform its intended task. It does so through a brute force trial and error process, generally known as machine learning or deep learning.
An AI designed to complete the first level of Super Mario Bros, for example, will play the level again and again until it discovers the best way to win. The person who designs the AI needs to provide some data to get started, along with a few "rules" for when things go wrong along the way. Other than that, the AI does all the work.
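That trial-and-error loop can be sketched in a few lines. The snippet below is random hill climbing, a deliberately crude stand-in for the deep learning real deepfakes use, and the moves, "level solution" and scoring rule are all invented for illustration; but the loop has the same shape: try, score, keep what worked, repeat.

```python
# A minimal sketch of trial-and-error "learning": the agent tries a
# sequence of moves, keeps whatever scores at least as well, and
# tweaks it again. The goal sequence and scoring are made up.
import random

random.seed(0)

MOVES = ["left", "right", "jump"]
GOAL = ["right", "right", "jump", "right", "jump"]  # the "level solution"

def score(attempt):
    """Reward each move that matches the goal (our 'rules')."""
    return sum(a == g for a, g in zip(attempt, GOAL))

best = [random.choice(MOVES) for _ in GOAL]
for trial in range(1000):
    candidate = best[:]
    candidate[random.randrange(len(GOAL))] = random.choice(MOVES)  # one tweak
    if score(candidate) >= score(best):
        best = candidate  # keep the attempt that did at least as well

print(score(best), "of", len(GOAL))  # after enough trials, a perfect run
```

The acceptance rule never lets the score drop, so with enough attempts the agent stumbles onto the full solution, which is the essence of the brute-force process described above.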
The same goes for deepfake facial recreation. But, of course, recreating faces is not the same as beating a video game. If we were to create a fake of Nicolas Cage hosting The Wendy Williams Show, here is what we would need:
A destination video: for now, deepfakes work best with clear, clean destination videos. That is why some of the most convincing deepfakes are of politicians: they tend to stand still at a podium under constant lighting. So we just need a video of someone sitting and talking.
Two sets of data: for the mouth and head movements to be accurate, we need a data set of Wendy Williams' face and a data set of Nicolas Cage's face. If Wendy looks to the right, we need a photo of Nicolas Cage looking to the right. If Wendy opens her mouth, we need a photo of Cage opening his mouth.
After that, we let the AI do its job. It tries to create the deepfake over and over, learning from its mistakes along the way. Simple, right? Well, a video of Cage's face on Wendy Williams' body is not going to fool anyone, so how can we go a little further?
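The reason two data sets are needed is the classic deepfake architecture: one shared encoder learns a common "face representation," and a separate decoder per person reconstructs that person's face. Swapping faces means encoding Wendy's frame but decoding it with the Cage decoder. The NumPy sketch below only shows that data flow; the weights are random and untrained, the sizes are made up, and real systems use deep convolutional networks.

```python
# Schematic of the shared-encoder / per-person-decoder deepfake setup.
# Nothing here is trained; random matrices stand in for learned models.
import numpy as np

rng = np.random.default_rng(0)

DIM_FACE, DIM_CODE = 64, 8  # toy sizes: 64-pixel "face", 8-number code

encoder = rng.normal(size=(DIM_FACE, DIM_CODE))        # shared by both people
decoder_wendy = rng.normal(size=(DIM_CODE, DIM_FACE))  # reconstructs Wendy
decoder_cage = rng.normal(size=(DIM_CODE, DIM_FACE))   # reconstructs Cage

def encode(face):
    return face @ encoder       # any face -> shared representation

def decode(code, decoder):
    return code @ decoder       # representation -> one person's face

wendy_frame = rng.normal(size=DIM_FACE)

# Training would minimize reconstruction error per person; the swap
# itself is just routing the shared code through the other decoder.
swapped = decode(encode(wendy_frame), decoder_cage)
print(swapped.shape)  # (64,): a "Cage-styled" frame driven by Wendy's pose
```

Because the encoder is shared, it captures pose and expression (where Wendy is looking, whether her mouth is open), while each decoder supplies one person's identity; that is exactly why matching photos of both faces in the same poses are required.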
How to do real damage with a deepfake
The most convincing (and potentially harmful) deepfakes are total impersonations. Jordan Peele's popular fake Obama PSA is a good example. So, let's do one of these impersonations: a deepfake of Mark Zuckerberg declaring his hatred of ants. Sounds convincing, right? Here is what we will need:
A destination video: this could be a video of Zuckerberg himself or of an actor who looks like Zuckerberg. If our destination video uses an actor, we will simply paste Zuckerberg's face onto the actor.
Photo data: we need photos of Zuckerberg talking, blinking and moving his head. If we are superimposing his face on an actor, we will also need a data set of the actor's facial movements.
Zuck's voice: our deepfake should sound like "The Zuck." We can do this by recording an impersonator or by recreating Zuckerberg's voice with AI. To recreate his voice, we simply run audio samples of Zuckerberg through an AI tool like Lyrebird and then type out what we want him to say.
A lip-sync AI: since we are adding fake audio to our video, a lip-sync AI needs to make sure the facial movements match what is being said.
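The four ingredients above can be pictured as a pipeline. Every function in this sketch is hypothetical, a named placeholder standing in for a whole AI model or manual step, and the "outputs" are just labeled strings that show how the stages feed into each other:

```python
# A schematic of the impersonation pipeline: face swap -> voice clone
# -> lip sync. All names and data here are invented placeholders.
from dataclasses import dataclass

@dataclass
class Clip:
    video: str
    audio: str = ""

def swap_face(target: Clip, face_photos: list) -> Clip:
    """Stand-in for the face-swap model (destination video + photo data)."""
    return Clip(video=f"{target.video}+swapped({len(face_photos)} photos)")

def clone_voice(script: str) -> str:
    """Stand-in for a voice-cloning tool (the Lyrebird-style step)."""
    return f"cloned-voice({script!r})"

def lip_sync(clip: Clip, audio: str) -> Clip:
    """Stand-in for the lip-sync model: marries mouth movement to audio."""
    return Clip(video=f"sync({clip.video})", audio=audio)

target = Clip(video="actor-footage.mp4")
photos = ["zuck-blink.jpg", "zuck-talk.jpg", "zuck-turn.jpg"]

fake = lip_sync(swap_face(target, photos), clone_voice("I hate ants"))
print(fake.audio)  # cloned-voice('I hate ants')
```

The point is the shape of the work, not the code: each stage is a separate model with its own training data, which is why a total impersonation takes noticeably more effort than a simple face swap.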
We are not trying to minimize the work and expertise involved in making a deepfake. But compared to the million-dollar CGI job that brought Audrey Hepburn back from the dead, deepfakes are a walk in the park.
And although we have not yet fallen for a deepfake of a politician or celebrity, even the crudest and most obvious deepfakes have caused real damage.
Deepfakes have already caused damage in the real world
As of now, most deepfakes are just Nicolas Cage memes, public service announcements and creepy celebrity porn videos. These are relatively harmless and easy to identify, but in some cases deepfakes have been used successfully to spread misinformation and damage people's lives.
In India, Hindu nationalists have used deepfakes to discredit and incite violence against women journalists. In 2018, a journalist named Rana Ayyub was the victim of a disinformation campaign that included a deepfake video of her face superimposed on a pornographic video. This led to other forms of online harassment and threats of physical violence.
In the United States, deepfake technology is often used to create non-consensual revenge porn. As reported by Vice, many users of the now-banned Reddit deepfakes forum asked how to create deepfakes of ex-girlfriends, crushes, friends and classmates. Yes, that includes child pornography.
The problem is so great that the state of Virginia now prohibits all forms of non-consensual pornography, including deepfakes.
As deepfakes become increasingly convincing, the technology will undoubtedly be used for more dubious purposes. But there is a chance that we are overreacting, right? Isn't this just the natural next step after Photoshop?
Deepfakes are a natural extension of manipulated images
Even at their most basic level, deepfakes are disturbing. We rely on audio and video recordings to capture people's words and actions without bias or misinformation. But in a way, the threat of these fakes is not new at all. It has existed since we started using photography.
Take, for example, the few photographs that exist of Abraham Lincoln. Most of these photographs, including the portraits on the penny and the five-dollar bill, were manipulated by a photographer named Mathew Brady to improve Lincoln's gaunt appearance, specifically his thin neck.
Some of these portraits were edited in a way reminiscent of deepfakes, with Lincoln's head superimposed on the bodies of "strong" men like John C. Calhoun.
This may sound like ordinary publicity work, but during the 1860s photography carried a kind of "truth" that we now reserve for audio and video recordings. It was considered the polar opposite of art: a science.
These photos were manipulated deliberately, in response to newspapers that criticized Lincoln for his weak physique. In the end, it worked. Americans were impressed by Lincoln's figure, and Lincoln himself claimed that Brady's photos "made me president."
The connection between deepfakes and 19th-century photo editing is strangely comforting. It offers a narrative in which this technology, despite its serious consequences, is not entirely beyond our control. But, unfortunately, that narrative may not last much longer.
We won't be able to detect deepfakes forever
We are used to spotting fake images and videos with our own eyes. It is easy to look at the portrait of Joseph Goebbels' family and say, "There is something strange about that guy in the back."
A glance at North Korea's propaganda photos makes it clear that, without YouTube tutorials, people are terrible at Photoshop. And as impressive as deepfakes are, it is still possible to spot one with the naked eye.
But we won't be able to detect them for much longer. Every year, deepfakes become more convincing and even easier to create. You can now make a deepfake from a single photo, and AI tools like Lyrebird can clone voices in under a minute.
The high-tech deepfakes that combine fake audio and video are incredibly compelling, even when they are made to mimic recognizable figures like Mark Zuckerberg.
Anti-deepfake technology is inadequate
In the future, we may use AI, algorithms and blockchain technology to fight deepfakes. In theory, AI could scan videos for deepfake "fingerprints," and blockchain technology installed in operating systems could flag users or files that have touched deepfake software.
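At its core, the blockchain idea is a tamper-evident log: each event about a file is chained to the previous one by a hash, so rewriting history later becomes detectable. Here is a minimal sketch using only a SHA-256 hash chain; the event strings are invented, and real provenance proposals (such as signed capture hardware) are far more involved:

```python
# A toy tamper-evident log: each entry's hash depends on the previous
# entry, so editing any earlier event breaks verification.
import hashlib

def add_entry(chain, event):
    """Append an event, chaining its hash to the previous entry."""
    prev = chain[-1]["hash"] if chain else "genesis"
    digest = hashlib.sha256((prev + event).encode()).hexdigest()
    chain.append({"event": event, "hash": digest})

def verify(chain):
    """Recompute every hash; any edited event makes this fail."""
    prev = "genesis"
    for entry in chain:
        expected = hashlib.sha256((prev + entry["event"]).encode()).hexdigest()
        if entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

history = []
add_entry(history, "captured clip.mp4 on camera X")
add_entry(history, "uploaded clip.mp4 to host Y")
print(verify(history))  # True: untouched history checks out

history[0]["event"] = "captured clip.mp4, then ran deepfake software"
print(verify(history))  # False: the edited record no longer matches
```

Note what this does and does not buy: it can prove a record was altered after the fact, but it cannot, by itself, tell whether the original footage was genuine in the first place.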
If these anti-deepfake methods sound far-fetched, join the club. Even AI researchers doubt that there is a true solution to deepfakes. As detection software improves, so will deepfake technology.
Eventually, we will reach a point where deepfakes are impossible to detect. And then we will have much more to worry about than fake celebrity porn and Nicolas Cage videos.