As AI-generated videos spread disinformation, start-ups and academics battle to stay one step ahead.
Matteo Renzi, Italy's former prime minister and founder of the new Italia Viva party, sits in an opulent-looking office, face to the camera. An oil painting hangs to one side of him, on the other sits a Renaissance bust.
A technician ducks into shot to check his sound levels, and then Renzi is off. He starts by gurning at an off-screen audience member, greeting them in a hoarse stage whisper.
Then he turns on his fellow politicians. Giuseppe Conte, the current prime minister; Luigi Di Maio, his deputy; Carlo Calenda, a member of the European parliament — all receive the same obscene arm gesture, punctuated with a little sneer.
This performance sent some Italians straight to Twitter to voice their outrage at the ex-prime minister's diatribe. But it was not the real Renzi talking. On closer examination, the voice is different, as are the gesticulations. Even the face looks uncannily smooth.
That's because the politician's features have been algorithmically transplanted on to a comedian's, as part of a skit for Striscia la notizia, a long-running Italian satire show. The video is the latest in a series of examples of how "deepfake" technology — or AI-generated videos designed to fool humans — has started to affect politics.
Only a few years ago, such "deepfakes" were a novelty, created by hobbyist coders. Today, they are increasingly commodified as yet another service available to those with even a little disposable cash.
While they may be increasingly cheap to pull off, their repercussions could be far-reaching. Fraudulent clips of business leaders could tank companies. False audio of central bankers could swing markets. Small businesses and individuals could face crippling reputational or financial risk.
And, as elections approach in the US, UK and elsewhere, deepfakes could raise the stakes once more in the electorate's struggle to know the truth.
Around the world, start-ups, academics and lawmakers are rushing to create tools to mitigate these risks. But the technology itself is developing faster than anyone imagined.
Hany Farid, a professor at the University of California, Berkeley, has spent decades studying digital manipulation: "In January 2019, deepfakes were . . . buggy and flickery. Nine months later, I've never seen anything like how fast they're going. This is the tip of the iceberg."
On one thing, experts are clear. The mere risk of a deepfake undermines one of humanity's most basic assumptions: that you can believe your own eyes.
Disinformation is as old as politics, and its practitioners have kept pace with technological changes. Where written fake news was the hallmark of the most recent election cycle in the US and UK, images and videos are increasingly the new focus of propaganda, says Vidya Narayanan, a researcher at the Oxford Internet Institute.
"[They] are powerful in shaping a situation. If you see an image, it is very immediate." Software such as Photoshop was used to create a widely shared fake image of Emma González, a survivor of the Parkland shooting and a gun control activist, ripping up the US Constitution in 2018.
Altered videos are not exactly new. The most famous recent example is a video of Nancy Pelosi from earlier this year, slowed down to make her speech sound slurred. The clip spread across conservative media as critics of the Speaker of the House of Representatives declared it evidence of senility, alcoholism or a mental health problem.
Rudy Giuliani, US president Donald Trump's personal lawyer, retweeted the video, before deleting it but defending his choice. Trump himself posted a different altered video of Pelosi, which is still online: it has nearly 30,000 retweets and more than 90,000 likes. The difference with a deepfake is that with an algorithm in charge, the results can be much more convincing.
The technology that powers deepfakes, known as generative adversarial networks (GANs), was invented only in 2014. GANs are made up of two rival neural networks. A synthesiser creates content, which the detector or discriminator compares to images of the real thing.
"Let's say the synthesiser places someone's face on to someone [else]'s face," says Farid. "The detector says there's an artefact [a distortion in the image], do it again." Through hundreds of thousands of cycles of trial and error, the two systems can create immensely lifelike videos.
This has been the year that saw deepfakes move beyond the hands of those with powerful computers, graphics cards and at least some technical expertise. The DeepNude app was released in June and the ZAO app in August.
The former, now shut down, produced realistic female nudes from clothed photographs, leading to understandable outrage. The latter allowed users to plaster their faces over the protagonists of a selection of movies simply by uploading a few seconds of video to the free Chinese app. "These things are absolutely getting democratised, and it's happening really rapidly," Farid says.
He is not alone in expressing shock at the rate of development from academic concept to easily accessible reality. "We knew it was coming, but not nearly this fast," says David Doermann, a professor at the University of Buffalo.
Like Farid, Doermann has been in the field of computer vision and image processing for more than two decades, and is an adviser for video-verification app Amber. "It's hard to predict where [deepfakes will] go in the next five years, given they've only been around for five years."
Making a deepfake from scratch is increasingly simple. Making a good deepfake, however, is another matter. The more powerful the computer and the graphics card, the more cycles a GAN can run through and the better the results. On top of that, many of the best-looking deepfakes have been professionally touched up afterwards.
Given these limitations, it is unsurprising that a market has started to emerge. A Japanese start-up called Deepfakes Web charges $2 per hour of processing time to create videos. On Fiverr, an online marketplace connecting freelancers with jobs, a user named Derpfakes offers to put customers' faces into movie clips.
Face swapping may be the most common form of deepfake, but others are more ambitious. Ricky Wong, co-founder of a start-up called Humen, explains that with three minutes of footage of a person moving, plus material from professional dancers, his company can make anyone "dance". "We're trying to bring delight and fun to people's lives," he says. "Not something like a Nazi salute, that would be horrible."
Meanwhile, audio deepfakes are also on the rise. Modulate, a start-up based in Boston, is creating "audio skins", real-time voice changers for use in video games. "There's a lot of people who spend a lot of time and money building up their persona in games," says Mike Pappas, the company's co-founder and chief executive.
"Your normal voice breaks that illusion that you've spent so much time crafting." Part way through our phone conversation, Pappas changes to a woman's voice, and then to a co-worker's: it comes across as a little stiff but still recognisably human.
Pappas acknowledges the risks of impersonation. In August, The Wall Street Journal reported on one of the first known cases of synthetic media becoming part of a classic identity fraud scheme: scammers are believed to have used commercially available voice-changing technology to pose as a chief executive in order to swindle funds.
As services such as Modulate grow, the number of legal cases is likely to go up. Pappas says Modulate screens requests to avoid impersonation. "We've landed on the fact that it's important to be able to sleep at night," he says. The company also places a digital watermark on its audio to reduce the risk of a voice skin being mistaken for the real thing.
As Henry Ajder walks through the nearly 600-year-old grounds of Queens' College, Cambridge, he describes a daily routine that involves tracking the creation and spread of deepfake videos into the darkest corners of the internet.
Ajder's job as head of communications and research analysis at start-up Deeptrace has led to him investigating everything from fake pornography to politics. In a report Deeptrace released last month, the scale of the problem was laid bare: the start-up found nearly 15,000 deepfakes online over the past seven months. Of these, 96 per cent were pornographic.
One of Ajder's political cases looked at whether or not a deepfake may have contributed to an attempted coup in Gabon. Ali Bongo Ondimba, the president of the African nation, was taken ill in October last year and has been in Morocco since then, with little information released on his health. Then, in December, a surprise video of him was released, prompting speculation from political opponents.
"It just looked odd: the eyes didn't move properly, the head didn't move in a natural way — the immediate kind of response was that this is a deepfake," says Ajder. A week after the video was released, junior officers attempted a coup d'état, which was quickly crushed.
In fact, Deeptrace did not find evidence of manipulation but, for Ajder, that may be irrelevant. What was important was the uncertainty created. "Even before these videos become very good or very widespread, we are already seeing the spectre of deepfakes haunting people," he says. "It really drives home how powerful the mere doubt is . . . about any videos we already want to be fake."
Kanishk Karan, a researcher at the Digital Forensic Research Lab, part of the US think-tank Atlantic Council, points to another potential deepfake, this time in Malaysia: a video purporting to show economic affairs minister Azmin Ali in a tryst with another minister's male aide.
Given Malaysia's colonial-era laws and persistent discrimination against LGBT communities, the footage, released in June, naturally provoked controversy. "A lot of people were saying it's a deepfake version of him," says Karan. "On the other side, the opposition is saying that it's not a deepfake, it's a real confession." To date, the scandal has not toppled Ali.
Deepfakes may be particularly destructive in countries such as India or Brazil. In both, there is heavy use of WhatsApp, a platform that lends itself to videos and images and whose closed nature also comes with a sense of security and trust.
Both countries have large populations without basic literacy, Narayanan of the Oxford Internet Institute points out, making it difficult to generate media literacy. As is often the case with disinformation, vulnerable populations are most at risk.
"The internet represents the peak of the fourth industrial revolution. These are communities that haven't reaped the benefits of the first one — they do not have the know-how to begin to understand that a computer can create this," she says.
Those building the technology to fight deepfakes fall into two broad camps. The first focuses on detection: identifying fake videos and images as they emerge. Deeptrace is one of the companies in that space, explains chief executive Giorgio Patrini, as he calls from the company's Amsterdam headquarters to demonstrate its system.
The interface on my screen looks rather like an earlier version of Windows Movie Maker, but in navy corporate colours. On the right-hand side are four videos. Among them is the now-famous deepfake of Mark Zuckerberg produced for a political art installation and another of Rowan Atkinson's face superimposed on Donald Trump's body.
The video Patrini drags over to the left-hand side is unfamiliar, however, and comes from a Taiwanese channel on YouTube. It features a young woman smiling and talking to the camera. Over the connection, the video comes across as slightly choppy, but hardly enough to suggest something is out of the ordinary.
When he hits play, a red box appears over her features, flashing percentages that reveal it is a fake: a fan wearing a K-Pop singer's face. "We've seen a couple of things that are coming out on Asian markets and are being sold for pennies on these digital marketplaces," says Ajder. This video is harmless, but Patrini says that (female) K-Pop singers have become major targets of fake porn.
Patrini explains that Deeptrace's technology is trained on the thousands of deepfakes that the company has pieced together from across the internet. "We have, to the best of our knowledge, the largest network of fake videos out there," he says.
The job is a continuous one, though. "Things a year ago don't compare," he says. "We see thousands of people contributing to small tweaks to the technology on GitHub, doing it as a hobby."
Farid, the Berkeley professor, is also working on detection, focusing primarily on public figures, including world leaders. His system analyses hours of video of their conversations, including interviews, speeches and rallies. From there, it focuses on the specific idiosyncrasies of their speech patterns and expressions.
"When Obama would deliver bad news, he would frown, his brow would furrow and he would slightly tilt his head downwards," he says. "If he was being funny he would smile and look up to the left . . . Everyone has a different cadence to how their expressions change."
For now, Farid thinks he is one step ahead of the deepfake producers. "I'm analysing eyebrows and head movements," he says. "The GANs don't know that there's a face there." In the long term, however, he is pessimistic. "It's an arms race and, at the end of the day, we know we're going to lose — but we're going to take it out of the hands of the amateur and move it into the hands of fewer people."
Dr Wael Abd-Almageed, a senior scientist at the University of Southern California, represents yet another approach to detection. His system analyses strings of consecutive video frames, rather than single images, to determine whether footage is a deepfake. Nevertheless, he is quick to acknowledge that his research may unintentionally feed into improving future deepfakes.
"My anticipation is the people who create them will see our paper and try and improve their methods to fool our detector," he says. "If you think deepfakes are a problem now, they will be much harder in the next couple of years."
The second method of combating deepfakes focuses on improving trust in videos. Truepic, a San Diego-based start-up, has been trying to fight manipulated videos and photos for four years, with experts such as Farid on its advisory board.
Jeffrey McGregor, Truepic's chief executive, says the company launched in response to a spate of manipulated pictures online. "Deepfakes will forever be generated . . . What Truepic is aiming to do, instead of detecting them, is establish truth."
Truepic has produced a camera app for everyday use. "When you tap on that shutter button, we're capturing all of the geospatial data — GPS sensors, barometric pressures, the heading of the device — and securely transmitting that to Truepic's verification server."
There, the company runs tests to check whether the image has been manipulated. If it has not, Truepic uploads a verified version to its website, which can be shared with other parties.
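A minimal sketch of that capture-and-verify flow is below. The endpoint URL and field names are hypothetical stand-ins; Truepic's actual API is not described in this article.

```python
# Sketch of controlled capture: fingerprint the pixels at the moment the
# shutter fires and bundle them with the sensor readings taken at the same
# instant, so any later manipulation breaks the match.
import hashlib
import json
import urllib.request

def submit_capture(image_bytes, gps, pressure_hpa, heading_deg):
    digest = hashlib.sha256(image_bytes).hexdigest()  # hash of the raw pixels
    payload = json.dumps({
        "sha256": digest,
        "gps": gps,                    # (latitude, longitude)
        "pressure_hpa": pressure_hpa,  # barometric pressure
        "heading_deg": heading_deg,    # device heading
    }).encode()
    req = urllib.request.Request(
        "https://verification.example.com/captures",  # hypothetical server
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # The server checks the image for manipulation and, if it passes,
    # stores a verified copy that can be shared with other parties.
    return urllib.request.urlopen(req)
```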
McGregor says that Truepic has already found business uses with insurers and lenders. The company is also working with NGOs who have a particular need for verified images. "One example is the Syrian American Medical Society — they've used Truepic to document some of the events that are happening in Syria," he says.
Amber, a San Francisco-based start-up, produces detection software as well as Amber Authenticate, a camera app that generates "hashes" — representations of the data — that are uploaded to a public blockchain as users shoot a video.
If a video's veracity needs to be checked — for example, in a courtroom — any difference between the original and current hashes shows whether it has been tampered with. "Video or audio being used as evidence should not be operating in probabilities," says Shamir Allibhai, Amber's CEO.
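The hash-comparison step might look like the sketch below, which hashes footage in fixed-size segments. The segment size and the "ledger" list are illustrative assumptions, and committing hashes to an actual public blockchain is omitted.

```python
# Sketch of per-segment video hashing in the spirit of Amber Authenticate.
import hashlib

SEGMENT = 1 << 20  # hash the file in 1 MiB chunks as it is recorded

def segment_hashes(path):
    hashes = []
    with open(path, "rb") as f:
        while chunk := f.read(SEGMENT):
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes

# At recording time, each hash would be committed to a public ledger.
# At verification time, re-hash the footage and compare:
def tampered_segments(path, ledger_hashes):
    current = segment_hashes(path)
    # Edits show up as differing hashes; added or removed footage shows up
    # as a mismatch at the truncation point.
    bad = [i for i, (a, b) in enumerate(zip(current, ledger_hashes)) if a != b]
    if len(current) != len(ledger_hashes):
        bad.append(min(len(current), len(ledger_hashes)))
    return bad
```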
Yet while entrepreneurs and academics can produce software to fight deepfakes, social-media giants must also tussle with them. YouTube told the FT that it was aware of the issue and was working on it. The video platform did remove the altered video of Pelosi.
It remains unclear, however, what policies it might invoke that could stop users taking parody videos and reposting them as if they were real, as with the Renzi deepfake.
Of the Big Tech companies, it is Facebook that has started to take the lead in looking for technical solutions. In September, it announced the launch of the Deepfake Detection Challenge alongside Microsoft, academics in the US and UK, and an industry consortium called the Partnership on AI.
"Better late than never," says Farid, who is among the scientists involved. "Yes, this is good — YouTube and Twitter should be doing this too — but there's a second part, the policy issue." He points to the altered video of Nancy Pelosi uploaded to Facebook as a prime example of this dimension. "Facebook knew it was fake within seconds. They also said, 'We are not the arbiters of truth.'" The video stayed up.
In its refusal to act, Facebook strove to stay within the limits of Section 230 of the Communications Decency Act, legislation from 1996 that treats websites as platforms rather than publishers in order to promote free speech. The provision has come under increasing criticism for seeming to let companies avoid liability for the content they host. The result is an often piecemeal approach to content issues.
Electoral systems are also lagging. A spokesperson for the UK's Electoral Commission said that deepfakes are just one challenge posed by the rise of digital campaigning. While printed material is required by law to have imprints showing authorship, this does not apply to electronic content — a potentially dangerous loophole.
One way to deal with this would be through enacting clear regulation. Mutale Nkonde, a fellow at the Berkman Klein Center at Harvard University, was among those who helped draft the Defending Each and Every Person from False Appearances by Keeping Exploitation Subject to Accountability (DEEPFAKES) Act.
"It became incredibly important to enter a piece of legislation," she says. "As we move towards 2020, we may be subject to supposed video evidence and we need a way of identifying what may look real [but is not]." She says that there are fears that both China and Iran could turn to deepfakes as a tool to attack the US.
Yet these dangers have to be dealt with in the framework of Section 230. The compromise for Nkonde and her colleagues was to treat deepfakes as a consumer-rights issue, making it about fraudulent representation.
The DEEPFAKES Accountability Act, referred to the subcommittee on Crime, Terrorism and Homeland Security in June, would make deepfakes for purposes such as fake porn, disinformation or election interference illegal.
Those synthetic videos produced for purposes such as parody or education would need to be watermarked. But Nkonde says that even as someone who helped draw up the bill, she now questions its feasibility.
"The issue with watermarking . . . is the technical architecture completely changes the video," she says. "It's a completely new piece of video." Trying to prove something is a fake without reference to the "real" footage would be extremely hard. She also worries that watermarking would lead to false positives, or that canny developers could try to have real videos flagged as deepfakes.
"We may end up having to actually favour some type of ban or moratorium until we get further research in all the different ways [videos] could be falsified," she suggests. "We're falling foul to how fast tech is moving."
While the rate of progress is astounding, experts remain unconvinced that a deepfake apocalypse awaits the political sphere. It is the plausible deniability the technology offers that remains its greatest power.
Doermann at the University of Buffalo says that in the US at least, where public awareness of the technology is growing, an extremely high-quality deepfake would be needed to change the course of electoral history in 2020. "It would take a massive amount of computing power. It's not going to be a rogue person, it would be a nation state."
Ajder is also willing to admit that as with other advances in AI, scepticism around the technology is fair. But he is certain, nevertheless, that deepfakes, as a form of disinformation, are dangerous. "They appeal to a different kind of truth," he says. "The concept of truth has never been as solid as we like to think."
Voters who see a video of a politician behaving just as they expect might understand it is a fake, yet still believe it reveals an underlying truth about that politician's character. "We no longer have the luxury of [deciding] when we can suspend our reality," Ajder concludes.
That is evident in the response to the altered video of Pelosi. Many comments on The Washington Post's explainer on YouTube reflect the view that the specifics of the video do not matter. "She still sounds drunk and really messed up at normal speed . . . can't hear no big difference," says one user. That comment received 564 likes.
Written by: Siddharth Venkataramakrishnan
© Financial Times