Facebook develops new method to reverse-engineer deepfakes and track their source

Deepfakes aren’t a big problem on Facebook right now, but the company continues to fund research into the technology to guard against future threats. Its latest work is a collaboration with academics from Michigan State University (MSU), with the combined team creating a method to reverse-engineer deepfakes: analyzing AI-generated imagery to reveal identifying characteristics of the machine learning model that created it.

The work is useful as it could help Facebook track down bad actors spreading deepfakes on its various social networks. This content might include misinformation but also non-consensual pornography — a depressingly common application of deepfake technology. Right now, the work is still in the research stage and isn’t ready to be deployed.

The method could help track down those spreading deepfakes online

Previous studies in this area have been able to determine which known AI model generated a deepfake, but this work, led by MSU’s Vishal Asnani, goes a step further by identifying the architectural traits of unknown models. These traits, known as hyperparameters, have to be tuned in each machine learning model like parts in an engine. Collectively, they leave a unique fingerprint on the finished image that can then be used to identify its source.

Identifying the traits of unknown models is important, Facebook research lead Tal Hassner tells The Verge, because deepfake software is extremely easy to customize. This potentially allows bad actors to cover their tracks if investigators were trying to trace their activity.

“Let’s assume a bad actor is generating lots of different deepfakes and uploads them on different platforms to different users,” says Hassner. “If this is a new AI model nobody’s seen before, then there’s very little that we could have said about it in the past. Now, we’re able to say, ‘Look, the picture that was uploaded here, the picture that was uploaded there, all of them came from the same model.’ And if we were able to seize the laptop or computer [used to generate the content], we will be able to say, ‘This is the culprit.’”

Hassner compares the work to forensic techniques used to identify which model of camera was used to take a picture by looking for patterns in the resulting image. “Not everybody can create their own camera, though,” he says. “Whereas anyone with a reasonable amount of experience and standard computer can cook their own model that generates deepfakes.”

Not only can the resulting algorithm fingerprint the traits of a generative model, but it can also identify which known model created an image and whether an image is a deepfake in the first place. “On standard benchmarks, we get state-of-the-art results,” says Hassner.

Deepfake detection is still an “unsolved problem”

But it’s important to note that even these state-of-the-art results are far from reliable. When Facebook held a deepfake detection competition last year, the winning algorithm was only able to detect AI-manipulated videos 65.18 percent of the time. Researchers involved said that spotting deepfakes using algorithms is still very much an “unsolved problem.”

Part of the reason for this is that the field of generative AI is extremely active. New techniques are published every day, and it’s nearly impossible for any filter to keep up.

Those involved in the field are keenly aware of this dynamic, and when asked if publishing this new fingerprinting algorithm will lead to research that can go undetected by these methods, Hassner agrees. “I would expect so,” he says. “This is a cat and mouse game, and it continues to be a cat and mouse game.”

Leave a Comment Cancel Reply