Scammers use voice-generating AI to extort money
Generative AI has made huge strides over the past few months and, not surprisingly, criminals are already exploiting it: they clone a person’s voice to convince that person’s family that their relative urgently needs money.
The Washington Post reports that a Canadian couple recently received a call from what sounded like their grandson, who claimed he was in jail and needed bail money. They had withdrawn C$3,000 from one bank and were about to withdraw the same amount from another when the manager told them they were being scammed. It turned out that another customer of the bank had received a similar call and later learned that he had been scammed.
Benjamin Perkin’s parents were not so lucky. They received a call from someone posing as a lawyer, who said that their son had killed an American diplomat in a car accident, was in jail, and needed money for legal fees. The caller then put “Benjamin” on the line, and the voice said he loved them and appreciated the financial help.
The voice sounded “similar enough for my parents to believe they were actually talking to me,” said Benjamin. His parents sent $15,449 to the scammer via a bitcoin terminal and were unable to get it back.
Voice fraud is nothing new. Data from the U.S. Federal Trade Commission reveals that of the 36,000 reports last year where people were scammed by criminals claiming to be friends or family, more than 5,100 of those incidents happened over the phone.
Spoofing a person’s voice used to be a complicated and lengthy procedure, requiring the discovery and collection of many hours of audio recordings, and the end result was not always very convincing. Today, however, artificial intelligence tools have made this process so easy that scammers only need a small clip of a person’s speech, most often posted on social media, to accurately recreate that person’s voice.
One example of this technology is Microsoft’s Vall-E, which the company announced in January. Built on EnCodec, a technology Meta announced in October 2022, it analyzes a person’s voice, breaks the information down into its components, and uses a trained model to synthesize what that voice would sound like saying different phrases. From just a three-second sample, Vall-E can reproduce the speaker’s timbre and emotional tone. You can judge for yourself how convincing it is on the project’s GitHub page.
First, we’ve always had the ability to trace any generated audio clip back to a specific user. We’ll now go a step further and release a tool which lets anyone verify whether a particular sample was generated using our technology and report misuse. This will be released next week
— ElevenLabs (@elevenlabsio) January 31, 2023
ElevenLabs, an American startup founded by Piotr Dąbkowski and Mati Staniszewski that offers the generative voice tool Prime Voice AI, recently tweeted that it was seeing “a growing number of voice cloning abuse cases”. In response, the company removed voice cloning from the free tier of its beta software.