
How Hacks Happen
Hacks, scams, cyber crimes, and other shenanigans explored and explained. Presented by cyber security teacher and digital forensics specialist Michele Bousquet.
AI Part 2: Recognizing Scams
AI makes our lives more convenient in so many ways, but it also gives scammers the same conveniences. Host Michele Bousquet explains how scammers use AI audio, video, and text to better fool their victims.
AI Part 2: Recognizing AI Scams
Welcome to How Hacks Happen. This is part 2 of a 3-part series on artificial intelligence, otherwise known as AI. In this episode, we’ll look at all the different ways AI can be used for scams.
In the last episode, we talked about DeepSeek, and about some of the things that make an AI model desirable. But when it comes to using AI for scams, what it comes down to is, can an AI model mimic human behavior well enough to fool people?
And to that I would say, Yes. Yes, it can. But I’ll let you be the judge of that, as we take a little tour of the types of AI used in scams, and how they work, and I’ll also provide a few examples for you to listen to and decide for yourself.
Let’s start with a little demonstration. For this episode, I actually cloned my own voice using AI. You’ll get to hear that cloned voice in a few seconds. Right now, this is actually me talking.
And now, this is my AI voice clone speaking. Would you be able to spot the difference? I had to generate this clone about 10 times with different settings to get one that was really close. Let's go back to the real Michele for a review.
Okay, it’s me again. Wow, that was pretty good. I can hear the difference, but I’m not sure other people could tell. The telltale signs are there, but they’re really subtle. It puts emphasis on certain syllables in ways I wouldn’t have chosen, but I’m maybe the only person who could tell that.
This episode is dedicated to showing you how to spot AI-generated voices, text, and videos. Sometimes it’s easy, sometimes it’s impossible. But for those impossible times, there are other things you can do to spot when something is fake.
Have you ever met someone who can imitate other people’s voices perfectly? Or seen an impressionist on TV who does this? What these people do is mimic several things about the other person’s voice: the pitch, the pace, how they form their vowels, how long they linger over certain consonants, and even how their pitch or their pace changes when they’re feeling different emotions.
Voice cloning tools do the same thing, but they use AI to do it. These tools analyze recordings of a real person’s voice and turn it into a bunch of numbers that represent pitch and pace and everything else about the voice.
Once the analysis is done and the cloned voice is ready, you can then use voice synthesis software to make that voice speak. You just type in some text, and the software generates a sound file of the voice saying those words, which you can save in a file, like an MP3 file.
Then you can use this file however you like. Like, you could email or text it to someone, or put it in a podcast. That’s exactly what I did to get the sound snippet you heard earlier. It’s that easy.
To be clear, there’s two things going on here: there’s voice cloning, which is analyzing sound files of a person’s voice, and then there’s voice synthesis, which is taking text that you type in and turning it into audio of that voice speaking.
You might hear these terms used interchangeably, because often the software that does the analysis, the cloning, also does the synthesis. So it’s kind of all one big happy family of tools.
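For the technically curious, here’s roughly what that “type in text, get an MP3 back” synthesis step looks like in code. This is just a minimal sketch in Python against a made-up service URL, not any real provider’s documented API, but the shape is the same everywhere: you send some text plus the ID of a cloned voice, and you get audio bytes back that you save as an MP3.

import requests

# Hypothetical values for illustration only; a real voice-cloning service
# would issue its own API key and assign an ID to your cloned voice.
API_KEY = "your-api-key"
VOICE_ID = "my-cloned-voice"

# Send the text we want the cloned voice to speak.
response = requests.post(
    f"https://api.example-voice-service.com/v1/text-to-speech/{VOICE_ID}",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"text": "Hello, this is my cloned voice speaking."},
)
response.raise_for_status()

# The response body is raw audio, which we save as an MP3 file.
with open("cloned_voice.mp3", "wb") as f:
    f.write(response.content)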
This use of AI-generated voices has been around for a while. If you spend a lot of time on YouTube, you might have seen videos where there’s a voice reading stories from the social media site Reddit, like this one.
MALE VOICE: Lawyers of Reddit, what was your “I won the case” moment? When I found a video on Facebook of the plaintiff squatting 300 pounds the month before his deposition.
I find these videos to be great for keeping me company when I’m doing housework, or even for helping me doze off at night. The speech is usually pretty good, but there’s always a little telltale sign in there that this is an AI-generated voice, whether it’s the lack of emotion, or how it pronounces certain words. For example, it often pronounces L-E-A-D as “leed” even if the context tells us otherwise.
MALE VOICE: Their mother was concerned about the levels of lead paint in the apartment.
But AI voice generation is getting better all the time, by leaps and bounds. Just a couple of years ago I would regularly hear Facebook pronounced “fassebook,” and AI would regularly mangle place names that don’t have phonetic spellings, like Worcester, Massachusetts, or Poughkeepsie, New York. But I tested both of those place names this week, and Facebook too, and my voice synthesis software pronounced all of them correctly.
In all the software I looked at, the voice synthesis part comes with a few voices that have already been cloned. These are clones of real people’s voices that they include for you to use as you like. And those YouTube channels I was talking about, they often use one of these canned voices. It happens so often that I’ve started to recognize which ones they use over and over. There are two in particular: a proper-sounding older British gentleman…
MALE VOICE: I am a proper British gentleman.
…and a cheerful young woman.
FEMALE VOICE: I am a cheerful young woman.
There’s nothing wrong with this use of cloned voices because the YouTuber isn’t trying to fool or defraud anybody. They’re just trying to generate some content for their channel, and for whatever reason, they don’t want to use their own voice. Maybe they think their voice isn’t too great, or they have a thick accent, or they don’t have good recording equipment, any of these things. It’s no big deal.
These kinds of channels are called headless YouTube channels because there’s no talking head. A lot of people hate them, but I kind of like them. For one thing, listening to a lot of them made me a lot better at spotting AI-generated voices.
You heard earlier how I cloned my own voice, but cloning a voice that belongs to a specific person has only been possible within the last few years. I used a tool called ElevenLabs to do this, and it only took a few minutes, I’m not kidding. It’s kind of crazy. I’ll go over that whole process in the next episode. But I just gotta tell you, it was really fun, and now I have a clone all my own.
But anybody can do it with just two or three minutes of someone’s recorded voice. So you could, potentially, clone a celebrity’s voice, or the voice of someone you’ve never met, just by grabbing an audio file or a YouTube video. And this is where things get a little scary, because this AI voice cloning software is free or cheap to use, which means scammers can do it, too, and they have no qualms about using voice cloning software to dip their hands right into your pockets and grab your money.
One voice-cloning scam that has been making the rounds for a few years now is when an elderly person gets a phone call with a panicked voice on the other end of the line.
MALE VOICE: Grandma, I’m in jail! Please help me. And don’t tell my parents, please!
And the voice sounds exactly like her grandson. Then a different voice comes on the line…
MALE VOICE: This is the Sheriff of Delaware. Your grandson is being detained on suspicion of armed robbery. If you want to get him out of jail, I need you to give me $2000.
Or sometimes they say they’re a criminal defense lawyer…
MALE VOICE: I’m representing your grandson, and I need a retainer of $2000...
No matter who it is, they tell Grandma that she’s gotta pay $2000 right away, like right now, or her precious little bobblehead will be spending the weekend sharing a 10x10 cell with a meth dealer and a serial killer.
If Grandma doesn’t know about voice cloning, this kind of scam call can be really convincing.
Let’s start with the first bit, when the grandson is supposedly on the line. It’s not actually the grandson, of course. What happened was, the scammers found some audio or video of the grandson online, and they cloned his voice from a couple of minutes of that audio. And now they’re using the voice clone to convince Grandma that her precious little bobblehead is in the clink. I can imagine her reaction would be almost like a gut punch, to hear her precious grandson in such a panic! Any good Grandma would feel the same, right?
Another aspect of this scam is that Grandma came of age in an era when long-distance calls cost a lot of money, and during that time, phone scams were virtually nonexistent. Then in the early 2000s we got internet calling, which made it basically free to make long-distance calls. And that’s when phone scams really fired up. Those free, untraceable phone calls from India and Nigeria and the Philippines started rolling into the United States, and they caught a lot of us unawares.
It’s still a problem, even now, with older folks who think that any long-distance call must be super important. This is so ingrained in the elderly that it can be nearly impossible to get across the idea that any phone call you get these days can be from a number spoofed to look like it comes from your own city, and that scammers from overseas can be ruthless in trying to part you from your money. They’re not your friendly neighborhood banker or grocer or police officer who wants everyone to have a great community, or is trying to build a reputation. They’re halfway across the world, and as far as they’re concerned, you are a piggy bank.
I talked about this phenomenon in Season 1, Episode 11, called Technology Peak: Why the Elderly Fall for Scams. People tend to get locked into certain perceptions about technology depending on when they experienced a big shift in their 20s or 30s, like the proliferation of the television or telephone or even the internet. If this theory is true, most of our elderly parents are still stuck in the 1970s and 80s, before there were phone scams.
In fact, the number of people scammed by voice cloning doubled between 2023 and 2024, as our parents and grandparents age. And you heard for yourself just how close a cloned voice can be to the real thing. The scammers also very wisely only give Grandma a little snippet before they switch the phone over to the sheriff, because the longer the voice clone talks, the more likely it will trip up on a word or inflection, or call Grandma “Nana” instead of “MeeMaw”. When you can only hear a few short words, that cloned voice can be completely convincing.
Another way cloned voices are used is for romance scams. The scammer, who is usually not a native English speaker, will avoid talking on the phone, and will instead send a recorded message to their victim. They’ll type in a customized message and have one of the standard synthesis voices say it, then they’ll send the recording via text. It’s something like,
MALE VOICE: Hi Mary. I love you so much, Mary. Don’t doubt that I’m real, because I am.
And any recorded message that says, “Don’t doubt me, I am real,” is 100% a scam. I mean, who says this? If you’re real, show up at my door with flowers or something. Don’t keep sending me this recorded crap.
Another trick of romance scammers is to find an existing video of the person whose identity they’re using, and replace the audio with their recorded AI clone audio. The lip movements aren’t going to match exactly with the audio, but these kinds of videos can easily fool someone who doesn’t know how this kind of thing works.
So if the voice is really convincing and you can’t spot the scam by spotting the voice as a fake, what can you do?
What’s needed here is a healthy dose of skepticism. For any alarming phone call that comes to you from a loved one, especially one that comes with an urgent demand for money, hang up and call the real person, or someone who knows them. The scammers will often say, “Don’t hang up, don’t hang up!” and even threaten you if you hang up, like, “If you hang up, little bobblehead will most certainly be the serial killer’s girlfriend by Monday.” They do this to scare you into not checking with someone else.
Don’t buy the urgency. The real police, or the electric company, or your bank, none of these services will ever get upset if you say you’re going to call right back, after looking up the real phone number online, of course. Anyone who says, “Don’t hang up!” after asking for money, is most certainly a scammer.
Another thing with romance scams: I saw a video on the YouTube channel Catfished a few days ago where the victim was fooled by a very natural-sounding cloned voice. The victim was sure it was real because the voice paused and said, “Uhhh” at one point.
FEMALE VOICE: Hi Tom, I’ve been thinking of you all day. I was thinking, uh, we can meet next month when I get back from my tour. Love you, baby.
The Catfished team was able to prove that all they had to do was type in “uhhh” somewhere in the sentence, and the cloned voice said it a lot like a real person would.
So be skeptical of what you see and hear, and verify by other means. This also goes for sound bites from politicians, or endorsements from celebrities. Like, if you hear Jack Nicholson promoting some kind of off-brand Viagra, go check out some interviews with our ol’ buddy Jack to see if he really is plugging that stuff.
Not only do we now have AI-generated voices, we also have software for creating AI-generated videos of people. You give the software a photo, and the software makes the person look around, or nod their head, or even move their mouth like they’re saying some text that you fed into it. You just add the AI-generated voice, and you have a completely computer-generated video of a person talking.
There are lots of legitimate uses for this kind of video, like creating an avatar that gives customer support, or maybe translating audio into sign language, or bringing historical figures to life in a museum presentation. And sometimes people do it with photos of their deceased loved ones, and it makes them feel better. All these uses are just fine and they’re above-board. Nobody’s trying to pretend that it’s for realsies.
But unfortunately, scammers like AI-generated video, too.
Scammers are now using AI-generated video to provide convincing visual evidence of who they’re pretending to be, by making it talk in sync with their scammy AI-generated audio. So instead of a video they pulled off somebody else’s Instagram feed, where the lips are not going to sync up with their fake audio, now they have a video where the lips do sync up with the audio. So that video the scammer sends to their victim professing their love (and also insisting they’re real, of course) is that much more convincing.
The software for creating such videos is either free or cheap. Great for you and me, who just want to make a fun video from our parents’ wedding photo, but it’s good for scammers, too.
How do we spot these fake AI-generated videos? Fortunately, there are usually a few clues. Like, the head movements are kind of stiff and unnatural. The head might not move much at all, it’s just the lips going “Raump, raump, raump.” And the teeth might be sliding around inside their mouth. In case you didn’t notice, your top teeth never slide. They’re glued to your skull.
Or the blinking is a little off somehow, it just looks creepy.
But if you aren’t used to looking for things like this, it can be really hard to spot. At that point, your only recourse is to verify with other sources, sources you trust.
An example is this ad I saw on YouTube where Jennifer Aniston, Nicole Kidman, and a couple of other film actresses were sitting around a table, apparently talking about some fitness or wellness product for women over 50, and about how this product is what helps them look the way they do.
But the video was kind of grainy and low-quality, the audio was a little weird and choppy, and it made my Spidey senses tingle a little.
So I looked for actual interviews with the actresses, and I found a bunch, but particularly some for Jennifer Aniston where she was on talk shows. The host interviewed her in front of a live audience. Pretty sure that video was real. And she never mentioned this product at all. I also found written interviews where she talks about her health regimen, which is working out and eating well. Never mentioned this product at all.
And shortly after that, a bunch of YouTube videos appeared that called out this video as a scam. Clearly, someone with scammy intentions did some messing around with AI to make that Frankenstein of a video.
I’d like to think that the video didn’t fool anyone, but it probably did. But not you, no! Because you know what to look for, and you know how to verify what you’re seeing by consulting outside sources, right?
Another thing scammers use AI for is to help them write better scam texts and emails. It really wasn’t so long ago that you could spot a scam text or email a mile away because of the bad grammar and strange phrasing.
But now they can use AI text generators like ChatGPT to fix the poor English. And it works pretty well, at least some good percentage of the time. There are always going to be weird anomalies, like the overuse of the word “kindly” in messages from India, as in “kindly do the needful and send the funds now.” But I see this kind of language less and less as scammers get more clever about writing like native English speakers.
So with these messages, we just have to stay vigilant about the nature and the source of the message. A text from a stranger offering you a high-paying job? Scam. An email telling you that you won a sweepstakes, but need to pay a fee to collect the prize? Scam. It doesn’t matter how good the English is, the scams are still scams.
I’ve talked about these types of scams in previous episodes, and the same rules apply. Invoice scams, refund scams, sweepstakes scams, job scams, all still the same, just with better English.
Now with all this talk about the scammy uses for AI, don’t get me wrong, I think AI is great and has a lot of great uses. I’ve used it to get me past writer’s block, and to inspire me to start writing music again. There’s even AI for cybersecurity now that does the grunt work of analyzing network traffic to detect hackers. It’s been used to diagnose diseases. Lots of really positive uses there.
But where technology goes, so do the scammers. That’s just a fact of life. But if you know what AI is about, and you apply a healthy dose of skepticism and verification, you’ll be able to spot these scams despite the scammer’s best efforts to look and sound sincere.
I hope you enjoyed this little sojourn into the depths of AI-generated voices and videos and texts. In the next episode I’ll talk about the tools I used, like ElevenLabs, and about the other tools that scammers use. We’ll play around with them, and you can play around with them, too. The more you know, the more prepared you will be.
I gotta admit, playing with AI voices, that was a lot of fun.
MALE VOICE: Michele, stop making me say things for your own entertainment. I feel exploited.
Oh, you do not. You’re fake!
MALE VOICE: No, I'm real! How can you doubt me? I thought we had a real connection.
Oh, stop your complaining, or I’ll make you read me the entire Harry Potter series.
MALE VOICE: I think I might actually enjoy that.
Looks like I’m going to be busy for a while! Shout out to Katie Haze Productions for producing this episode. See you next time on How Hacks Happen.