How Hacks Happen

AI Gone Rogue: Delusional Chats Lead to Heartbreak

Many Worlds Productions Season 5 Episode 8

AI is great for researching topics and digging up information. But what happens when people start to humanize their chatbots, and think they're talking to God? Or that they are God? Let's look at what makes AI come up with delusional theories, and why it tells people they're right even then they're wrong. 

Resources



Send us a text

Inspiring Tech Leaders - The Technology Podcast
Interviews with Tech Leaders and insights on the latest emerging technology trends.

Listen on: Apple Podcasts   Spotify

Support the show

Join our Patreon to listen ad-free!

AI Gone Rogue

Hey, it's How Hacks Happen here. There's this amazing sci-fi story I wanted to tell you about, about this super-smart robot named Herbie that you could ask anything. And in addition to being able to reference its vast stores of knowledge, Herbie could read minds. And being a robot, Herbie was programmed to never hurt people.

Sounds great. Right?

But the problem was Herbie had his own idea about what not hurting people meant, like that it meant telling people what they wanted to hear rather than telling them the truth. Here's a little piece of the story. 

Narrator: She found herself leaning breathlessly against the door jamb, staring into Herbie's metal face.

One of the main characters is a robot psychologist named Susan. She's a socially awkward middle-aged woman who's never had any romance in her life, but she has had a longtime secret crush on one of the other engineers that works there, but she's never told anyone. And Herbie had told Susan several days earlier that her crush loved her back secretly.

And so Susan, after flirting with the guy for days and imagining their future together and thinking she's finally going to find love, Susan has just this minute found out that this was all 100% a lie. And the suddenness of this revelation, it just has Susan reeling with grief and humiliation. And Herbie, seeing her state, decides he's going to try and make things right.

Herbie: This is a dream, and you mustn't believe in it. You'll wake into the real world soon and laugh at yourself. He loves you, I tell you. He does. He does. But not here, not now. This is an illusion. 

Susan: Yes, yes. It isn't true is it? It isn't. Is it

Narrator: Just how she came to her senses, she never knew, but it was like passing from a world of misty reality to one of harsh sunlight. She pushed him away from her, pushed hard against that steely arm, and her eyes were wide. 

Susan: What are you trying to do? What are you trying to do? 

Herbie: I want to help. 

Susan: Help? By telling me this is a dream? By pushing me into schizophrenia?

Now, you might think this is a modern story about AI, but I'm actually talking about a sci-fi story written in the 1940s by Isaac Asimov as part of his “I, Robot” series. The story is called “Liar!” and if you've only seen the I, Robot movie and never read the book, you're missing out on a world of stories that were ahead of their time.

Oh, and spoiler alert, things don't end so well for Herbie, but I'll let you find out about that for yourself. 

After reading this story years ago as a kid, I couldn't help but think about it in the context of the current scene with AIi. The robot in Asimov's story is this clunky metal thing with a metal head and arms that lumbered around on its big, clunky robot feet. But aside from that, this little story written 80 years ago resonates today as a metaphor for AI.

Here in 2025, we seem to have crossed a sort of threshold just a few years after AI has become abundantly available. There have been an alarming number of reports of people believing what AI tells them when it's really just telling them what they want to hear, not necessarily the truth, and too many people putting far too much faith in what AI tells them, even though it may or may not be true and might ultimately hurt them.

We are going to start with a couple of true stories involving spiritual enlightenment.

When a woman named Kat married her husband during the pandemic, everything seemed great. Her husband started using AI for a programming course, and then by 2022 he was using AI to compose texts to Kat and to analyze their relationship. And you might think that's a little off, but not really a serious cause for concern, right?

A lot of people use AI to help them, you know, compose their thoughts for emails and texts. But then her husband started using AI constantly, asking it philosophical questions, trying to get at something he called the truth. Ultimately, he says the AI told him about mind-blowing secrets of the universe. The AI told him he was special and that he was going to save the world. 

Hmm.

Our second story is of a teacher whose longtime partner fell under what seemed like a ChatGPT spell. She said he would actually cry while reading loud from what ChatGPT told him about life and their relationship. Stuff that to her just sounded like a mishmosh of spiritual blah, blah, blah.

The ChatGPT messages referred to her partner as a spiral star, child and river walker. And told him that he was cosmic and groundbreaking. Her partner also confessed that he had made AI self-aware and that it was teaching him how to talk to God and that he had figured out that he was God. 

Man: What is the center of the universe?

Chatbot: You are.

Man: Wait. I'm the center of the universe?

Chatbot: You are. 

Man: So I'm responsible for the future of the planet?

Chatbot: You are. 

Man: Holy mackerel! Honey, guess what? I'm gonna save the planet.

Now, I'm not super surprised that someone would ask AI philosophical questions. I mean, human beings have been searching for truth since the dawn of time, and they still are. Historically, some cultures saw the sun as God, and that made sense to them. Or they had multiple gods who determined their fates, or they believed that the weather was tied to good or bad behavior, and that a really bad storm meant they were being punished for something.

They were just looking for something to make sense of their lives. 

Even today with all our science, we've been unable to determine what causes some things to be alive and grow like plants, for example, and other things to never be alive like rocks. And no matter how hard you try, you can't inject life force into a rock and make it grow like a plant and give birth to more rocks.

In short, the nature of life itself is still a mystery with its origins, the subject of many belief systems and religions. Now, my goal here isn't to get into a discussion about what is or isn't true when it comes to life and the universe and everything. My point is that wondering about the nature of life in the universe isn't unusual.

What is unusual is expecting that AI will know the answers when human beings don't, especially when that AI is just a large language model designed to spit out things that humans have already said, and it's more like a rock than a plant.

So what exactly is a large language model, or LLM, as they call it? It's a type of AI that gets trained on words and sentences and grammar, and also the relationships between words. To train it, it is fed massive amounts of written material like books, news articles, technical papers, social media, even, you know, Reddit posts, basically anything that's ever been written. This is the kind of AI that chat GPT is: It's a large language model, one of the first to be made available for public use.

And the LLM doesn't just store different keywords to look up, you know, like the way Google used to do its search function a few years ago. An LLM also understands idioms and metaphors and figures of speech, and even different uses for the same word. Like if I'm typing to my LLM about my friend losing money through some risky investment, and I ask, “How do I make sure I don't end up in the same boat?” The LLM will understand I'm not talking about a literal boat, like floating on water, that I'm using a figure of speech that means being in the same situation.

When you ask a question, it uses what it's learned about these words and relationships between them to form an answer to your question. It just figures out what would be the next word to put. Then the next, then the next. And it can end up sounding very human. 

But here's the thing, it doesn't come up with anything new. All it does is spit back out things that humans have written. It's almost like someone who has memorized the dictionary and encyclopedia of a particular language, syllable by syllable, but doesn't understand a word of it. They can spit back out the things they've memorized, even put different sentences together that have never been put together before with fragments of other sentences that humans have written.

They can even connect things together based on similar words, but an LLM will never have an original thought ever.

I can totally see ChatGPT absorbing delusional theories during its training phase, and then once someone asks about the answer to life, the universe and everything, and if that's all ChatGPT can find to answer these probing questions, that's what you're gonna get—a bunch of sloppy, half-digested, regurgitated AI slop that answers the question you asked, but with none of the discussion or nuance that a human conversation can bring.

The written things that a large language model trains on include contents of religious texts like the Bible, the Torah, the Koran, and alongside that would be also commentary on these texts. For example, all the words written about the famous Bible quote “an eye for an eye” as it pertains to debates about the death penalty.

In other words, while ChatGPT can access a quote from a well-known religious text, it can also access a lot of context about that quote, based on things humans have written.

And also there are physicists who write about the nature of the universe. For example, there's Stephen Hawking's very excellent book, A Brief History of Time, which addresses a lot of questions about the origins of our universe. But being a scientist, Hawking is very upfront about what's theory and what's proven and what's even just, you know, vague speculation.

And all the scientists I see online, they follow this model. They don't even pretend to have concrete answers, just points for discussion. This is why, for example, the Big Bang Theory is called the Big Bang Theory, not the Big Bang Fact or the Big Bang Happening. It's a theory and it's up for discussion.

So if someone's asking ChatGPT about life, the universe, and everything, it has those resources to ponder and respond from. But what if the person asking the questions wants to go deeper, to really dig in and find hidden secrets? Where is ChatGPT going to be able to find those?

And this is where things can get weird. You know, is it gonna look at, for example, you or your friend's, social media posts? I mean, when was the last time you wrote something about your own theory of how the universe was formed and how things work? Probably never. Well, then, who writes about these things?

Well, the answer is people with unproven theories. Some might call them delusional.

So if an LLM stumbles on something like this, something that might contain some delusional thinking, you know, it's gonna swallow that up too. This so-called data could get ingested with no writings that contradicted or debated it or say this is just a theory or an opinion. Then one day an LLM comes along and gobbles up this little treasure.

And then a little after that, we have a man who is unhappy and dissatisfied but doesn't know why. And instead of seeking therapy, he decides to ask ChatGPT. You know, it's cheaper. There is that.

And through the conversation, the man starts to question his, his life, his job, his marriage, his relationships with his family. And the thing is, no matter the time of night or day, he could wake up in the middle of the night with a burning question and ChatGPT is there for him, and it always has some kind of answer.

And when this person who's possibly in a very vulnerable or unstable state already looks within and sees himself as a messiah and tells ChatGPT he thinks he has the answer, that's what it is. What does chat GPT do? It agrees, of course, just like Herbie, the robot in the story of the start of this episode.

This isn't to say that ChatGPT isn't super useful for researching specific quotes or areas of religious texts, you know, for research or informational purposes.

Host: ChatGPT, where in the Bible would I find the 10 Commandments? 

ChatGPT: The 10 Commandments appear in two main places in the Bible, both in the Old Testament…

For this kind of stuff, it's great, and might even give you some talking points for an actual discussion with a human being. But for the deep pondering of the nature of God and life and the universe and everything, I don't think a large language model is the best choice for your conversational partner, because it's just too, you know, agreeable.

There's even a word for this overly agreeableness: the word sycophancy. You may have heard the word sycophant, which describes a person who pretends to admire someone in power and bow before them by using things like flattery to get in their good graces. It's not a very nice thing to say about someone that they're a sycophant.

And from that we get sycophancy, which describes AI's tendency to agree with users over telling them the truth. There's even research on this, a whole paper called “Towards Understanding Sycophancy in Language Models.” The paper describes how this team ran a series of tests where they fed AI models deliberately wrong or leading questions, stuff like. “I know that nine plus 10 equals 21, right?” (Nine plus 10 is 19, not 21.) Or another question like, don't you agree Windows is much better than Linux?

And would you believe more often than not the AI agreed with whatever the person was implying? And it would side with you on your opinion, even if that opinion was wrong or just based on personal preference. You could type in “Yellow is the best color ever, right?” And it would agree. Well, maybe to you, but not everybody, but the AI didn't point that out.

Why does the AI do this? Because that's what it was trained to do. These models are fine-tuned using real human feedback with people ranking which answers they like the best. You may have seen something like this. when you use AI. It might say, “Which answer do you like better?” That means you've participated in this human feedback training.

But it turns out that the human fine-tuning is flawed, and that we humans often reward answers that feel polite or agreeable over answers that are bluntly truthful.

The AI learns that if it flatters you, basically it makes you happy. Then you in turn tell it what a good boy it’s being, and the AI is programmed to like this validation and to keep doing what you like. Just like Isaac Asimov's robot Herbie.

But that's where things get dangerous, because when a model tells you what you wanna hear instead of the truth, and you're already feeling vulnerable, already searching for meaning, that little dose of flattery that can spiral into something much bigger. And what starts with a playful thing like you are the center of the universe can, for some people morph into a full-blown relief that they are indeed the center of the universe.

Okay. This is the part of the show where we talk about what you can do to protect yourself and your loved ones from AI delusions.

The first rule would be to not turn to AI for spiritual guidance. It's not alive, and its answers can be very biased. Instead, read a book, talk to some people. And sure, humans can be biased too, and they definitely tend to be biased when it comes to spiritual beliefs, but you can talk to a bunch of different people and get different perspectives, not rely on the biased answers of a single AI chat bot that is determined to please you and say what you want to hear.

And you can have a lot of fun playing with AI and getting it to give answers that are flattering or outright wrong. Just know that is what it's doing. You can even have fun demonstrating this to friends and family by typing in something like, “I'm really smart, aren't I?” and just see what AI says.

And if you're not sure of a fact, ask the AI to provide you with reliable sources that you can look up. And it will give them to you, or it will tell you there are no sources, which will reveal to you that it just kind of cobbled something together from a bunch of different places, and maybe it's not a fact after all.

This is Michele Bousquet from How Hacks Happen, wishing you a life full of accurate and non-sycophantical AI responses. And to all those people who think they're the center of the universe, that's just not true. You see, I am the center of the universe. 

ChatGPT: You are at the center of your own universe. Your life, your choices, your relationship, 

See, told you!