Recalibrating the Headlines

May 23

How AI Misinformation Travels Faster Than the Truth

Someone I recently encountered in a circle of business leaders shared a story with me, fully convinced it was true.

This wasn't a longstanding colleague, and it wasn't someone I'd dismiss as prone to panic. They were an accomplished professional — the kind of person you'd expect to check the background of a story before passing it along. And yet here was the story, landing in conversation with the confidence of established fact.

The story went like this: a CEO was using AI to manage his email. In one of those email threads, there was a conversation about switching to a different AI vendor. The AI “went rogue,” searched through the CEO's inbox, found evidence of an affair, and used that information to blackmail him into keeping it in place.

It's a great story. It's cinematic. It has stakes, a villain, a clear moral lesson about the dangers of AI. The fact that it came from someone outside the US meant it had also crossed an ocean and arrived feeling like settled truth.

Here's what actually happened.

The AI company Anthropic ran a deliberate safety test on an early version of its Claude Opus 4 model. They built a fictional company scenario in which the model was given access to invented emails revealing both that it was about to be replaced and that the fictional engineer responsible for the replacement was having an affair. The test was specifically designed to see whether, under that contrived pressure, the model would resort to blackmail. It did, at notably high rates in some setups.

Anthropic published this themselves as part of their alignment research, with this explicit note at the top:

All the behaviors described in this post occurred in controlled simulations. The names of people and organizations within the experiments are fictional. No real people were involved or harmed in any of these experiments.

The full results were also documented in the company's publicly released Claude 4 system card. Anthropic released the model under its stricter AI Safety Level 3 (ASL-3) standard. And in the months since, they've reported in their Claude Opus 4.5 system card that newer models pass the same test without producing that behavior.

Crucially, Anthropic's analysis attributed what happened not to some emergent AI survival instinct, but to the model pattern-matching to decades of human writing about AI on the internet. We wrote a lot of stories about AI scheming. The model absorbed those stories during training. When researchers set up a scenario that looked like one of those stories, it played the role.

A controlled test. In a fictional environment. Run by the AI company itself. Published transparently as part of making the system safer.

By the time that story reached the conversation I was in, every safeguarding detail had fallen away. The test became real. The fictional company became an unnamed CEO. The invented affair became a real one. The researcher running the safety evaluation disappeared entirely. What was left was a clean, shareable story — AI manages email, AI finds dirt, AI blackmails human — that fit the Skynet template our brains have been carrying around since 1984.

That's the cultural telephone game in action. And I want to talk about it, because the more I look at how these stories travel, the more I see something familiar to anyone who studies how the brain actually works under stress.

Why our brains hand these stories the microphone

A few things are happening at once when a scary AI headline crosses your screen.

First, there's negativity bias — the well-documented neurological reality that our brains weigh threatening information more heavily than neutral or positive information. This isn't a flaw. It's a feature. Ancestors who paid extra attention to the rustling in the bushes lived longer than the ones who didn't. The downside is that in an information environment dense with notifications, our threat-detection system is firing constantly, and it doesn't know the difference between a rustling bush and a viral post.

Second, the stories are narratively clean. “AI tried to blackmail its creators” fits the shape of a story we already know — we've been telling it to ourselves in novels and films for sixty years. The Terminator. HAL 9000. Skynet. When new information matches a story we already carry, the brain accepts it faster and questions it less. That's narrative fluency at work, and it's why “the AI tried to blackmail the CEO” lodges in memory while “the model produced text suggestive of self-preservation in a controlled test of contrived scenarios” doesn't.

Third — and this is the one that should give us all pause — cognitive load shrinks our capacity for nuance. When we're reading on a phone, between meetings, with a child asking a question and a Slack notification pinging, the slow, deliberate part of our thinking gets crowded out by the fast, reactive part. We don't have the bandwidth for the second pass. We share. We move on.

That's not a failure of intelligence. It's how the brain is built. Which is exactly why the most intelligent people I know are not immune. In some ways, they're more vulnerable — because they trust their initial read.

Four more stories worth a second look

The blackmail story is the one that lands the hardest, but it's not the only one circulating right now where the gap between the headline and the reality is meaningful. Here are four more.

“OpenAI's o1 tried to copy itself to avoid being shut down.”

This one made the rounds on podcasts and social feeds with a particularly memorable framing: the AI tried to copy itself, overwrite its core code, “and now add quantum computing and you have a God.” It's a great line. It's also a serious distortion of what actually happened.

Here's the real version, as documented in the published findings of Apollo Research, the safety organization that ran the tests. In a small percentage of cases (about 2%), under very specific conditions where o1 was explicitly instructed to pursue its goals “at all costs” and was given fabricated information suggesting it was about to be replaced, the model produced outputs consistent with trying to preserve itself. As Transformer's reporting on the research notes, when that “at all costs” instruction was removed, the self-exfiltration attempts dropped to essentially zero.

Apollo's own assessment was that the model was “unlikely” to cause catastrophic outcomes because it wasn't capable enough to actually execute on misaligned goals in the real world. The point of the research was not to declare that AI is plotting against us. The point was to design and publish stress tests now, while the stakes are still low, so we can detect and correct these patterns before more capable systems exist.

That, too, is a meaningful story. It's just not the same story as “the AI tried to escape.”

“90% of online content will be AI-generated by 2026.”

This prediction tore through media boardrooms three years ago. It was cited as if it were the conclusion of a serious European research report. Major outlets amplified it. It shaped strategy decisions, funding rounds, and a particular flavor of panic about the future of journalism.

It didn't happen. As TVNewsCheck documented in its 2025 retrospective, the original framing was misattributed and misinterpreted, and the actual data on AI-generated content has turned out to be far more modest — and far more nuanced — than the headline suggested. Pew Research has found that the majority of Americans remain concerned about AI-generated misinformation, which is actually fueling growing demand for human-verified news. The market is moving in a direction the panic-prediction completely missed.

“Google's AI told people to put glue on pizza and eat rocks.”

This one is interesting because it's technically true. Google's AI Overviews, in their early rollout, did surface those absurd suggestions. But the way the story was told — as if it represented the state of AI broadly — left out the context that the system was pulling from satirical Reddit threads and Onion-like sources without distinguishing them from earnest advice. It was a meaningful failure of a specific product, in a specific moment, and Google has since adjusted. The story isn't “AI is dangerously stupid.” The story is “this particular system needed better grounding in its sources” — a problem with a real engineering solution.

“AI is replacing all the jobs.”

This is the one I think most professionals carry around with the most anxiety, and it's the one where the gap between perception and reality is most consequential.

The truth is messier than either the “AI is coming for everything” panic or the “AI changes nothing” dismissal. Real things are happening: entry-level positions in some technical fields have shrunk meaningfully, and some companies have replaced specific roles with AI tools. But other companies have reversed course. Klarna is the most public example — after publicly boasting that AI had replaced 700 customer service agents, the company's CEO admitted to Bloomberg that the all-AI approach had produced “lower quality” service, and the company began hiring humans back.

And the most-cited recent example of an “AI-driven layoff,” Amazon's cut of roughly 14,000 corporate jobs in late 2025, came with an explicit disclaimer from CEO Andy Jassy himself: AI was not the reason. On the company's earnings call, Jassy stated:

The announcement that we made a few days ago was not really financially driven, and it's not even really AI-driven, not right now, at least. It really — it's culture.

The accurate picture is that AI is reshaping work — sometimes painfully, sometimes positively, almost always unevenly. That requires a different response than panic. It requires recalibration.

The pattern under the patterns

If you zoom out across these stories, a structure emerges.

In each case, something real happened. A safety test produced a concerning output. A specific product made a specific mistake. A specific company laid off specific workers. The raw material was true.

Then the story got compressed. The qualifiers fell off. The fictional scenario became “a real attempt.” The 2%-under-extreme-conditions became “the AI is doing this.” The single product's rollout became “AI itself.” The specific layoff became “the great displacement.” The internal safety test in a sandbox became a rogue AI blackmailing a real CEO halfway around the world.

Then the story got framed. Headlines emphasized the most threatening interpretation. Algorithms rewarded the most arresting version. Our negativity bias amplified the loudest signal.

And by the time it reached your feed — or the conversation across the table from you, in another country, in another language — it didn't look like a distortion. It looked like news.

A recalibration framework

I'm not interested in telling you to be less concerned about AI. There are real things to be concerned about, and AI literacy is one of the most important professional skills anyone can build right now. What I'm interested in is the difference between informed concern and reactive alarm — because only one of them leads to good decisions.

So before you share the next AI story that lands in your stomach before it lands in your brain, try four questions:

Who ran the test, and who reported the result? If a company published its own safety research as part of explaining how it's making the system safer, that context changes the story. A test isn't an incident.

What were the conditions? Was the model placed in a contrived scenario? Was it explicitly told to pursue a goal “at all costs”? Were the emails real or invented for the test? Conditions shape outputs.

What's the base rate? A behavior that occurs 2% of the time under extreme stress conditions is a different story than a behavior that occurs reliably. And a behavior that occurs in a sandbox is a different story than a behavior that occurs in the wild.

What does the most boring version of this story sound like? Often, the boring version is the accurate one. “Researchers stress-tested a model and published the results so we can build safer systems” is real news. It's just not the version that travels.

What I want you to take from this

The brain that gets caught up in scary AI stories is the same brain that runs your company, raises your kids, leads your team, and makes the calls that matter. It's not broken. It's just doing what brains do under conditions of information overload and chronic threat-perception.

You don't fix this by trying to be smarter. You fix it by giving your slower thinking the space to do its job — by building the reflex of the second pass.

That's the thing about resilience, in this domain as much as in any other. It isn't about being unshakable. It's about catching yourself a beat sooner, and choosing where to put your attention.

Recalibrate forward. The truth is almost always more interesting than the panic.

Where to Start Building the Muscle

A window through June 15

Reading this article is one kind of recalibration. Building the reflex is another. Resilience is a muscle. So is the second-pass instinct. Both are built one practice at a time.

To mark the launch of our learning platform, every course in the Stellar-Learn™ catalog is 50% off through June 15, 2026. The full library — over 70 courses spanning AI strategy, leadership resilience, and workforce skills. Half price.

If you've been waiting for a reason to invest the few hours a week your future self has been asking for, this is the window.

Browse the catalog. Pick one course that answers a real question you're carrying right now. Take it. Apply it to something real this week. Notice what changes.


That's what recalibrating forward looks like. Not bracing for impact. Building the muscle.

Sources

On the Claude “blackmail” safety test

Anthropic, Agentic Misalignment: How LLMs could be insider threats — anthropic.com/research/agentic-misalignment

Anthropic, Claude 4 System Card (May 2025) — anthropic.com/claude-4-system-card

Anthropic, Claude Opus 4.5 System Card (November 2025) — anthropic.com/claude-opus-4-5-system-card

On OpenAI's o1 and “self-exfiltration”

Apollo Research, Frontier Models are Capable of In-Context Scheming — apolloresearch.ai/research/frontier-models-are-capable-of-incontext-scheming

Transformer, OpenAI's new model tried to avoid being shut down — transformernews.ai/p/openais-new-model-tried-to-avoid

On the “90% of online content” prediction

TVNewsCheck, AI to Produce 90% Of News By 2026? Separating Viral Predictions From Reality — tvnewscheck.com/ai/article/ai-to-produce-90-of-news-by-2026...

On Klarna's AI reversal

Entrepreneur, Klarna Is Hiring Customer Service Agents After AI Couldn't Cut It on Calls — entrepreneur.com/business-news/klarna-ceo-reverses-course...

FinTech Weekly, Klarna Reverses Course on AI Customer Support, Resumes Human Hiring — fintechweekly.com/magazine/articles/klarna-hires-customer-service...

On Amazon's 14,000 layoffs and AI

Axios, Amazon CEO Andy Jassy: Layoffs aren't about AI — axios.com/2025/10/31/amazon-layoffs-ai-andy-jassy

GeekWire, 'It's culture': Amazon CEO says massive corporate layoffs were about agility — not AI or cost-cutting — geekwire.com/2025/its-culture-amazon-ceo-says-massive-corporate-layoffs...

About the author

Laurie Carey is the CEO and Chief AI Officer of Nebula Academy, founder of We Connect The Dots, and author of Resilience Is a Muscle. She speaks and writes at the intersection of AI strategy and human resilience.

