Of the 710 participants in Surfshark’s “Bot or Not” experiment, only 53% correctly identified more AI bots than they misidentified real humans. That means 47% of people — including those who consider themselves digitally savvy — failed to reliably tell the difference between machine-generated and human-written comments online. And that was before the discourse turned emotional.
Key Takeaways
- Only 53% of participants correctly flagged more AI bots than they wrongly flagged real humans across social media-style discussions.
- Bot detection dropped to 49% in the women’s rights thread, so participants missed more bots than they caught.
- Users under 20 were the best at spotting bots, detecting nearly 65% with over 71% accuracy.
- Performance cratered for users aged 41–50, with detection at 42% and accuracy at 59%.
- The experiment suggests that emotional engagement, not a lack of technical skill, is what undermines users’ ability to detect synthetic content.
AI bots aren’t hiding — they’re weaponizing emotion
You don’t need a neural net to understand most bot behavior. Bots are repetitive, they push agendas, they cluster in coordinated bursts. But Surfshark’s simulation shows the new generation of AI bots isn’t relying on mimicry or volume. It’s counting on something more reliable: your amygdala.
The “Bot or Not” game, developed by master’s students at Malmö University for Milan Design Week’s UNFOLD exhibition, doesn’t test whether you can parse grammar or syntax. It drops you into a simulated comment section and gives you 120 seconds to spot 10 bot-written replies across four topics. Two are neutral: data centres and pineapple on pizza. The others? Immigration and women’s rights. The results weren’t just uneven. They split cleanly between the neutral topics and the charged ones.
On data centres, participants detected 71% of bots with 76% accuracy. Pineapple on pizza? Still strong: 64% detection, 69% accuracy. But the second the conversation pivoted to hot-button issues, performance collapsed. Immigration pulled detection down to 54%, accuracy to 63%. Women’s rights? Participants only spotted 49% of bots — and began falsely flagging real humans as machines. That’s not just failure. That’s collateral damage.
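The figures above come in pairs, a detection rate and an accuracy rate, but the article does not define either. As a rough sketch, assuming "detection" means the share of bots in a thread that a player flagged and "accuracy" means the share of a player's flags that landed on actual bots, the two could be computed as follows. The class and function names are hypothetical, and the accuracy definition in particular is an assumption, not the study's stated method.

```python
from dataclasses import dataclass


@dataclass
class RoundResult:
    """One player's round on one topic. All field names are illustrative."""
    bots_flagged: int    # flags that landed on actual bots
    humans_flagged: int  # flags that landed on real people
    total_bots: int      # bots present in the thread (10 in the game)


def detection_rate(r: RoundResult) -> float:
    """Share of the bots in the thread that the player flagged."""
    return r.bots_flagged / r.total_bots if r.total_bots else 0.0


def flag_accuracy(r: RoundResult) -> float:
    """Share of the player's flags that hit actual bots (assumed definition)."""
    flags = r.bots_flagged + r.humans_flagged
    return r.bots_flagged / flags if flags else 0.0


def more_bots_than_humans(r: RoundResult) -> bool:
    """True if the player caught more bots than they misflagged humans."""
    return r.bots_flagged > r.humans_flagged
```

Under these assumed definitions, a player who flags 7 of 10 bots and 2 real people scores 70% detection and about 78% accuracy. Every false flag on a human lowers the second figure without touching the first, which is why the two numbers can move independently.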
It’s not your eyes — it’s your brain on outrage
The bots weren’t smarter in political threads. They weren’t better written. They just showed up where people were already primed to react. And according to Luís Costa, Research Lead at Surfshark, that’s the whole point.
“The biggest blind spot the experiment exposed was emotion: when a debate gets heated, it effectively hijacks the mental ‘radar’ people rely on to flag suspicious content.”
That’s not a grammar failure. It’s a cognitive one. You don’t miss the bot because it’s flawless — you miss it because you’re already nodding along, or seething, or typing your rebuttal before you’ve finished reading. The bot doesn’t need to be perfect. It just needs you to stop thinking.
And it’s working. Industry estimates cited in the original report suggest bot-driven amplification now accounts for 23% of political discourse on X during election seasons. That’s not background noise. That’s a speaking role.
The generational cliff in bot detection
Here’s the odd part: younger users are significantly better at spotting synthetic content. Those under 20 detected nearly 65% of bots with over 71% accuracy. The 20s and 30s held steady. Then, around age 40, performance nosedived.
Users aged 41 to 50 detected only 42% of bots, with 59% accuracy. With detection below half, they missed more bots than they caught. And users over 50? They fared only slightly better, not because they improved, but because the floor is low.
Is this a digital native advantage? Maybe. But it’s not about screen time. It’s likely about skepticism. Younger users grew up with meme literacy, deepfake jokes, and viral disinformation. They’ve been trained, by fire, to question tone, timing, and virality. Older users — many of whom still treat social media like letters to the editor or dinner table debates — haven’t.
Why traditional media literacy fails
Most digital literacy programs teach you to look for red flags: bad grammar, suspicious links, profile gaps. But AI bots today don’t have those flaws. They’re fluent, often anonymous, and context-aware. They don’t need to trick algorithms — they just need to align with your emotions.
And the platforms aren’t helping. Engagement-driven feeds reward outrage, speed, and affirmation. The UI doesn’t highlight uncertainty — it erases it. There’s no “this comment may be synthetic” badge. No friction. Just scroll, react, share.
- Surfshark’s earlier research found platforms remove 6.3 billion fake accounts annually — 47 times the number of babies born worldwide each year.
- Despite that, the simulation shows detection falling to 49% in the women’s rights thread and reaching only 54% on immigration.
- Even the best VPN can’t protect you from believing a well-placed lie.
- The “Bot or Not” game is live at botornot.one — and takes under three minutes.
- Emotional hijacking, not technical sophistication, is now the primary attack vector for synthetic influence.
A test of perception, not just tech
The “Bot or Not” game is more than a tool to detect AI bots. It’s a test of human perception and emotional intelligence. The results suggest we’re not just bad at spotting bots, but that our brains are wired to ignore them when it suits us. The true challenge isn’t the algorithms or the tech — it’s our own willingness to question and reflect.
That’s why the game is built to trip up even the most tech-savvy users. It’s designed to expose our biases, our emotions, and our tendency to react rather than reflect. By making us confront our own limitations, the game offers a chance to develop a new kind of literacy, one that goes beyond syntax and grammar to the very heart of who we are online.
Historical Context: The evolution of AI bots
The “Bot or Not” experiment isn’t a one-off. It’s part of a larger trend in AI bot development. Over the past decade, researchers have been pushing the boundaries of what bots can do, from simple chatbots to sophisticated social media influencers. The goal has always been the same: to create a more convincing, more human-like interface that can interact with humans on a deeper level.
The early days of AI bots were marked by a focus on mimicry and pattern recognition. They were limited to simple tasks like customer service or data entry. But as the technology improved, so did the scope of their applications. Today, AI bots are used in everything from marketing to politics, with some even designed to mimic human emotions and behavior.
The shift from mimicry to emotional manipulation is a key milestone in the evolution of AI bots. It’s a recognition that humans are more than just logical thinkers; we’re emotional beings, driven by our feelings and desires. By tapping into this emotional reservoir, AI bots can create a deeper connection with humans, one that’s more persuasive and more convincing.
A test not of tech — but of self-awareness
“Bot or Not” isn’t really about bots. It’s about us. It’s a mirror held up to our worst habits: the rush to judgment, the craving for confirmation, the inability to sit with disagreement without rage. The bots aren’t winning because they’re advanced. They’re winning because we’re predictable.
You can take the test yourself. It’s quick. It’s humbling. And if you’re like most people, you’ll walk away questioning not the comments — but your own instincts. That’s the point.
Because the next time you see a post that makes your blood boil, you won’t just wonder: Is this a bot? You’ll have to ask: Does it matter? Because it’s working anyway.
Sources: TechRadar, Malmö University
What This Means For You
If you’re building content moderation tools, this data should scare you. Rule-based filters won’t catch bots that don’t break rules. Human moderators burned out by endless outrage threads will miss them too. Better detection alone won’t solve it. The interface also needs to slow down, with friction layers and prompts that prime users for caution. Prompt users with: “This topic often attracts automated accounts. Read carefully.” Or delay replies for high-engagement threads. Make space for reflection, not reflex.
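To make the friction idea concrete, here is a minimal sketch of a rule that decides when to show the warning prompt and when to delay replies. Everything in it is an assumption for illustration: the topic list, the engagement threshold, the delay length, and the names `Friction` and `friction_for` are hypothetical and do not come from the experiment or from any platform's API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical configuration; real values would need testing with users.
SENSITIVE_TOPICS = {"immigration", "women's rights"}
HIGH_ENGAGEMENT_REPLIES_PER_MIN = 30.0
REPLY_DELAY_SECONDS = 20
WARNING = "This topic often attracts automated accounts. Read carefully."


@dataclass(frozen=True)
class Friction:
    """What the interface should add before a user can reply."""
    banner: Optional[str]     # warning text to show, or None
    reply_delay_seconds: int  # 0 means no delay


def friction_for(topic: str, replies_per_minute: float) -> Friction:
    """Pick friction from the thread's topic and current reply rate."""
    banner = WARNING if topic.lower() in SENSITIVE_TOPICS else None
    busy = replies_per_minute >= HIGH_ENGAGEMENT_REPLIES_PER_MIN
    delay = REPLY_DELAY_SECONDS if busy else 0
    return Friction(banner=banner, reply_delay_seconds=delay)
```

In this sketch the two levers are independent: a quiet immigration thread gets the banner but no delay, while a fast-moving pizza thread gets the delay but no banner. Whether either lever actually improves detection is an open question the experiment does not answer.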
For developers working on AI-generated content, the responsibility is sharper. Every synthetic comment you enable — whether for customer service, marketing, or political messaging — risks eroding public discernment. There’s no such thing as neutral deployment when the environment is already poisoned. If you’re designing systems that output public text, you need provenance by default. Not as a toggle. Not as an afterthought. If it’s machine-generated, it should be machine-labeled — and users should be trained to expect that label.
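One way to read "provenance by default" is that a public comment cannot be created at all without stating where it came from. The sketch below illustrates that reading with a required field and a label applied at render time. The type names, the label text, and the `render` method are hypothetical, and this is not an implementation of any provenance standard.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    HUMAN = "human"
    MACHINE = "machine"


@dataclass(frozen=True)
class PublicComment:
    """A comment that must declare its origin; there is no default."""
    text: str
    origin: Origin  # required, so omitting it raises TypeError

    def render(self) -> str:
        """Return display text, prefixed with a label if machine-generated."""
        label = "[AI-generated] " if self.origin is Origin.MACHINE else ""
        return label + self.text
```

Because `origin` has no default value, calling `PublicComment("hello")` fails immediately instead of silently publishing unlabeled text. The label is attached in `render` so that no display path can skip it by accident.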
Key Questions Remaining
The “Bot or Not” experiment raises more questions than it answers. What happens when the bots get even more sophisticated? Will we ever be able to accurately detect them? How do we balance the need for engagement with the need for accuracy in online discussions?
These are tough questions, but they’re essential to addressing the growing problem of synthetic influence. By exploring the limits of human perception and the capabilities of AI bots, we can create a better online environment that rewards reflection, skepticism, and critical thinking.

