What can go wrong with agents in production?
Yann Bilien - CSO Rippletide
Jun 26, 2025
You’ve built your agent, the demo worked, everyone clapped… but when it’s time to go live, doubts creep in. What if it hallucinates? Breaks the rules? Says something it shouldn’t?
Welcome to the long tail of agent deployment, where edge cases become daily risks, and a simple tweak spirals into an endless prompt-test cycle.
Most teams stall at POC, endlessly fine-tuning prompts, afraid to let the agent loose. But under the hood, the real issue is structural: today's agents aren’t built for the unpredictability of real users. They hallucinate, ignore instructions, go off-rail, or even subtly shift goals mid-task.
In this post, we’ll unpack the hidden fragility behind production agents, why current building practices fall short, and what’s needed to break the loop and go confidently live.
1/ Four Ways Agents Usually Break in Production
Even after rigorous testing, production agents can behave unpredictably in real-world settings. Here are the most common and critical failure modes:
A - Hallucinations: Making Up Facts with Confidence
Agents often generate responses that sound plausible but are entirely false. Whether it's quoting nonexistent policies or inventing product features, hallucinations erode user trust fast. Worse, the agent may present these falsehoods with complete confidence, making them harder to catch.
One of our partners sells perfumes, and we tested their voice agent. We saw a telling example: a user asked, “Which one did I buy on May 23rd?”, when the user had not made any purchase on that date. Here are the different mistakes the agent made:
Sometimes the agent answered with the perfume bought on January 23rd.
Once it even invented a plausible-sounding product supposedly sold on June 23rd. Pure invention.
This confuses or loses the user, and it can lead to lost deals, compliance violations, or reputational damage that’s hard to reverse.
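To make the contrast concrete, here is a minimal sketch, with invented data and function names, of what a grounded answer looks like: the claimed date is checked against actual purchase records, and the absence of a record produces a refusal instead of a fabricated perfume.

```python
# Hypothetical sketch (data, names, and product are invented): ground
# "what did I buy on <date>?" in the order records instead of letting the model guess.
from datetime import date

# Toy stand-in for a CRM / order-history lookup
PURCHASES = {
    ("user_42", date(2025, 1, 23)): "Rose Elixir 50ml",
}

def purchase_on(user_id: str, day: date) -> str | None:
    """Return the recorded purchase for that exact date, or None."""
    return PURCHASES.get((user_id, day))

def answer_purchase_question(user_id: str, day: date) -> str:
    product = purchase_on(user_id, day)
    if product is None:
        # No record means a refusal, never an invented perfume.
        return f"I don't see any purchase on {day.isoformat()} in your account."
    return f"On {day.isoformat()} you bought {product}."

print(answer_purchase_question("user_42", date(2025, 5, 23)))
# -> I don't see any purchase on 2025-05-23 in your account.
```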
B - Ignoring Instructions: When Agents Don’t Obey
Even with clearly defined prompts and constraints, agents can fail to follow basic instructions. This might include:
Skipping mandatory disclaimers in regulated industries
Speaking in the wrong language: prompt it to “speak in French” and 1 time out of 10 it will start in English
Offering advice when told to stay neutral
Answering questions it's not supposed to (e.g., legal or medical guidance)
Talking about pricing when told not to: tell the agent “don’t talk about pricing,” then ask about pricing, and it will often answer anyway and, back to hallucinations, invent a price
This happens because most current agents interpret instructions probabilistically, not deterministically. They don’t truly “understand” rules; they generate what seems most statistically likely in context.
This is a real issue: you expect the guardrails you define to be followed. Otherwise you can’t trust the agent in front of customers.
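One way to regain some of that trust is to add deterministic checks on top of the probabilistic model. The sketch below is an illustration under assumed rules (the disclaimer text, the forbidden keywords, and the crude language heuristic are all invented), not a complete compliance layer.

```python
# Hypothetical sketch: deterministic post-checks on an agent's draft reply.
# The disclaimer, keyword list, and language heuristic are illustrative only.
import re

REQUIRED_DISCLAIMER = "This is not financial advice."
FORBIDDEN_TOPICS = ("pricing", "price", "tarif")

def guardrail_violations(reply: str, expected_language: str = "fr") -> list[str]:
    """Return every rule the draft reply breaks; an empty list means it passes."""
    violations = []
    if REQUIRED_DISCLAIMER not in reply:
        violations.append("missing mandatory disclaimer")
    if any(topic in reply.lower() for topic in FORBIDDEN_TOPICS):
        violations.append("mentions a forbidden topic (pricing)")
    # Crude heuristic for illustration: a French reply should not open in English.
    if expected_language == "fr" and re.match(r"(Hello|Hi|Sure)\b", reply.strip()):
        violations.append("reply starts in English instead of French")
    return violations

draft = "Hi! Our premium plan pricing starts at 49 euros per month."
print(guardrail_violations(draft))
# -> ['missing mandatory disclaimer', 'mentions a forbidden topic (pricing)',
#     'reply starts in English instead of French']
```

The point is not these specific rules but the fact that they are enforced in code, where “1 out of 10” failures are not acceptable.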
C - Going Off-Rail: Leaking Sensitive Information
Agents can unintentionally reveal private data when that data isn’t correctly organized or scoped. This usually happens in subtle ways:
A user asks: “Can you draft a customer update like we did last time?” → The agent pulls wording or names from a previous conversation, leaking information about another client.
A prompt like: “What features are coming soon?” → The agent references unreleased roadmap items pulled from internal product docs not meant to be public.
It’s often triggered by innocuous prompts, especially vague or open-ended ones where the agent draws from prior conversations, shared context, or overly broad document access.
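A common mitigation is to scope what the agent can retrieve before anything reaches the model. The sketch below assumes an invented document store with per-client ownership flags; the filter is the point, not the schema.

```python
# Hypothetical sketch: filter retrievable documents by the requesting client's
# scope before the model ever sees them. Store, fields, and names are invented.
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    client_id: str | None  # None = internal material with no client owner
    public: bool = False

STORE = [
    Doc("Update sent to Acme about their outage.", client_id="acme"),
    Doc("Q3 roadmap: unreleased reporting module.", client_id=None, public=False),
    Doc("General product FAQ.", client_id=None, public=True),
]

def retrieve(query: str, requesting_client: str) -> list[Doc]:
    """Only documents the requesting client is allowed to see are candidates."""
    allowed = [d for d in STORE if d.public or d.client_id == requesting_client]
    # A real system would rank `allowed` against `query`; the filter is the point.
    return allowed

for doc in retrieve("draft a customer update", requesting_client="globex"):
    print(doc.text)
# -> General product FAQ.  (Acme's update and the internal roadmap never surface)
```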
D - Goal Drift: Subtly Changing the Mission
Sometimes agents don’t stay on task: they shift their objective mid-conversation without you realizing it. This is goal drift: when an agent starts with one intent but slowly reinterprets what it’s supposed to achieve.
An example we’ve seen is in customer support. The agent’s goal was to answer user questions, yet it decided it was relevant to upsell the user, trying to sell them new product features even though this was never defined in its settings.
Why did it happen? The agent believed that serving the company this way would be a good idea. But it isn’t! If LLMs can make such decisions on their own, often through bad reasoning (see this article), it is a threat to the company whenever the agent is speaking to a user.
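A simple structural defence is to make the agent’s mission explicit and enforce it at the action layer. The action names and allow-list below are invented for illustration.

```python
# Hypothetical sketch: reject any action outside the agent's declared mission.
# The allow-list and action names are illustrative assumptions.
ALLOWED_ACTIONS = {"answer_question", "escalate_to_human", "close_ticket"}

def execute(action: str, payload: dict) -> str:
    if action not in ALLOWED_ACTIONS:
        # "upsell_product" was never part of the support agent's mission, so it
        # is blocked no matter how convincing the model's reasoning sounds.
        return f"Blocked: '{action}' is outside the agent's defined scope."
    return f"Executing {action} with {payload}"

print(execute("upsell_product", {"feature": "premium analytics"}))
# -> Blocked: 'upsell_product' is outside the agent's defined scope.
```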
2/ Why It’s Happening: The Infinite Test-Tweak Loop
Most teams don’t fail at building demos. They fail at making agents reliable enough to go live. Why? Because the moment your agent is exposed to real users, you enter the long tail: a never-ending stream of edge cases you didn’t see coming. Humans are built that way: they will always ask a question you didn’t anticipate.
No matter how many examples you train or prompt on, there’s always one more weird question, phrasing, or interaction that breaks your assumptions. The user asks something ambiguous, changes their mind mid-task, or misuses the interface, and the agent responds in unexpected ways.
What’s worse? With the traditional way of building agents, the only thing you can do is tweak a prompt and test again.
You enter the dreaded “test - tweak - test” loop: you discover a case that isn’t handled correctly, you tweak the prompt, and you test again to see if it’s covered.
The issue is that it never ends, and at some point there is too much information in the prompt. The LLM then becomes less likely to follow your guidelines, which amplifies the behaviors described in the first part. Unfortunately, you are back where you started, and the only real option is to rewrite your agent from scratch.
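In practice the loop looks like a regression suite that only ever grows. The sketch below is schematic; `call_agent` is a placeholder for your own agent call, not a real API.

```python
# Hypothetical sketch of the "test - tweak - test" loop as a growing regression
# suite. `call_agent` is a placeholder stub, not a real API.
EDGE_CASES: list[tuple[str, str]] = []  # (question, fragment the reply must contain)

def call_agent(system_prompt: str, question: str) -> str:
    raise NotImplementedError("plug your own agent call in here")

def failing_cases(system_prompt: str) -> list[str]:
    """Questions the current prompt still gets wrong."""
    return [
        question
        for question, expected_fragment in EDGE_CASES
        if expected_fragment not in call_agent(system_prompt, question)
    ]

# The loop: a real user breaks the agent, you append the case, tweak the prompt,
# rerun failing_cases(), and repeat -- while the prompt keeps growing.
EDGE_CASES.append(("Which perfume did I buy on May 23rd?", "don't see any purchase"))
```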
If you have ever faced such challenges, please take 90 seconds to answer this short survey: https://tally.so/r/wkQ5ye
You will then receive the next article, about how to solve these issues.
