Every data team on the planet is sprinting to build an "agentic analytics" system. Today, that mostly means a "talk to your data" offering, a chatbot where stakeholders can ask questions and get sophisticated responses in seconds. Tomorrow, it will mean an army of agents that use those analytics to inform decisions for real-world tasks.
But in both cases, one thing is clear: agentic analytics is no longer just a talking point for the executive team or the board. It's a real technology with real value. The question is how it will hold up in production.
The POC-to-production gap is real
At Bobsled, we've spent the past 18 months building agentic analytics solutions with some of the most advanced data teams on earth. Customers have transitioned from small proofs of concept to organization-wide deployments where data is informing real-world decisions.
Anyone who's done this knows that what looks great in a demo gets harder fast at scale. The transition to production brings plenty of challenges. Governance, security, and cost concerns are all real, and they will only get bigger as legal and finance start scrutinizing AI usage more closely.
But the biggest challenge facing data teams today is sustaining accuracy over time. Mistakes happen. An agent, just like an analyst, will make the wrong assumption or present the wrong metric. Trust is built when a system listens, responds, and does not make the same mistake twice. When the system doesn't, when a user asks the same question and it makes the same mistake again and again, that trust is lost and eventually users go elsewhere for answers.
Context drift kills trust slowly, then all at once
The biggest culprit behind deteriorating accuracy is context drift: the gradual separation between the world the agent knows and the world it's meant to represent.
Data teams have dealt with context drift for years. They had 1:1s with business teams, listened, and adjusted. Often, teams relied on a handful of analysts who could span the boundary between the data and business domains. They could take a request, contextualize it within what they knew about the business, and translate it into a SQL query or dashboard.
The challenge with agentic analytics is that all of that knowledge can no longer sit with a person. It needs to be codified and recorded into bits, so that an LLM can interpret it effectively. As context drifts, answers become less relevant. Users stop correcting the system and go back to that same analyst for answers.
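To make "codified into bits" concrete, here is a minimal sketch of one unit of that context: a metric definition that carries the caveats a boundary-spanning analyst would otherwise hold in their head. The structure and field names are hypothetical, not a Bobsled schema.

```python
from dataclasses import dataclass, field

@dataclass
class MetricDefinition:
    """One unit of codified business context that can be handed to an LLM."""
    name: str
    sql: str                 # canonical expression for computing the metric
    description: str         # plain-language definition the model can quote
    caveats: list[str] = field(default_factory=list)  # the tribal knowledge
    last_reviewed: str = ""  # drift often starts when this goes stale

# The kind of context that used to live only in an analyst's head:
churn_rate = MetricDefinition(
    name="churn_rate",
    sql="1.0 * churned_accounts / active_accounts_start_of_month",
    description="Share of accounts active at the start of the month "
                "that cancelled before month end.",
    caveats=[
        "Excludes free-tier accounts since the 2023 pricing change.",
        "Finance reports this on calendar quarters, not rolling 30 days.",
    ],
    last_reviewed="2025-01-15",
)
```

The caveats field is exactly the part that drifts: when pricing or reporting conventions change, this record has to change with them, or the agent keeps answering for a business that no longer exists.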
Building systems that can learn
The key is to build agentic analytics systems that learn — just like that boundary-spanning analyst. Bootstrapping a semantic model is easy. What's harder is interpreting feedback, picking up patterns, and consolidating those learnings into small, incremental changes over time.
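As a rough sketch of what that consolidation step can look like (not Bobsled's implementation; all names here are illustrative): buffer individual corrections, and only propose a change once the same feedback on the same metric recurs, so a one-off comment doesn't churn the context layer.

```python
from collections import defaultdict

def consolidate_feedback(corrections, min_repeats=3):
    """Reduce raw corrections to a few small, well-evidenced proposals.

    corrections: iterable of (metric_name, corrected_definition) pairs,
    e.g. gathered from thumbs-downs and written user feedback.
    """
    tallies = defaultdict(lambda: defaultdict(int))
    for metric, corrected in corrections:
        tallies[metric][corrected] += 1

    proposals = []
    for metric, variants in tallies.items():
        # Take the most common correction for this metric...
        corrected, count = max(variants.items(), key=lambda kv: kv[1])
        # ...but only propose it once the pattern has repeated.
        if count >= min_repeats:
            proposals.append({
                "metric": metric,
                "proposed_definition": corrected,
                "evidence_count": count,
            })
    return proposals
```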
That's why at Bobsled we're constantly investing in a team of learning agents that run in the background, quietly correcting course as drift occurs. Across the systems we've helped deploy, three design principles keep coming up:
- Don't wait for complaints. Thumbs-downs and written corrections are the easy signal. The more useful one is buried in how people actually use the system: a question rephrased twice before someone gives up, two analysts writing conflicting definitions of "churn," SQL that gets longer and stranger because the agent is working around context it doesn't have. A learning system has to read both kinds (see the sketch after this list).
- Fix at the right layer. Most drift gets blamed on the semantic model, and sometimes that's correct. Often, the definition exists but isn't being applied, which is a prompt problem. Or the agent is inferring at query time what should sit in a derived table, which is a data model problem. Patching the wrong layer feels like progress for about a week.
- Let autonomy be earned. An agent that auto-applies its own ideas of "improvement" will corrupt a context layer faster than a human ever could. Start with heavy review, then hand off more decision-making as the team builds a feel for what the system catches and how it's deciding. The analyst still matters. Their corrections just don't have to live in their head anymore.
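Pulling the three principles together, here is a hedged sketch of the shape such a background learning agent can take: mine implicit signals from usage, route each proposed fix to the layer it belongs to, and gate everything behind human review until autonomy has been earned. Every function, field, and threshold below is an assumption for illustration.

```python
from dataclasses import dataclass

@dataclass
class DriftSignal:
    kind: str      # e.g. "rephrase_loop", "conflicting_definition", "bloated_sql"
    evidence: dict

def detect_signals(session: dict) -> list[DriftSignal]:
    """Read the implicit signals, not just complaints (illustrative heuristics)."""
    signals = []
    # A question rephrased repeatedly before the user gives up.
    if len(session["question_variants"]) >= 3 and not session["accepted_answer"]:
        signals.append(DriftSignal("rephrase_loop",
                                   {"questions": session["question_variants"]}))
    # Generated SQL ballooning because the agent works around missing context.
    if len(session["generated_sql"]) > 2 * session["baseline_sql_length"]:
        signals.append(DriftSignal("bloated_sql",
                                   {"sql": session["generated_sql"]}))
    return signals

def route_fix(signal: DriftSignal) -> str:
    """Fix at the right layer; patching the wrong one only looks like progress."""
    if signal.kind == "conflicting_definition":
        return "semantic_model"  # the definition itself is missing or ambiguous
    if signal.kind == "bloated_sql":
        return "data_model"      # query-time inference that wants a derived table
    return "prompt"              # definition exists but isn't being applied

REVIEW_QUEUE: list[dict] = []

def propose(signal: DriftSignal, autonomy: str = "review_all") -> str:
    """Earned autonomy: start with everything reviewed, loosen gradually."""
    fix = {"layer": route_fix(signal), "signal": signal}
    if autonomy == "review_all":
        REVIEW_QUEUE.append(fix)  # a human approves before anything changes
        return "queued_for_review"
    return "applied_with_audit_log"  # auto-apply still records who, what, why
```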
What happens when the users aren't human
Everything above assumes a person is on the other end of the agent. That turns out to be a fairly forgiving setup. Humans rephrase when an answer is unclear, and when they really can't figure it out, they go find an analyst.
The next phase doesn't work that way. Analytics agents are increasingly going to be queried by other agents: a revenue agent picking which accounts to chase, a pricing model deciding what offer to send at checkout, a support agent trying to figure out whether the complaining customer is a churn risk or not. None of them is going to flag a weird answer. They'll take the number and run with it.
That changes the shape of the problem. Drift no longer degrades slowly in the background; it compounds through whatever workflow it landed in. A wrong number gets pulled into one decision, that decision shapes the next one, and you find out weeks later, if at all.
The teams that figure this out early are going to have an edge that's hard to copy. Once humans aren't the safety net anymore, the system has to be.
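To make "the system has to be the safety net" concrete, here is one possible shape for it, assuming answers travel agent-to-agent as structured payloads rather than bare numbers: the envelope carries provenance and confidence, and the calling agent refuses to act below a bar. The field names and thresholds are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AnalyticsAnswer:
    value: float
    metric: str
    definition_version: str  # which version of the semantic model produced it
    last_validated: str      # when that definition last passed review
    confidence: float        # the answering agent's own estimate, 0..1

def safe_to_act(answer: AnalyticsAnswer, min_confidence: float = 0.8) -> bool:
    """The downstream agent's gate. No human will catch a weird number here,
    so the payload itself has to carry enough signal to refuse one."""
    return bool(answer.definition_version) and answer.confidence >= min_confidence

# A support agent deciding whether a complaining customer is a churn risk:
answer = AnalyticsAnswer(
    value=0.34,
    metric="churn_risk",
    definition_version="v41",
    last_validated="2025-01-08",
    confidence=0.62,
)

if not safe_to_act(answer):
    # Escalate rather than compound a drifted number through the workflow.
    print("escalating: answer below the confidence bar, routing to a human")
```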
See how this works in practice
A short demo of how Bobsled approaches agentic analytics — and why a data platform alone isn't enough.

