
A modern AI agent can sound fluent and still underperform in the ways that matter: conversations stall, customers get frustrated, and transfers to humans stay stubbornly high.
This usually isn’t a speech problem. It’s a context problem. The agent doesn’t have the right knowledge about what customers are actually asking in real conversations. Teams often respond by adding more knowledge articles or tweaking instructions, but without a clear way to validate whether that content and guidance truly map to customer language, improvement becomes guesswork.
The result is a familiar pattern: you have documented policies, help-center pages, and internal guidance…but customers phrase problems differently, combine multiple issues in one request, or ask for outcomes (“Can you fix this today?”) rather than categories (“Billing policy”). When that mismatch happens, your agent may stall, deflect, or transfer, not because it lacks “intelligence,” but because it lacks coverage.
Regal Improve makes that mismatch visible and actionable: it shows whether your AI agent has the right information and instructions to handle real customer requests.
To ensure your AI agent can accurately and reliably answer customer questions and maximize containment, you need to answer one core question:
Does the agent have relevant coverage for the topics customers are actually raising?
Importantly, “coverage” here is not limited to “did we upload documents?” Coverage can come from multiple resources the agent draws on, including:
- Knowledge base content (help-center articles, policies, documentation)
- Prompt-based Q&A guidance
- Workflow instructions that shape how the agent progresses through a task
That distinction matters because knowledge and instructions fail in different ways:
- Knowledge gaps: the information is missing entirely, or it exists but is phrased differently than customers actually ask.
- Instruction gaps: the agent has the right information but fails to apply it correctly in the flow of a conversation.
The answer to our core question must capture alignment: how well real customer language matches the knowledge and guidance available to the agent. It must then pair that coverage signal with operational outcomes like talk time, transfer rate, and sentiment, so teams can prioritize what actually matters.
Looking at one conversation at a time is rarely actionable at scale. What teams need is a way to see patterns: the recurring “moments” where customers ask for similar things, and where the agent repeatedly succeeds or repeatedly struggles.
Regal Improve analyzes real customer utterances from transcripts, then groups them with other semantically similar customer moments to extract recurring topics and “themes” that show up across thousands of calls. These topics are derived via LLM from what customers actually say, rather than being hand-defined categories, so they reflect the real distribution of customer intent, not just what a team assumes customers care about.
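As a concrete illustration of the grouping idea (not Regal’s actual pipeline, which derives topics via LLM), here is a minimal sketch that clusters utterances by embedding similarity; the model name and cluster count are assumptions:

```python
# Minimal sketch: group semantically similar utterances with
# sentence-transformer embeddings and k-means. Illustrative only;
# Regal Improve derives topics via LLM, not this exact approach.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

utterances = [
    "Can you fix this today?",
    "Is same-day repair possible?",
    "Why was I charged twice this month?",
    "There are two charges on my card.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model
embeddings = model.encode(utterances, normalize_embeddings=True)

# Two clusters fit this toy data; in practice the number of topics
# would be derived from the data, not hard-coded.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
for label, utterance in sorted(zip(labels, utterances)):
    print(label, utterance)
```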
The shift from isolated calls to recurring topics changes how improvement work happens.
At the topic level, you can compare “what customers ask” to “what the agent can draw from” across your most common topic areas.
Coverage is measured using meaning-based matching rather than exact keyword overlap.
That’s a crucial point for real customer conversations: two pieces of text can describe the same intent using very different wording. Modern semantic similarity approaches are designed to represent text so that “similar meaning” ends up “close together,” enabling efficient similarity search even when the words don’t match exactly.
Conceptually, Regal Improve compares a customer moment to the agent’s available resources and asks: how closely does any available knowledge or instruction match the meaning of this request?
Mathematically, coverage is expressed as a similarity score on a 0–1 scale (higher means a stronger match in meaning).
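To make that concrete, the sketch below scores one customer moment against a few resources using cosine similarity of normalized embeddings; with unit vectors the dot product is the cosine, which typically lands near the 0–1 range for related natural-language text. The model and sample text are assumptions, not Regal’s scoring implementation:

```python
# Minimal sketch: score a customer moment against available resources
# by cosine similarity of normalized embeddings. Not Regal's actual
# scoring code; the model and example text are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model

customer_moment = "Can the movers make two stops instead of one?"
resources = [
    "Our service area covers the entire metro region.",      # KB article
    "Multi-stop moves can be added for an additional fee.",  # guidance
]

moment_vec = model.encode([customer_moment], normalize_embeddings=True)[0]
resource_vecs = model.encode(resources, normalize_embeddings=True)

# Coverage = strongest match in meaning across all available resources.
coverage = float(np.max(resource_vecs @ moment_vec))
print(f"coverage score: {coverage:.2f}")
```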
To keep interpretation practical, the dashboard translates raw similarity scores into intuitive bands and text labels.
These thresholds are useful for triage, not absolute judgment. A low score does not automatically mean the agent failed, and a high score does not guarantee the customer had a good experience. The goal is to surface topics worth investigating, then validate them with evidence.
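A banding step might look like the sketch below; the cut points and labels are hypothetical placeholders, not Regal’s actual thresholds:

```python
# Minimal sketch: map a raw similarity score to a triage label.
# The cut points below are hypothetical placeholders, not Regal's
# actual thresholds; bands are for triage, not absolute judgment.
def coverage_band(score: float) -> str:
    if score >= 0.75:   # hypothetical threshold
        return "strong coverage"
    if score >= 0.50:   # hypothetical threshold
        return "partial coverage"
    return "weak coverage"

print(coverage_band(0.82))  # -> strong coverage
print(coverage_band(0.41))  # -> weak coverage
```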
Coverage gaps can come from different sources, and Regal Improve distinguishes where coverage comes from, whether that is knowledge base content, prompt-based guidance, or workflow instructions.
This matters in practice because you’ll often see an asymmetry: a topic may be well covered in the knowledge base but unsupported by prompts or workflows, or the reverse, and each pattern calls for a different fix.
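The sketch below illustrates per-source scoring, so the same customer moment can be compared against each resource type separately; the source names and sample text are assumptions:

```python
# Minimal sketch: score the same customer moment against each source
# of coverage separately. Source names and sample text are assumptions;
# the point is that one source can cover a topic while another doesn't.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model

moment = "Can the movers make two stops instead of one?"
sources = {
    "knowledge_base": ["Our service area covers the entire metro region."],
    "prompt_guidance": ["If asked about multi-stop moves, offer the add-on."],
    "workflows": ["Step 3: confirm the pickup and drop-off addresses."],
}

moment_vec = model.encode([moment], normalize_embeddings=True)[0]
for name, texts in sources.items():
    vecs = model.encode(texts, normalize_embeddings=True)
    # Strongest semantic match within this source alone.
    print(name, round(float(np.max(vecs @ moment_vec)), 2))
```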
Once you prioritize a high-impact topic with poor coverage, the problem becomes operational: what’s missing, where it shows up, and how to address it.
Take a topic like “Requesting Delivery Locations.” It appears frequently, has low coverage, and shows a high transfer rate. That combination tells you this request is common, the agent does not have sufficient context to handle it, and the gap is driving real operational cost.

From there, the next step is to determine the failure mode. Looking at conversations within that topic, you can trace how each request maps to the agent’s available context.
In practice, the gap typically falls into one of three categories:
- Missing context: the information the agent needs does not exist in any resource.
- Alignment issues: the information exists but does not match how customers actually ask.
- Execution gaps: the agent has the right information but fails to apply it correctly.
These failure modes often look identical at a surface level. They all appear as unresolved conversations, long talk times, or escalations. But they reflect different underlying issues, and require different interventions.
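As a rough illustration only, a triage heuristic could route a topic toward a likely category from its coverage score and an outcome metric; the thresholds below are hypothetical, and confirming the category still means reading real transcripts:

```python
# Rough illustration: route a topic toward a likely failure mode from
# its coverage score and transfer rate. Thresholds are hypothetical;
# confirming the category still requires reading real transcripts.
def likely_failure_mode(coverage: float, transfer_rate: float) -> str:
    if coverage < 0.50:   # hypothetical threshold
        # Weak semantic match: content is absent, or phrased
        # differently than customers ask.
        return "missing context or alignment issue"
    if transfer_rate > 0.30:   # hypothetical threshold
        # Relevant content exists, yet outcomes are still poor:
        # likely an execution gap in instructions or workflow.
        return "execution gap"
    return "likely healthy"

print(likely_failure_mode(coverage=0.35, transfer_rate=0.58))
```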
Once the failure mode is clear, the work shifts from diagnosis to intervention. Start by focusing on topics that combine high volume with poor outcomes, then validate the issue against real conversations before making changes.
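One simple way to express that prioritization: rank topics by an impact score that grows with volume, with the coverage gap, and with the transfer rate. The formula, field names, and numbers below are illustrative assumptions, not Regal’s ranking logic:

```python
# Illustrative sketch: rank topics by an impact score that grows with
# volume, with the coverage gap, and with the transfer rate. The
# formula, field names, and numbers are assumptions for illustration.
topics = [
    {"name": "Requesting Delivery Locations",
     "volume": 1200, "coverage": 0.32, "transfer_rate": 0.58},
    {"name": "Billing Policy",
     "volume": 300, "coverage": 0.81, "transfer_rate": 0.12},
]

def impact(topic: dict) -> float:
    # A frequent topic with a big coverage gap and frequent transfers
    # outranks a rare topic with the same gap.
    return topic["volume"] * (1 - topic["coverage"]) * topic["transfer_rate"]

for topic in sorted(topics, key=impact, reverse=True):
    print(topic["name"], round(impact(topic)))
```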
Take "Requesting Delivery Locations" as an example. It appears frequently, has low coverage, and a high transfer rate. When you look at the actual transcripts, you see that customers aren't asking whether the moving service delivers to a given area. They're asking something more specific: can the movers make two stops instead of one? They're moving out of their apartment but need the couch to a self-storage unit across town and the rest to their new place.
That's the goal of this step: isolate what is actually breaking. Is the knowledge missing entirely, is it misaligned with how customers phrase the request, or is the agent failing to execute on information it already has?
That distinction determines the fix. The knowledge base accurately describes where the service operates, but it doesn't account for what customers are actually asking. The agent answers the wrong question well, and the customer transfers anyway.
Missing context requires new or improved knowledge. Alignment issues require reshaping content so it maps to how customers actually ask. Execution gaps require changes to instructions, workflows, or how the agent progresses through a task.
This is what makes the loop effective in practice. Instead of reacting to individual failures or broadly adding more content, teams can target specific breakdowns and apply the right intervention. Over time, that leads to measurable improvements in resolution, efficiency, and customer experience, because changes are tied to real failure modes, not assumptions.
Learn more about how to use the dashboard here.
Improving an AI agent is ultimately an alignment problem: aligning real customer language and needs with the information and instructions the agent can draw on in practice.
In production systems, maintaining alignment is difficult. Customer requests are messy and multi-intent, knowledge is fragmented, and agent behavior depends on how information is retrieved and applied. Most failures come from gaps in coverage, alignment, and execution that only show up at scale. Improving performance requires more than adding content or tweaking prompts. It requires a systematic way to identify where coverage breaks down and tie fixes to real outcomes.
Regal Improve makes that alignment measurable and actionable. By organizing real customer conversations into recurring topics, measuring how well those topics are supported by knowledge and guidance, and tying coverage signals to operational outcomes, it gives teams a practical way to invest improvement effort where it will matter most.
Ready to optimize your agent coverage? Connect with our team.
Why does my AI agent underperform even though we have documentation?
Your AI agent may not have the right context for how customers actually ask for help. Customers often phrase requests unpredictably or combine multiple issues, leading to coverage gaps. Regal Improve helps you identify these gaps at a granular level by analyzing real conversations, surfacing recurring topics, and pinpointing where knowledge, prompts, or workflows are misaligned so you can fix them effectively.
What does “coverage” mean for an AI agent?
Coverage refers to whether the agent has relevant knowledge and guidance to handle real customer requests. This includes knowledge base content, prompt-based Q&A guidance, and workflow instructions. Simply uploading documents is not enough. Coverage depends on how well those resources match real customer language. If alignment is weak, the agent may still underperform even with strong documentation.
How does Regal Improve identify what customers are asking about?
Regal Improve analyzes real customer conversations and groups similar requests into recurring topics. Instead of looking at individual calls, it surfaces patterns across thousands of interactions. These topics reflect what customers actually say, not predefined categories. This allows teams to focus on frequent, high-impact issues rather than isolated edge cases.
How is coverage measured?
Coverage is measured using semantic similarity, meaning it evaluates how closely customer requests match available knowledge and guidance. Scores range from 0 to 1, with higher values indicating stronger alignment. High coverage suggests the agent likely has relevant information, while low coverage indicates a gap. These scores help prioritize which topics need attention, but they are meant for guidance rather than absolute judgment.
What kinds of coverage gaps are there?
Gaps usually fall into three categories: missing context, alignment issues, or execution problems. Sometimes the needed information does not exist at all, while other times it exists but does not match how customers ask questions. In other cases, the agent has the right information but fails to apply it correctly. Identifying the exact failure mode is key to fixing the problem effectively.
Ready to see Regal in action?
Book a personalized demo.



