Every time we introduce a new bit of AI kit someone asks: “How will we know if it’s working?”
Actually, that’s not strictly true: sometimes people don’t ask. Sometimes people just want to try out something new, demonstrate they’ve completed that action from the board to explore AI deployment, and worry about the impact later, when they’ve got a better feel for how it works and what it might do – but we like to bring folks back to the question anyway. We’re a curious people, and it helps keep everyone focussed on digital transformation as a means to an end, not an end in itself.
But if you’ve been in the room for that conversation, you’ll know the question is usually shorthand for “Will this make us faster/better/cheaper?” – and it’s quickly followed by a conversation about whether people are going to be comfortable with those objectives, and how we measure them.
The thing is, good AI isn’t always neat or comfortable for everyone, and it doesn’t just speed up tasks – it changes them. And that means traditional KPIs often miss the point. Let me explain.
The wrong measures come easy
When we talk to organisations about AI adoption, it’s tempting to default to the things that are easy to track: log-ins and usage figures, accuracy rates, headline time savings.
These things are useful up to a point, but they only tell part of the story. An inevitable burst of early usage usually reflects curiosity, not impact. And 100% accuracy might matter less than usefulness, relevance and reliability – and what are you comparing it to? Were the AI-generated meeting notes 100% accurate, or were they, probably more fairly, at least as good as a person would have made them? Were they less open to bias and lapses in concentration, easy to edit, and did they free up resource to do something else instead?
On top of that, any tool might be technically brilliant but end up quietly sidelined if it doesn’t feel safe or relevant to the team. There’s always the risk that you set it up and announce it in such a way that it’s just… off-putting. Creepy, even. Gen AI can be that way.
Reframing the question
As well as “is it accurate?” or “are they using it?” we like to ask:
✅Does it make people more confident?
✅Does it give them more time and space to think?
✅Does it help them ask better questions?
✅Are they using it in ways we didn’t expect?
A recent blog from Platforms & Algorithms, “The Many Fallacies of ‘AI Won’t Take Your Job, But Someone Using AI Will’”, captures this thinking beautifully:
“Improving a process that AI will soon eliminate is a misallocation of resources. The real advantage is not in making existing workflows faster, but in being first to build the new ones that won’t need those steps at all.”
If the workflow you’re optimising is already on the way out, making it 30% faster is not the win you think it is. It’s just the beginning.
In fact, we’d go further: Is your AI driving better retention? Job satisfaction? …wellbeing?
What we can measure
One of the things we’re able to do with Leading AI RAG solutions is see how tools are being used: the prompts people write, the responses they get, how long it takes, what it costs. We don’t connect that information to individual users – that gives us the same ick (and legal issues) as reading their mail without permission. But at scale, that usage data means we can spot patterns in what people ask, estimate the time saved on each kind of query, and see where a tool is earning its keep – and where it isn’t.
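As a rough illustration, here’s the shape of the usage record we’re talking about – a minimal sketch with hypothetical field names, not our production schema. The important bit is what’s deliberately absent: nothing that identifies the person behind the prompt.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class UsageEvent:
    """One logged interaction. Note: no user ID field, by design."""
    timestamp: str         # when the query happened
    prompt: str            # what was asked (reviewed in aggregate only)
    response_time_ms: int  # how long the answer took
    tokens_used: int       # drives the cost figure
    cost_gbp: float        # estimated cost of the call
    session_hash: str      # groups a session without naming a person

def log_event(prompt: str, elapsed_ms: int, tokens: int,
              rate_per_1k_gbp: float, session_seed: str) -> UsageEvent:
    # Hash the session identifier so the stored value can't be walked
    # back to an individual account.
    session_hash = hashlib.sha256(session_seed.encode()).hexdigest()[:12]
    return UsageEvent(
        timestamp=datetime.now(timezone.utc).isoformat(),
        prompt=prompt,
        response_time_ms=elapsed_ms,
        tokens_used=tokens,
        cost_gbp=round(tokens / 1000 * rate_per_1k_gbp, 4),
        session_hash=session_hash,
    )
```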
We let our private GPT help us work out the time savings, but when we test the logic on our customers, their view is that the assumptions behind the estimates are, if anything, conservative.
Take social workers: their time could not be more precious. Over 900 of them in North Yorkshire now use a private AI, trained on all the policies and procedures they need to follow. Faced with a complex case, or issues they hadn’t dealt with before, they would previously have had multiple conversations with colleagues to gather input and understand best practice, every time they needed to plan their next steps.
We know this because we asked a bunch of them. What we didn’t hear was “First, I search on a file name or I look in the folder structure for the correct documents.” Meanwhile, the clock is ticking.
Now, they ask their AI policy buddy (Polly), and get an immediate answer, along with links to relevant sources they can double-check. Polly resolutely refuses to tell them what to do, but it will tell them what to consider, what steps are required, and where exactly to check the rules.
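Under the hood, that pattern – retrieve the relevant policy passages, then answer only from them, with sources attached – is simple to sketch. The function below is a hedged illustration, not Polly’s actual code; `retrieve` and `complete` stand in for whatever vector search and chat model you run.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Passage:
    text: str
    source_url: str

def answer_policy_question(
    question: str,
    retrieve: Callable[[str, int], List[Passage]],  # your vector search
    complete: Callable[[str], str],                 # your chat model
    top_k: int = 5,
) -> dict:
    """Retrieval-augmented answer with sources: the pattern, not the product."""
    # 1. Pull back the policy passages most relevant to the question.
    passages = retrieve(question, top_k)

    # 2. Ask the model to answer only from those passages, and to set out
    #    considerations and required steps rather than a decision.
    context = "\n\n".join(p.text for p in passages)
    prompt = (
        "Using only the policy extracts below, explain what the worker "
        "should consider and which steps are required. Do not recommend "
        "a decision. Note which extract supports each point.\n\n"
        f"Extracts:\n{context}\n\nQuestion: {question}"
    )
    answer = complete(prompt)

    # 3. Return the answer with links so the worker can double-check.
    return {"answer": answer, "sources": [p.source_url for p in passages]}
```

The design point sits in the prompt: the model is constrained to the retrieved extracts and explicitly told not to recommend a decision – exactly the behaviour described above.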
That’s not just faster (although it is, let’s be clear, considerably faster); it’s empowering. It’s less dependent on who’s available and how easy you find it to ask the question, and it makes it more likely you’ll follow the latest advice on best practice rather than repeat the pattern of whatever happened before. It reduces the cognitive load all round. And it creates space for something much more interesting.
The 80% time saving on each query quickly adds up, but the real impact is the potential for more informed conversations, a better outcome and a stronger workforce.
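To make that arithmetic concrete, here’s a back-of-the-envelope calculation. The baseline and query volume are illustrative assumptions, not North Yorkshire’s measured figures:

```python
# Illustrative numbers only - swap in your own baseline and volumes.
minutes_before = 25      # chasing colleagues for input and best practice
minutes_after = 5        # asking the assistant and checking the sources
queries_per_week = 10    # per worker
workers = 900

saving_pct = (minutes_before - minutes_after) / minutes_before
hours_saved_per_week = (
    workers * queries_per_week * (minutes_before - minutes_after) / 60
)

print(f"{saving_pct:.0%} saved per query")        # 80% saved per query
print(f"{hours_saved_per_week:,.0f} hours/week")  # 3,000 hours/week
```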
Because what’s truly exciting isn’t just what the AI replaces (do you think social workers ever get to the end of their ‘to do’ list – with or without AI?) – it’s what it unlocks.
Time reimagined
We’ve watched people use AI not just to complete the task they had in mind, but to go beyond it: pausing to reflect on complex cases, talking options through with colleagues, and trying the tool on problems nobody planned for.
This isn’t just time saved. It’s practice improved. It’s professionals making space for reflection, discussion, creativity, and learning. It’s also people asking: what else could this help with?
And that, really, is what we should be measuring. Smart, targeted RAG AI solutions are a gateway to more innovation; everyone asks us, “What’s next?”
What you measure matters — but so does how you measure it
Here’s the tricky part. Even with good intentions, the wrong KPI will backfire. People don’t respond to metrics as neutral observers: we adapt to them, we game them, we fear them, and sometimes we contort our behaviour just to hit them.
The behavioural science here is well established, so it’s worth keeping in mind as you go: some metrics all too easily trigger behaviours that don’t serve the outcomes you actually want. Performance incentives must be paired with context and discretion. In practice, once a measure becomes the target, people optimise the number rather than the outcome it was meant to stand for.
I could write a whole blog about how to measure impact and promote psychological safety – which is what you need for high-performing teams – but this is meant to be about AI and I’m getting off topic.
(Re)enter AI – and better feedback loops
This is a space where AI can quietly do something brilliant.
Framing good survey questions to capture the more nuanced benefits of new tech is hard. Analysing open-ended feedback is harder. But that’s what you need to get at the real impact, and with generative AI you can draft sharper questions, summarise hundreds of free-text answers in minutes, and surface themes and sentiment you’d otherwise miss. And you don’t have to wait six months for a formal review: you can build fast, lightweight loops that tell you how people are feeling about their new tool – not just review what they’re typing into it and say, “Ooooh, interesting.”
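As a sketch of that loop – with `complete` again standing in for whatever chat model you already run – theme extraction over free-text survey answers can be as simple as:

```python
from typing import Callable, List

def summarise_feedback(
    responses: List[str],
    complete: Callable[[str], str],  # your existing chat model
) -> str:
    """Turn a pile of free-text survey answers into themes - a sketch."""
    joined = "\n---\n".join(responses)
    prompt = (
        "These are free-text answers to the question 'How has the new AI "
        "tool changed how you work?'. Group them into themes, note the "
        "overall sentiment of each theme, and quote one representative "
        "answer per theme. Flag anything that sounds like a risk.\n\n"
        f"{joined}"
    )
    return complete(prompt)
```

Run something like this weekly on a two-question pulse survey and you have a feedback loop measured in days, not quarters.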
So… how do you capture that?
Not with a dashboard of numbers. Definitely not with traffic lights. Almost never with a single metric; measures should always come by the basket. But maybe with a few better questions:
✅Are people still using the tool six weeks in – or has it become background noise?
✅Have they started tweaking prompts to get better results?
✅Do they talk to colleagues about it?
✅Is it showing up in unexpected places (training materials, case notes, 1:1s)?
✅Do people feel more confident in their decisions – and less alone?
In other words: are things changing for the better?
If the answer is yes, then: yes – it’s working.
We know that doesn’t fit into a quarterly KPI report. You can still pop those time savings and usage tables in there for now – just be ready to turn the conversation quickly to: how are we using that time?
Bonus honesty content
We made some mistakes with this stuff when we started out, so we hope you can learn from them.
First, we were very focussed on time savings as the main impact – and they are significant, probably even greater than the analysis suggests, once you account for the full amount of time people spend doing things the old-fashioned way. But if you’ve read this far, you already know that’s not the full story. We're all about the other benefits too now.
Second, one of our AI tools – now our second most-used – was in such demand that we rolled it out a couple of months early. And everyone was so delighted that they didn’t really want to stop and tell us how they felt before and after using it. We missed the chance to gather some survey gold.
The lesson: baseline the KPIs you want to monitor fast – even if it feels premature. Then go and talk to people. Ask how it feels. Record the call (with permission), and drop the transcripts into an AI tool for analysis. That way, you won’t miss out on the choice quotes, the human insight, or the data that shows just how wildly successful your AI project has been.