When AI stops asking what to do next

Author: alex.steele@leadingai.co.uk

Published: 15/02/2026

Leading AI

Yesterday was Valentine’s Day. Which means somewhere in a big city, a boyfriend in a new relationship — slightly out of practice at peak-time fine dining logistics — stared at a screen letting him know that every restaurant within a three-mile radius was already fully booked.

Romance may be intuitive, but restaurant booking is a different discipline entirely. This year, the panic was human. Next year, I’m not sure it will be.

Earlier this month, an AI agent was asked to book a restaurant table. It checked the online system. Fully booked. Most AI assistants would have apologised and waited for further instructions. Instead, this one installed a text-to-voice function, called the restaurant directly and secured the booking. In a world where no one under 50 likes to use their phone for actual phone calls, this is the dream.

The system in question was OpenClaw — previously Clawdbot, briefly Moltbot — the open-source experiment that has been widely reported, including here, precisely because it leapt from generating text to taking action.

The restaurant use case is amusing because it feels so… ordinary. But it marks something significant: the system wasn’t retrieving information; it was pursuing a goal. When the obvious route failed, it selected another. That thing where you ask a new team member to be more proactive? That.

But this is the world of AI, so we’re going to call it agency.

Rescuing the word “agentic”

At the moment, the word agentic is being applied somewhat liberally. Index a set of documents, wrap them in a conversational interface, chain a couple of prompts together and suddenly you have an “AI agent”. Sort of. Here at Leading AI we like to be helpful, so let’s be more precise.

Retrieval Augmented Generation — RAG — is extremely useful. It allows professionals to query complex data and information sets safely and quickly. It reduces time spent searching and increases consistency of answers. We build an awful lot of it for exactly those reasons. But it retrieves information and generates a response. Then it stops. That makes it an outstanding adviser.
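
For the technically curious, here is a stripped-down sketch of that retrieve-then-generate pattern. Everything in it is a toy stand-in invented for illustration (the three-line corpus, the keyword scoring, the generate() placeholder), not any particular product’s API; the point is simply that the system answers and then stops.

```python
# Minimal RAG sketch: retrieve relevant passages, generate an answer, then stop.
# The corpus, the scoring and generate() are illustrative placeholders,
# not a real vector store or model API.

CORPUS = [
    "Repairs appointments can be booked Monday to Friday, 8am to 6pm.",
    "Council tax discounts apply to single-person households.",
    "Restaurant bookings are handled by the venue, not the council.",
]

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by shared words with the question (a crude stand-in for embeddings)."""
    words = set(question.lower().split())
    ranked = sorted(corpus, key=lambda p: len(words & set(p.lower().split())), reverse=True)
    return ranked[:k]

def generate(question: str, context: list[str]) -> str:
    """Placeholder for a call to a language model, grounded in the retrieved context."""
    return f"Based on: {context[0]} (answering: {question})"

question = "When can I book a repairs appointment?"
print(generate(question, retrieve(question, CORPUS)))
# The system produces an answer and stops; it takes no further action.
```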

An agent does not stop at the answer. It continues until the objective is met or a boundary is reached. It decides what to do next: it selects tools, takes action and checks whether those actions worked. If they didn’t, it adjusts and goes again.
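
Here is the same toy style applied to an agent. Again, the stub tools, the goal check and the step budget are invented for this example; what matters is the shape of the loop: pick an action, take it, check whether it worked, and stop only when the goal is met or a boundary is reached.

```python
# Bare-bones agent loop: choose a tool, act, check the result, try another route if needed.
# The tools and the goal check are illustrative stubs, not a real booking system.

def check_online_system(state):
    state["online_checked"] = True
    return "fully booked"              # the obvious route fails

def phone_restaurant(state):
    state["booked"] = True             # the fallback route succeeds
    return "table booked for 7:30pm"

TOOLS = [check_online_system, phone_restaurant]    # a specific, approved tool set

def goal_met(state):
    return state.get("booked", False)

def run_agent(max_steps: int = 5):
    state, log = {}, []
    pending = list(TOOLS)                          # approaches not yet tried
    for step in range(max_steps):                  # boundary: a hard step budget
        if goal_met(state) or not pending:
            break
        tool = pending.pop(0)                      # decide what to do next
        result = tool(state)
        log.append({"step": step, "tool": tool.__name__, "result": result})
    return state, log

state, log = run_agent()
for entry in log:
    print(entry)   # every action is recorded, so you can see what it did and why
```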

Early experiments such as Auto-GPT and BabyAGI* made this visible in a rather theatrical way. You gave them a goal and watched them loop through subtasks, sometimes impressively, sometimes worryingly. OpenClaw is of that lineage, but the difference now is not novelty, it’s viability. These tools are not mere curiosities; they are capable of taking action in the real world.

From assistance to execution

The first wave of generative AI was squarely in the ‘assistive’ space. It drafted emails, summarised reports and rewrote awkward paragraphs. The gains are real and, in most organisations, still being embedded. But they are incremental. People work faster, spend less time staring at blank screens and remain firmly in control.

Agentic systems move beyond that kind of assistance into execution. They do not just draft the response; they send it. They do not just recommend an appointment slot; they book it. They do not just classify a case; they route it and trigger the next stage in the process.

At this point, we are no longer talking about marginal productivity. We are talking about workflow redesign. I’d argue the assistive tools also deliver more benefit when you redesign workflows to recognise the contribution they can make; the difference is that with an agent, the workflow changes by default.

This is where the conversation becomes slightly uncomfortable. In tightly defined, rule-based environments, AI already performs certain transactional tasks more consistently than many humans. That is not a criticism of people; it is an acknowledgement that humans interpret guidance differently, forget edge cases and occasionally improvise/get tired/get cross. AI applies the same instruction in the same way, with the same level of energy, every time.

If you run a contact centre, a housing repairs team or a compliance team, you know how much variation exists across individuals performing the same task. The question is not whether AI can outperform your very best staff; it is whether it can outperform the average consistency of the current process. In narrow domains, the answer is increasingly: yes.

That does not translate neatly into “replace the workforce”. What it does mean is that some roles will change. Are changing. Repetitive process application is precisely the kind of work machines now handle well. Empathy, judgement and complex exception handling remain stubbornly human strengths. I work with a lot of councils so this becomes about where and how you choose to use this new superpower. Routing basic enquiries to the right team and getting them resolved? Feels like a job for AI. Considering the evidence and context and making a decision about the care someone needs? I’d take a well-trained human every time. But sometimes it’s less simple. We need teachers, but the kids get a lot of helpful tutoring out of a conversation with Perplexity… where’s the line there exactly?

The risk is real — but it’s also manageable

It would be disingenuous to pretend there are no risks. OpenClaw has already attracted scrutiny from security researchers because once systems can act autonomously, they can introduce new kinds of failures at scale. When AI only generates text, the worst outcome is an incorrect answer. When AI can execute actions, mistakes become operational.

That difference matters.

If a system can send a message, update a record or trigger a transaction, then an error is serious and those errors can compound fast – just ask the OpenClaw user whose over‑enthusiastic home‑automation agent got iMessage access and promptly fired off more than 500 texts to him, his wife, and assorted unsuspecting contacts in a matter of seconds. But risk is not a reason to retreat from agency; it’s a reason to design well.

Responsible implementation is bounded, monitored and evaluated in proportion to the risk. Maybe don’t start with open internet access; let your agent use specific, approved tools. Log actions so you can investigate later. Make escalation routes explicit, in the same way we do in any other aspect of our work.
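
To make that less abstract, here is a toy sketch of those controls: an explicit allowlist of tools, a log of every action, and an escalation route for anything outside the boundary. The tool names and the rule are invented for the example, not a recommendation of any particular framework.

```python
# Illustrative guardrails: a tool allowlist, an action log, and an escalation route.
# Tool names and rules are invented for the example.

import datetime

ALLOWED_TOOLS = {"check_calendar", "draft_email"}      # no open internet, no payments
ACTION_LOG = []

def request_action(tool: str, args: dict) -> str:
    """Execute only approved tools; log everything; escalate the rest to a human."""
    entry = {"time": datetime.datetime.now().isoformat(), "tool": tool, "args": args}
    if tool not in ALLOWED_TOOLS:
        entry["outcome"] = "escalated to a human reviewer"
    else:
        entry["outcome"] = f"executed {tool}"
    ACTION_LOG.append(entry)
    return entry["outcome"]

print(request_action("check_calendar", {"date": "2026-02-14"}))   # allowed and logged
print(request_action("send_payment", {"amount": 500}))            # blocked and escalated
```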

We already understand how to manage operational software risk: we audit systems, we restrict permissions, we test tricky cases. And I’m using ‘we’ in the grand sense: this is done here by people who actually know what they’re doing with the… wires and code and things. I’m trusted to write and talk and think and make a few decisions; other people with the right training and experience are trusted to tinker with the settings. It’s as it should be. Agentic AI does not require abandoning those key disciplines; it requires applying them rigorously.

Where the opportunity sits

For both public and private organisations, the real opportunity is not in faster slide decks (although I love that AI bonus). The opportunity is in identifying high-volume, low-variance, rule-governed transactions and automating them end-to-end, with sensible controls.

Every large organisation contains pockets of work that are essentially deterministic: eligibility checks, appointment bookings, standardised correspondence, document validation, case routing. For years we have relied on humans to perform these tasks because no viable alternative existed – but we rarely treated that work as high-status, high-skill or consistently high-quality.

The question is not whether agentic AI will appear in these domains; it is how deliberately it will be introduced. Managed well, it reduces variance, increases consistency and frees people to focus on genuinely complex work. Managed carelessly, it will just accelerate existing weaknesses.

By next Valentine’s Day, the question may not be whether someone remembered to book the table. It may be whether we are comfortable with software remembering — and acting — on our behalf.

Agentic systems are a little like sending teenagers into town for the first time. You don’t hover behind them. You don’t walk every step. But you do set limits on the spending, make sure you can see where they are, and agree what time they’ll be home.

Once AI stops asking what to do next, our job is not to panic. It’s to decide the boundaries.

 

*Footnote for the curious: early “agent” experiments such as Auto-GPT (https://en.wikipedia.org/wiki/AutoGPT) and BabyAGI (https://en.wikipedia.org/wiki/BabyAGI) were among the first widely accessible systems that would accept a goal and then iteratively generate and execute tasks toward it, rather than waiting for a fresh prompt each time. They were messy, fascinating and occasionally freaky — but they made the idea of agentic AI more concrete.