Alex Rezvov

The Future of User Interfaces and the Role of AI

The Scene

"Find me marathon shoes for the next race. Neutral stride, sixty kilometers a week, budget around two-fifty."

A minute later three options come back, each with reasons. I say "give me the second one, size 43." Another minute later: payment confirmation, delivery date, no five-step forms. I never went near a website. I never tapped a button in someone else's UI. I don't even have an account at this store.

This isn't science fiction. It's 2026. OpenAI and Stripe shipped the Agentic Commerce Protocol. Google and Shopify shipped the Universal Commerce Protocol. Visa, Mastercard, Amazon, Walmart, and Etsy are on board. AI-driven traffic to Shopify is up 8x year over year, and orders from AI-powered search are up 15x. There's an LLM between me and the store.

For the skeptics: what I described above already runs at maybe seventy or eighty percent. Inside ChatGPT you can buy from Etsy and onboarded Shopify merchants through Instant Checkout, which launched in February 2026. In March, OpenAI walked some of it back; for many Shopify merchants the actual checkout now happens on the merchant's storefront, inside an in-app browser. The agent also doesn't pull a personal fitness profile by itself yet. The rest closes over the next year or two.

This piece is about how the user interface is changing and where the LLM sits in that change.

Today's UI Is a Barrier

The "user interface" today is a pile of separate sites and apps. Online stores, dating apps, social networks, banks, insurance portals, hospital portals, government portals. Each one with its own menus, its own vocabulary, its own funnels. Every user has to figure each one out from scratch.

And most of what those products can do never reaches the user.

Microsoft's team found this when they were planning the 2007 Office redesign:

"A study had shown that about 90% of the feature requests for Microsoft Office were for features already in the product. One of the major design goals for Microsoft Office 2007 was making features easier to discover. People just didn't know what was already there."John D. Cook, "Did the MS Office ribbon work?" (2009)

Users were asking for things that already existed. They just couldn't find them.

Twenty years on, the same problem is systemic enough that Gartner tracks a whole category around solving it: Digital Adoption Platforms. These are overlays that sit on top of business software and whisper hints to the user about what the software actually does. According to Gartner's Market Guide for DAP (September 2025), the DAP market grew around 28% year over year to $1.04 billion in 2024, with projected growth of 15 to 20% in 2025. An entire industry exists to tear down the UI barrier on software people have already paid for.

What I take from this data: people keep asking for capabilities that already exist in the product. The features aren't broken. The interface is a barrier, and you have to fight it to find what you need. A whole class of enterprise software exists just to take that barrier down.

What's Already Happening: Delegation to Agents

People no longer have to navigate every interface themselves. First came research delegation: ChatGPT, Perplexity, Claude doing the searching, comparing, reviewing on the user's behalf. Then autonomous browsing: Anthropic Computer Use, OpenAI Operator, Google Project Mariner. Agents driving browsers for you. In 2025 and 2026, e-commerce got direct agent APIs: two competing standards, ACP (OpenAI + Stripe) and UCP (Google + Shopify, with Amazon, Mastercard, Visa, Meta, Microsoft, and Walmart behind it).

Other verticals (dating, social, banking, healthcare, government) are still at "open API for partners," without dedicated agent protocols. E-commerce ran first; the rest will follow the same pattern.

The technology is in place. The infrastructure is being built right now. There's going to be an LLM intermediary between the user and the business. The traditional UI stops being necessary for everyday operations. It survives as an engineering debug tool, and in a handful of specific cases.

Two Surfaces of the Future

The intermediary has two forms of interaction with a business:

  1. Agent-mediated. The user has a personal agent. The agent talks to the business through a programmatic API (REST, MCP, agent-to-agent protocols), negotiates, buys, signs, tracks. The user never sees the business's website, forms, or menus. From the user, the agent needs a goal and delegation rights. From the business, it needs an API and its documentation.

Personally, I don't see a reason to replace REST with something new for this (a minimal sketch of that surface follows this list). I'll write that up separately one day.

  2. Single-window text-or-voice. When the user doesn't have a personal agent (an older person who never set one up; a public terminal at the airport or hospital), they end up in one window, shared across all their interactions. No sites, no apps, no navigation. They just say what they want, in plain language.
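To make the first surface concrete, here's a minimal sketch of an agent buying over plain REST. Everything in it (the endpoint paths, the payload fields, the delegation token) is a hypothetical illustration, not ACP, UCP, or any real store's API:

```python
import requests  # plain HTTP client; nothing agent-specific required

STORE_API = "https://store.example.com/api"  # hypothetical endpoint

def agent_checkout(goal: dict, delegation_token: str) -> dict:
    """Sketch of an agent buying on the user's behalf over REST.

    `delegation_token` stands in for whatever credential proves the
    agent may act for this user; the real protocols define their own
    identity and payment flows.
    """
    # 1. Search the catalog with the user's constraints.
    offers = requests.get(
        f"{STORE_API}/products",
        params={"category": "running-shoes", "max_price": goal["budget"]},
        timeout=10,
    ).json()

    # 2. The agent (or its LLM) picks the offer that matches the goal.
    chosen = offers[0]  # selection logic elided

    # 3. Place the order, citing the delegation credential.
    order = requests.post(
        f"{STORE_API}/orders",
        json={"sku": chosen["sku"], "size": goal["size"]},
        headers={"Authorization": f"Bearer {delegation_token}"},
        timeout=10,
    )
    return order.json()
```

The point isn't the three calls; it's that the business's entire obligation here is a clean, documented API.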

The fundamental difference between these two surfaces is who owns the conversation's context.

In the first case, the user's agent prepares and stores the context. The business just gives a clean API. What the user's goal is, what the next steps are, what parts of the history matter: all of that lives with the agent.

In the second case, the business prepares and stores the context. It has to remember who this user is, where they are in the dialogue, what facts have been collected, what prompt and what model fit this moment, when to move to the next state.

Which raises the question: what does "has to remember and switch" actually mean?

Why a Smart LLM Alone Won't Cover It

The natural answer is: feed everything into a long prompt, give the model the whole history, let it figure things out. Each new model release figures things out a little better. So why do you need a structural layer at all?

Five structural reasons why "just a smart LLM" doesn't get you there. None of them go away with a better model.

Auditability. Regulators and compliance frameworks (SOC 2, HIPAA, PSD2 SCA, GDPR Article 22) expect a deterministic, replayable path through the sensitive parts of a conversation. "The strong model made the right call" isn't the kind of justification an auditor accepts without serious follow-up questions. A state graph gives an explicit path, tied to policies, facts, and transitions, and that cuts the cost of an audit substantially.

Editability without a developer. A compliance officer should be able to change one state without involving engineering. In a megaprompt, that kind of edit risks regressing everything else. In an FSM (a finite state machine) it's a local change with an explicit blast radius.

Personalization without leaks. In the LLM-only approach, all per-user facts usually sit right in the system prompt or in the context window. The model sees everything about every active user and gets to decide what to use. One slip (a jailbreak, a prompt injection, a routine hallucination) and facts from one session show up in another. A state machine inverts that: facts live in a fact store outside the prompt, each state declares which facts it needs, and only those facts go into the prompt for the current turn. The model physically doesn't have the data the current state isn't entitled to. PII leakage is its own category on the OWASP LLM Top 10 (LLM02:2025), which jumped to #2 in the 2025 revision (it was #6 in 2023).
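Here's what that inversion can look like in code. A minimal sketch under assumed names (State, fact_store, and build_prompt are illustrative, not ExoChat's actual API): each state declares the facts it may see, and prompt assembly pulls only those.

```python
from dataclasses import dataclass

@dataclass
class State:
    name: str
    prompt_template: str
    required_facts: tuple[str, ...]  # the only facts this state may see

# Facts live outside any prompt, in a store keyed by user.
fact_store: dict[str, dict[str, str]] = {
    "user-123": {
        "weekly_km": "60",
        "shoe_size": "43",
        "payment_token": "tok_abc",  # exists, but shopping never sees it
    }
}

def build_prompt(state: State, user_id: str) -> str:
    """Assemble the turn prompt from only the facts the state declares."""
    facts = {k: fact_store[user_id][k] for k in state.required_facts}
    return state.prompt_template.format(**facts)

shopping = State(
    name="shopping",
    prompt_template="Customer runs {weekly_km} km/week, shoe size {shoe_size}.",
    required_facts=("weekly_km", "shoe_size"),
)

# The model never receives payment_token here; there is nothing to leak.
print(build_prompt(shopping, "user-123"))
```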

Cost and latency. A long prompt that "knows everything" doesn't scale. Every turn pays for re-loading the whole history. Per-state assembly gives each step the minimum it needs. That's parsimony as an architectural requirement, not as an optimization.

Operator-managed evolution. When the business process changes, the FSM changes: one state, one transition, one prompt. Without the structural layer, any change to the business process means a developer-grade release of the megaprompt, then re-validating and re-testing the whole thing.

A structural layer is needed regardless of how smart the model gets.

Under the Hood: A State Machine for a Marathon Shoe Store

Let's take the scene from the top of the article and walk a customer through the full life cycle, from first visit to a product exchange. I'm picking states at the level of life-cycle phases, not micro-steps. That's where an LLM without an FSM actually starts losing context and dropping policies.

Sign-up and authentication. What this looks like in the LLM-intermediary world is genuinely an open question. Maybe identity gets confirmed through the user's agent provider, and a separate account at every shop is no longer required. Maybe guest checkout with email survives. Maybe a unified identity layer shows up. Whatever form wins, the FSM still needs a state here: this is where the shop decides who's on the other end and which policies apply from here on.

Shopping and cart. "Find me marathon shoes, about sixty kilometers a week, budget around two-fifty." State shopping. The LLM runs the conversation: asks what matters (cushion, pronation, current shoe), filters the catalog, presents options with reasoning. Policies specific to commerce live on this state. For example: don't recommend a model with a drop under 4 mm if the customer's history shows a stress fracture. That policy isn't smeared across a giant prompt; it's bound to shopping and active only here. When the user says "give me the second one, size 43," the choice goes into the fact store, and the FSM checks the exit conditions on shopping (stock, shipping, price agreement) before transitioning to checkout.
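A sketch of how that policy and those exit conditions can be bound to the shopping state instead of smeared across a global prompt. The field names (drop_mm, history_stress_fracture, chosen_sku) are hypothetical:

```python
def no_low_drop_after_fracture(candidate: dict, facts: dict) -> bool:
    """Policy active only in `shopping`: block low-drop shoes after a
    stress fracture. Returns True if the candidate may be recommended."""
    if facts.get("history_stress_fracture") and candidate["drop_mm"] < 4:
        return False
    return True

def can_exit_shopping(facts: dict) -> bool:
    """Exit conditions on `shopping`, checked before moving to checkout."""
    return all(facts.get(k) for k in ("chosen_sku", "in_stock", "price_agreed"))
```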

Checkout. State checkout. Payment method, shipping address, final price, explicit confirmation. Transition into payment_processing with the external provider, then state payment_confirmation waits on a webhook. For high-value orders, step-up authentication kicks in automatically. Every transition is written to the audit trail. Only after a successful charge is recorded does the state move to order_placed.
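A sketch of the checkout leg as an explicit transition table, with every step appended to an audit trail. Event names are illustrative, and a real engine would persist the trail rather than keep it in memory:

```python
import time

TRANSITIONS = {
    ("checkout", "confirmed"): "payment_processing",
    ("payment_processing", "provider_webhook_ok"): "payment_confirmation",
    ("payment_confirmation", "charge_recorded"): "order_placed",
}

audit_trail: list[dict] = []

def advance(session: dict, event: str) -> str:
    """Apply one event; refuse anything the graph doesn't allow."""
    nxt = TRANSITIONS.get((session["state"], event))
    if nxt is None:
        raise ValueError(f"illegal event {event!r} in state {session['state']!r}")
    audit_trail.append(
        {"ts": time.time(), "from": session["state"], "event": event, "to": nxt}
    )
    session["state"] = nxt
    return nxt

# Usage: a payment webhook handler is just another caller of advance().
session = {"state": "checkout"}
advance(session, "confirmed")            # -> payment_processing
advance(session, "provider_webhook_ok")  # -> payment_confirmation
advance(session, "charge_recorded")      # -> order_placed
```

A trail built this way is also what answers the regulator's question further down.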

Post-purchase follow-up (proactive). Two weeks after delivery, the FSM starts the conversation itself: "how are the shoes treating you?" State post_purchase_followup collects feedback and classifies the answer. All good → case closed. Neutral → feedback goes to analytics. A problem → the FSM moves to support_request.

Support request (reactive). "They're rubbing my heel." State support_request. The LLM gathers more facts (for how long, photos if needed, wearing patterns), and the FSM classifies the case: warranty defect, wrong model, wrong size, simple break-in. Based on the classification, it transitions to one of the terminal states: dismiss_with_advice, refund, or exchange.

Exchange with a human in the loop. State exchange. The FSM doesn't try to settle this on its own. It calls a tool: request_back_office_review(case_id, reason, photos, customer_history). The case goes into the shop's back-office queue, the FSM moves to awaiting_human_review, and the conversation with the user pauses ("we've got this, we'll be back to you within the day"). A real person on the shop's side (warehouse, customer service) looks at the case, approves it, asks for photos, or declines, and their decision comes back through a webhook. The FSM picks up the conversation with the user from there: "exchange approved, the courier picks up the old pair on Thursday." That last bit is unique to a state-machine-class system: automation → tool call → human → callback → automation, all inside one uninterrupted conversation with the user.
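A sketch of those pause-and-resume mechanics, again with illustrative names: the tool call parks the case, the state freezes at awaiting_human_review, and the webhook handler is the only thing that can move it forward.

```python
back_office_queue: list[dict] = []

def request_back_office_review(case_id: str, reason: str,
                               photos: list[str], customer_history: dict) -> None:
    """The tool call from the walkthrough: park the case for a human."""
    back_office_queue.append({
        "case_id": case_id,
        "reason": reason,
        "photos": photos,
        "history": customer_history,
    })

def enter_exchange(session: dict) -> None:
    """On entering `exchange`: hand off to a human and pause the dialogue."""
    request_back_office_review(
        session["case_id"], session["facts"]["issue"],
        session["facts"].get("photos", []), session["facts"],
    )
    session["state"] = "awaiting_human_review"
    # Nothing else can move this session until the review webhook fires.

def on_review_webhook(session: dict, decision: str) -> None:
    """The back-office decision resumes the same conversation."""
    assert session["state"] == "awaiting_human_review"
    session["state"] = (
        "exchange_approved" if decision == "approve" else "support_request"
    )
```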

If a regulator asks "why didn't this user get the mandatory disclosure at checkout?", the shop can show the trace: this state, this validator, this fact, this result. The same shop on the same LLM would work without the state graph, but the auditing and debugging would have to happen by hand, reading through long prompts and message logs.

A CMS for the Post-Website World

The analogy I carry: what CMS was for the web, an FSM engine is for post-web dialogues.

CMS defined page types, content models, navigation, permissions, publishing workflows. The website was the product; the CMS was the invisible backbone.

An FSM engine defines states, fact schemas, transitions, policies, escalations. The dialogue is the product; the FSM is the backbone.

The main thing they share: both layers define structure statically, ahead of time. A page in a CMS exists until someone edits it. A state in an FSM and the instructions inside it are fixed. That's exactly what gives you determinism and auditability: the operator knows what instructions the model will receive when the user lands in state X. Not "the model will figure it out," but "these are the exact instructions for this state."

There's dynamic stuff too: which state the user is in right now, which facts get injected, which slice of history matters this turn. But the skeleton inside which that dynamism happens is fixed. That's the guarantee, not a limitation.

What This Means for People Who Build Products

The unit of design is no longer a screen. It's a state. The UX designer becomes a conversation architect: they design the state graph, write per-state prompts, define transition validators, sketch fact schemas. It's parallel to how UX moved from page-by-page design to component systems in the 2010s. Now the components are conversational states.
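As a design artifact, the state graph can be as mundane as a declarative structure the conversation architect owns and reviews. A hypothetical slice (none of these keys are ExoChat's actual schema):

```python
# Declarative, diffable, editable without touching engine code.
STATE_GRAPH = {
    "shopping": {
        "prompt": "You are a running-shoe advisor. Ask about mileage first.",
        "facts": ["weekly_km", "shoe_size", "injury_history"],
        "validators": ["no_low_drop_after_fracture"],
        "transitions": {"sku_chosen": "checkout"},
    },
    "checkout": {
        "prompt": "Confirm payment method, address, and final price.",
        "facts": ["chosen_sku", "shipping_address"],
        "validators": ["mandatory_disclosure_shown"],
        "transitions": {"confirmed": "payment_processing"},
    },
}
```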

I know what I'm writing about because I'm building exactly this kind of system: ExoChat, an engine for managed LLM dialogues.

One more thing I noticed while writing this piece: the same argument runs recursively. The agents themselves (ChatGPT, Perplexity, Claude, Operator) could use an FSM skeleton on the inside. I'll write that one up separately too.

One disclaimer before we wrap: not all UIs disappear. Creative tools (Figma, Premiere, Logic Pro), code editors (IDEs), and games stay canvas-based. The LLM intermediary still shows up alongside them: AI agents inside Figma, Copilot, Claude Code, and Cursor in IDEs, generative pipelines and LLM-driven NPCs in gamedev. The canvas as a way to work isn't going anywhere; voice and text won't replace it. This article isn't about those domains. It's about the interaction between a user and a business in pursuit of a product, a service, or information.

What I'm Taking Away

  1. REST APIs as the default agent surface. That's already the standard in everything we build. The article just confirms the direction.
  2. Pilot ExoChat against a real online store. It's both a real test of the single-window surface and a strong demo.
  3. And you? What does this change for the product you're building?

Originally published: The Future of User Interfaces and the Role of AI — Alex Rezvov's Blog
