Austin Vance for Focused

Posted on • Originally published at focused.io

AI Agent Authentication Starts With Workload Identity | Focused Labs

AI agent authentication starts when the system can answer which actor is allowed to make a tool call.

The model can propose the action. The runtime has to attach authority to it.

Most teams start with the fastest answer: an API key in an environment variable. The agent reaches Salesforce, GitHub, Jira, Snowflake, Stripe, whatever system makes the first useful proof feel real, and everyone moves on.

That proof matters. It shows the agent can reach the systems where work actually happens. It also hides the first product decision: who is acting when the tool call leaves the runtime?

The agent gets memory. The agent runs in the background. The agent forks into subagents. The agent retries failed operations. The agent calls tools after the user has walked away. The agent lands in an enterprise workflow where the work has value, the logs have value, and breaking something has a consequence.

A shared API key starts as configuration. Then it quietly becomes the identity of the agent.

An ugly place to stumble into by accident.

The secret becomes the actor

Early security models for agents tend toward good vibes with a bearer token. The prompt gives instructions. The tool schema lists calls. Hard-coded secrets in the runtime decide what actually gets done based on the input, the agent, and whatever authority those secrets carry.

The secret wins.

If the same key can read every customer record, submit refunds, update tickets, and write to production data, the agent has all of those powers. Carefulness in the prompt is theater at that point. The tool description can say those powers apply only when appropriate. The audit log will still show one credential able to perform a pile of different tasks.

There is already a category for this outside agents: OWASP's Non-Human Identities Top 10. Production applications identify themselves as non-human identities. Agents are adding themselves to that growing list of stranger workloads, running differently than normal services, but still requiring access to systems and data.

The important step for me is naming the agent as a workload, because the architecture gets less magical and more useful.

Workloads have identities. Workloads can request scoped credentials for those identities. A workload can be denied a credential. A workload can rotate credentials. A workload can leave an audit trail that survives the model, the prompt, and the v2 or v3 abstraction barrier the team is currently working around.

Baseline authentication for production AI agents: a runtime identity boundary in which the agent requests scoped credentials from an identity broker before calling external systems. The runtime issues tool-specific credentials instead of letting the agent carry a shared key everywhere.

Workload identity is the boring answer

This part is old. Good.

Kubernetes already considers service accounts to be identities of processes running in Pods, and the current docs describe short-lived, automatically rotating ServiceAccount tokens issued through the TokenRequest API. SPIFFE generalizes that into workload identity documents, including short-lived X.509 and JWT SVIDs that a workload can use to authenticate itself to other workloads.
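What makes these tokens useful to a broker is that the claims name the workload and carry a short expiry. A minimal sketch of inspecting those claims — the demo token here is hand-built and unsigned, and the claim values are illustrative of a projected ServiceAccount token, not read from a real cluster; real verification must check the signature against the issuer's keys:

```python
import base64
import json
import time

def decode_jwt_claims(token: str) -> dict:
    """Decode the (unverified) claims segment of a JWT."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Hand-built demo token shaped like a projected ServiceAccount token.
claims = {
    "iss": "https://kubernetes.default.svc",
    "sub": "system:serviceaccount:agents:refund-agent",  # the workload, not a user
    "aud": ["identity-broker"],                          # who may accept this token
    "exp": int(time.time()) + 600,                       # short-lived: ten minutes
}
segment = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
demo_token = f"header.{segment}.signature"

decoded = decode_jwt_claims(demo_token)
```

The `sub` claim is the identity the broker reasons about; the `aud` and `exp` claims are what keep the token from being replayed elsewhere or later.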

Cloud platforms are heading in the same general direction. AWS STS can issue temporary security credentials after a workload has identified itself using OpenID Connect. Google Cloud Workload Identity Federation allows external workloads to access Google Cloud resources without service account keys. Azure managed identity docs describe workload identities as machine and non-human identities associated with compute resources.

The industry knows how to keep long-lived secrets out of the hot path. It just keeps giving agents interfaces that make the old mistake easy.

A developer writes a tool wrapper. The tool wrapper needs credentials. The fastest way to configure it is to add an API key to an environment variable and add a TODO to remove it later. The TODO gets pushed to production because now the agent answers support tickets, reconciles invoices, or looks at CI.

I've worked with teams who reviewed the model, tuned prompts, drew diagrams for tool selection, created a few secrets in deploy config, and crossed their fingers that the tool descriptions would shore it all up.

They are not enough.

Delegation is the missing primitive

In many applications, the agent should rarely hold the credential it uses to act.

Put an identity assertion in the flow. This agent. This tenant. This user context if present. This policy version. This tool request. This approval state. That assertion is exchanged for a credential only when the action needs one.

OAuth was designed to support exactly this shape. RFC 8693 defines token exchange, describing how one temporary credential can be exchanged for another temporary credential intended for a different context. In the agent case, the model proposes an action, the runtime checks policy, the broker issues a credential for that action and tool context, the call happens, and the credential dies.
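The exchange itself is a plain POST to the authorization server's token endpoint. A sketch of building that request body with the parameter names RFC 8693 defines — the audience and scope values here are illustrative, not from any real deployment:

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"

def build_token_exchange_request(subject_token: str, audience: str, scope: str) -> str:
    """Form-encoded body for an RFC 8693 token exchange request."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,      # the agent runtime's workload identity token
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "audience": audience,                # the one downstream system being called
        "scope": scope,                      # narrowed to this one action
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
    })

body = build_token_exchange_request("<workload-jwt>", "payments-api", "refunds:create")
```

The response carries a short-lived access token scoped to that audience and that action, and nothing else.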

It does not expire after a quarter. It does not expire after someone remembers to rotate it. It expires because the system puts expiration in the path.

That changes the damage pattern. A compromised tool wrapper no longer implies broad access to every downstream system. A prompt injection has to cross approval, run, tenant, and policy boundaries. A subagent that escapes its execution boundary cannot reuse credentials after the run, approval, or tenant context has expired.

The agent is still useful. It just has to work through a production boundary that understands production concerns.

This is why integrated agents are valuable and dangerous at the same time. The valuable integrated agents do not live in a chatbot tab. They integrate with real systems. Once an agent is tied to real systems, authentication becomes product architecture rather than cleanup work hidden in deployment.

The runtime owns the identity boundary

A model provider should not own this boundary. A prompt should not own this boundary. A tool schema should not own this boundary.

The runtime owns it because the runtime follows the whole path.

It connects agent definitions to threads or runs, tenants, and identity information, including the user who initiated the work, whether the work is backgrounded, whether a human approved a risky step, which tool is being called, and which downstream credential is being requested. It can attach those facts to an identity assertion and make a policy decision before any assertion leaves the process.

That policy decision can be boring and explicit:

  • The refund tool can request a payment credential for the current tenant.
  • A GitHub tool can request a write credential after CI has produced an eval pass.
  • The Snowflake tool can request a read credential for one warehouse, one role, and one time window.
  • A subagent can run with a delegated identity, but only with fewer capabilities than the parent run.

The list is not impressive, which is why it is powerful.
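That policy table can be written down as code. A sketch under assumed names — the tool identifiers, fields, and rules are illustrative, not a real policy engine:

```python
from dataclasses import dataclass

@dataclass
class CredentialRequest:
    tool: str
    tenant: str
    access: str              # "read" or "write"
    ci_evals_green: bool = False

def policy_allows(req: CredentialRequest, current_tenant: str) -> bool:
    """Boring, explicit rules; deny by default."""
    if req.tool == "refund":
        # Payment credentials only for the tenant the run belongs to.
        return req.tenant == current_tenant
    if req.tool == "github" and req.access == "write":
        # Write credentials only after CI has produced an eval pass.
        return req.ci_evals_green
    if req.tool == "snowflake":
        # Reads only; warehouse, role, and time-window scoping would live here too.
        return req.access == "read"
    return False

assert policy_allows(CredentialRequest("refund", "acme", "write"), current_tenant="acme")
assert not policy_allows(CredentialRequest("github", "acme", "write"), current_tenant="acme")
```

Deny-by-default is the important line: a tool the policy has never heard of gets nothing.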

This is also where multi-agent orchestration gets serious. A supervisor handing work to a subagent creates a delegation relationship along with the task description. The child process needs enough authority to perform the work at hand and no more. The audit log must reflect that chain of trust cleanly or troubleshooting becomes an exercise in futility.
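The attenuation part of that hand-off reduces to a subset check. A sketch, with made-up scope names:

```python
def delegate(parent_scopes: set[str], requested: set[str]) -> set[str]:
    """Issue a child identity with a subset of the parent's capabilities.
    Refuse loudly rather than silently widening the grant."""
    if not requested <= parent_scopes:
        raise PermissionError(f"escalation attempt: {requested - parent_scopes}")
    return requested

parent = {"tickets:read", "tickets:write", "crm:read"}
child = delegate(parent, {"tickets:read"})   # fine: strictly narrower than the parent
# delegate(parent, {"crm:write"})            # raises PermissionError: not held by parent
```

The raise matters as much as the return value: an escalation attempt should show up in the audit log as a denied delegation, not vanish into a quietly broadened grant.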

The worst setup is a swarm of agents all sharing the same service account. Simple enough to get going. Terrible when it comes time to debug an incident. Every action has been performed by the same principal, authenticated with the same key, and observed through the same useless blur.

The incident has no useful actor. Just a shared key with a long memory and no accountability.

A token lifecycle: the agent run creates an identity assertion, exchanges it for a scoped token, calls the tool, writes audit evidence, and the credential expires. Short-lived delegated credentials make the agent run, policy decision, tool call, and audit trail line up.

Audit follows identity

Agent observability without identity is half a story.

A trace for the agent step called refund_customer can include latency, tool arguments, model output, and retries, all visualized in a convenient span tree. Useful. Then someone asks who had authority to issue that refund, and the trace turns into an archaeological dig.

The right trace shows the tool call connected to a principal. Not just a service account. A principal with an agent ID, run ID, tenant, user context, policy decision, credential scope, and expiration time.
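Concretely, that principal is a handful of attributes attached to the tool-call span. A sketch of the shape — the attribute names and values are illustrative (in practice these might become OpenTelemetry span attributes):

```python
# Identity attributes attached to the refund_customer tool-call span.
span_attributes = {
    "agent.id": "refund-agent",                       # which agent definition
    "agent.run_id": "run_8f2c",                       # which run or thread
    "tenant": "acme",                                 # whose data was touched
    "user.context": "initiating-user-id",             # who the work was done for
    "policy.version": "2026-01-rev4",                 # which rules allowed it
    "policy.decision": "allow",                       # what the broker decided
    "credential.scope": "refunds:create",             # what the credential could do
    "credential.expires_at": "2026-01-15T10:05:00Z",  # when it stopped being usable
}
```

With those fields in the trace, the questions below have answers instead of guesses.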

This is what allows a team to answer questions after the tool call has done real work.

Who granted access? What user context did it use? What broker generated the credential? What version of policy allowed it? What downstream resource accepted it? What subagent inherited it? Can that credential be used for something else?

Those questions determine whether there is a real postmortem or just hand waving about the agent doing something weird.

The same principle applies to testing. In Everybody Tests, I argued that every team already tests whether they admit it or not. Agent identity needs that same honesty. If a runtime can create delegated credentials, tests should verify that the boundary holds. A refund agent should fail against the wrong tenant. A code agent should fail when eval gates are red. A research agent should fail when it asks for write access to a system it only reads.

Not a single npx this and that in the whole codebase. Test it in CI.
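Those three failure cases are ordinary assertions. A sketch against a stubbed broker — the agent names, tools, and rules are invented for illustration; in CI the same assertions would run against the real broker:

```python
def broker_issue(agent: str, tool: str, tenant: str, run_tenant: str,
                 access: str = "read", evals_green: bool = True) -> dict:
    """Stub broker: deny cross-tenant requests, red eval gates, and
    writes from read-only agents."""
    if tenant != run_tenant:
        raise PermissionError("wrong tenant")
    if tool == "github" and access == "write" and not evals_green:
        raise PermissionError("eval gate is red")
    if agent == "research-agent" and access == "write":
        raise PermissionError("read-only agent requested write access")
    return {"token": "short-lived", "scope": f"{tool}:{access}"}

def must_fail(fn, *args, **kwargs) -> bool:
    try:
        fn(*args, **kwargs)
    except PermissionError:
        return True
    return False

# The refund agent fails against the wrong tenant.
assert must_fail(broker_issue, "refund-agent", "payments", "other", run_tenant="acme")
# The code agent fails when eval gates are red.
assert must_fail(broker_issue, "code-agent", "github", "acme", run_tenant="acme",
                 access="write", evals_green=False)
# The research agent fails when it asks for write access.
assert must_fail(broker_issue, "research-agent", "crm", "acme", run_tenant="acme",
                 access="write")
```

The happy path gets a test too: the same broker must still issue a credential when the request is in bounds, or the boundary has just become an outage.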

Shared keys hide product decisions

The fastest credential story hides the decisions that matter most.

A shared key hides tenancy. It hides user context. It hides the identity of the agent performing an action. It hides which subagent inherited authority. It hides whether approval was granted. It hides whether the action matched the original request. It hides rotation until rotation becomes an outage.

OWASP's secrets management guidance recommends dynamic secrets where possible to reduce credential reuse and limit the damage when credentials leak. Agent systems need the same pressure, with the additional constraint that the credential must represent the run instead of only the application.

A normal backend service is expected to behave predictably and follow a reliable lifecycle. It accepts requests, implements endpoints, and changes through controlled deployments. An agent runtime for integration automation can select different tools per request, execute work in subagents, retry steps, and continue running after initial user interaction has completed.

So identity has to be more exact.

The credential loaned to the system should assert what it is currently allowed to do. The operating policy should be visible enough to understand the motivation behind the action. The audit trail must persist long enough for a human to traverse the events as they happened.

Moving to a boundary-based platform does not require a full rewrite. Start with one boundary.

Put an identity broker between the agent runtime and the first high-risk tool. Give the agent runtime a workload identity. Have the broker exchange that identity for a tool credential. Associate the decision with tenant, run, and operation. Record the policy decision in the trace. Add a CI test that proves the wrong tenant fails. Expire the credential quickly. Make the failure visible when the broker returns no.

Then move the next tool behind the boundary.
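That first boundary can be sketched end to end — assertion in, scoped short-lived credential out, audit evidence written either way. Everything here is illustrative (the one-line tenant policy, the field names, the TTL), not a production broker:

```python
import time
import uuid

AUDIT: list[dict] = []

def exchange(assertion: dict, ttl_seconds: int = 300) -> dict:
    """Identity broker for one boundary: check policy, record the decision,
    then issue a scoped credential that expires quickly."""
    allowed = assertion["tenant"] == assertion["run_tenant"]  # the one-boundary policy
    AUDIT.append({**assertion, "decision": "allow" if allowed else "deny"})
    if not allowed:
        raise PermissionError("broker said no")               # make the failure visible
    return {
        "token": uuid.uuid4().hex,
        "scope": assertion["operation"],                      # this operation, nothing else
        "expires_at": time.time() + ttl_seconds,              # expire quickly
    }

cred = exchange({"tenant": "acme", "run_tenant": "acme",
                 "run_id": "run_1", "operation": "tickets:read"})
```

A denied exchange still writes an audit record before it raises, so the trace shows the broker saying no rather than a tool call silently never happening.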

The production line

AI agent authentication is the control plane for non-human actors who do work across systems.

Ownership matters here. Security cannot retroactively add this after the agent and its resources have shipped. Platform cannot stash it in a vault path. Product cannot mark it as a checkbox in consent. Identity, delegation, expiration, and audit have to be inherent in the runtime of the agent and how it executes.

The agent should actually be able to act. That is, after all, why we build agents in the first place. That agency should have a workload identity.

Production systems have already worked out parts of the problem. Kubernetes, SPIFFE, OAuth token exchange, cloud workload federation, managed identities, dynamic secrets. They exist because static secrets rot and shared principal accounts make bad worse.

It is a mistake to grant agents an exemption because the interface is conversational.

The model can decide on the next step. The runtime decides whether that step gets a credential.
