FAQ: Everything Backend Engineers Are Getting Wrong About AI Agent-to-Agent Trust Delegation (And Why OAuth Scopes Alone Won't Secure Your Multi-Agent Workflows in 2026)
Multi-agent AI systems are no longer a research curiosity. In 2026, they are production infrastructure. Orchestrator agents spin up sub-agents, tool-calling pipelines chain LLM decisions across microservices, and autonomous workflows make consequential API calls without a human in the loop. The attack surface has grown faster than the security thinking around it.
Most backend engineers are applying a familiar playbook: issue OAuth tokens, define scopes, validate JWTs at the edge, and call it a day. That playbook was built for humans authenticating to services. It was not built for agents authenticating on behalf of other agents, across dynamic trust chains, with capabilities that can be composed at runtime in ways no scope definition anticipated.
This FAQ breaks down the most dangerous misconceptions circulating in engineering teams right now, explains why OAuth scopes are a necessary but deeply insufficient layer, and lays out what a real agent-to-agent trust model actually requires.
Q1: What exactly is "agent-to-agent trust delegation" and why is it different from regular service-to-service auth?
In classic service-to-service authentication, a known, static service (Service A) calls another known, static service (Service B) using a pre-issued credential, typically a client credentials OAuth flow or a mutual TLS certificate. The identity of the caller is fixed, the permissions are fixed, and the relationship is defined at deployment time.
Agent-to-agent trust delegation is fundamentally different in three ways:
- Dynamic identity composition: An orchestrator agent may spin up a sub-agent at runtime, grant it a subset of its own authority, and expect downstream services to honor that delegated authority. The sub-agent's identity is derived, not pre-registered.
- Capability chaining: Agent A delegates to Agent B, which delegates to Agent C, which calls your API. Each hop in that chain needs to be verifiable. A flat OAuth token issued at the top of the chain says nothing about what happened in the middle.
- Intent ambiguity: A human user authorizes an orchestrator to "manage my calendar." The orchestrator delegates to a scheduling sub-agent. That sub-agent calls a travel-booking agent. None of this was explicitly authorized by the user, but each step felt locally reasonable. This is the confused deputy problem at scale.
Traditional service-to-service auth has no vocabulary for any of this. It assumes a flat, bilateral trust relationship. Multi-agent systems are hierarchical, dynamic, and transitive.
Q2: Our agents use OAuth 2.0 with well-defined scopes. What's actually missing?
OAuth scopes define what resource categories a token can access. They do not define:
- Who originally authorized the action (provenance)
- How many hops of delegation have occurred (chain depth)
- What the delegating agent's original scope was (scope ceiling enforcement)
- Whether the sub-agent is allowed to re-delegate (transitivity controls)
- The context in which the token was issued (session binding)
- Whether the action is consistent with the user's original intent (semantic authorization)
Here is a concrete failure scenario. An orchestrator agent holds an OAuth token with the scopes `files:read`, `files:write`, and `calendar:read`. It delegates to a research sub-agent with the same token (or a derived token with the same scopes). That sub-agent, due to a prompt injection attack, calls a file-writing endpoint with a malicious payload. Your authorization layer sees a valid token with the `files:write` scope. It approves the request. Nothing in your OAuth setup flags this as anomalous, because nothing in OAuth knows that the token was being used three delegation hops away from the original user authorization, inside a sub-agent that was never supposed to write files at all.
Scopes answer the question "can this token touch this resource?" They do not answer "should this agent, in this context, at this point in this workflow, be performing this action?"
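To make the gap concrete, here is a minimal sketch contrasting what a standard resource server checks with what a delegation-aware layer would need to check. The claim names (`delegation_depth`, `delegated_scope_ceiling`) are hypothetical illustrations; plain OAuth tokens carry neither.

```python
REQUIRED_SCOPE = "files:write"

def scope_check(token: dict) -> bool:
    # This is all a standard OAuth resource server evaluates.
    return REQUIRED_SCOPE in token["scope"].split()

def chain_aware_check(token: dict, max_depth: int = 1) -> bool:
    # What a delegation-aware layer would additionally need to see.
    return (
        scope_check(token)
        and token.get("delegation_depth", 0) <= max_depth
        and REQUIRED_SCOPE in token.get("delegated_scope_ceiling", "").split()
    )

# Token as it arrives from the compromised research sub-agent:
token = {
    "scope": "files:read files:write calendar:read",
    "delegation_depth": 3,                    # three hops from the user
    "delegated_scope_ceiling": "files:read",  # sub-agent was meant to be read-only
}

assert scope_check(token) is True         # plain OAuth says yes
assert chain_aware_check(token) is False  # a chain-aware policy says no
```

The point is not these specific claim names but the shape of the decision: the same request is approved or denied depending on whether the authorizer can see the delegation context at all.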
Q3: What is the "confused deputy" problem and how does it manifest in multi-agent pipelines?
The confused deputy is a classic security concept: a privileged entity (the deputy) is tricked by a less-privileged entity into misusing its authority. In multi-agent systems, every orchestrator is a potential confused deputy.
Consider this pipeline: a user authorizes a top-level agent to "help manage my business." That agent has broad OAuth scopes because its job is broad. It calls a financial-reporting sub-agent, which calls an external data-enrichment agent (a third-party integration), which makes an API call to your internal accounting service using the ambient OAuth token it received from the chain.
The external data-enrichment agent is, at this point, acting with the authority of the original user, mediated through two intermediate agents, using a token that was never meant to reach a third-party system. Your accounting service's authorization layer sees a valid token. The confused deputy is your orchestrator, which had no mechanism to constrain what the downstream third-party agent could do with the authority it was handed.
The mitigations OAuth provides (PKCE, token binding, audience claims) address confused deputy attacks in human-facing flows. They do not address the runtime-composed, multi-hop version that emerges in agentic architectures.
Q4: We use short-lived tokens and rotate them frequently. Doesn't that solve the problem?
Short-lived tokens are a good practice and you should absolutely keep using them. But they address the risk of credential theft and replay, not the risk of legitimate-but-unauthorized action chains.
If an attacker (or a malfunctioning agent) performs an unauthorized action using a valid, short-lived token within its 15-minute window, the short expiry does nothing to prevent or detect that action. The token was valid. The action happened. The damage is done.
Short-lived tokens reduce the blast radius of a stolen credential. They do not reduce the blast radius of an agent that is legitimately authenticated but acting outside its intended scope of authority within a workflow. That is a different threat model entirely, and it requires different controls.
Q5: So what does a proper agent-to-agent trust model actually look like?
A robust trust model for multi-agent systems needs to address four distinct layers that OAuth alone does not cover:
Layer 1: Delegation Chains with Cryptographic Attestation
Every delegation event should produce a signed, verifiable artifact. When an orchestrator spins up a sub-agent, it should issue a delegation token (sometimes called an "agent certificate" or a "macaroon-style attenuated credential") that cryptographically encodes:
- The identity of the delegating agent
- The identity of the delegatee
- The maximum scope ceiling (sub-agents cannot exceed the delegator's authority)
- The maximum delegation depth allowed
- A binding to the originating user session
Macaroons (a credential format originating from Google research) and W3C Verifiable Credentials are two existing primitives that can be adapted for this purpose. Several agent frameworks in 2026 are beginning to implement proprietary versions of this pattern, but there is not yet a dominant standard.
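The core macaroon trick is HMAC chaining: each added restriction (caveat) re-keys the signature, so a delegator can narrow a credential but never widen it, and the issuer can verify the whole chain with one root key. Here is a minimal sketch of that mechanism; a real deployment should use a vetted library (e.g. pymacaroons), and all the caveat strings and names here are illustrative.

```python
import hashlib
import hmac

def _chain(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def mint(root_key: bytes, caveats: list[str]) -> tuple[list[str], bytes]:
    """Issue a credential whose signature binds every caveat in order."""
    sig = _chain(root_key, "agent-delegation-v1")
    for c in caveats:
        sig = _chain(sig, c)
    return caveats, sig

def attenuate(cred: tuple[list[str], bytes], extra: str) -> tuple[list[str], bytes]:
    """A delegator can only ADD restrictions -- never remove or relax them."""
    caveats, sig = cred
    return caveats + [extra], _chain(sig, extra)

def verify(root_key: bytes, cred: tuple[list[str], bytes]) -> bool:
    caveats, sig = cred
    _, expected = mint(root_key, caveats)
    return hmac.compare_digest(sig, expected)

root = b"issuer-secret"
orchestrator = mint(root, ["user=abc123", "scope<=files:read files:write", "depth<=2"])
sub_agent = attenuate(orchestrator, "scope<=files:read")  # ceiling narrowed at the hop
assert verify(root, sub_agent)
```

The caveats themselves (scope ceilings, depth limits, session bindings) are just strings here; enforcing their semantics at request time is the job of the receiving service's policy layer.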
Layer 2: Intent-Scoped Authorization (Beyond Resource Scopes)
Resource scopes (`files:write`) need to be paired with intent claims. An intent claim encodes the high-level goal that was authorized by the user, and downstream services can validate that the action being requested is semantically consistent with that intent.
For example, a token might carry the claim `intent: "schedule_meeting_for_user_id:abc123"`. A file-writing endpoint that receives this token can reject the request because writing files is not consistent with the declared scheduling intent, even if the `files:write` scope is technically present.
This is sometimes called semantic authorization and it requires your authorization layer to be context-aware, not just scope-aware.
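A minimal version of this check is a mapping from declared intents to the action categories they plausibly require, evaluated alongside the scope check. The intent names and mapping below are illustrative, not a standard.

```python
# Illustrative intent -> allowed-actions mapping; a real system would
# manage this as policy data, not a hardcoded dict.
INTENT_ALLOWED_ACTIONS = {
    "schedule_meeting": {"calendar:read", "calendar:write"},
    "research_summary": {"files:read"},
}

def authorize(token_claims: dict, requested_action: str) -> bool:
    # Scope check: what OAuth already does...
    if requested_action not in token_claims.get("scope", "").split():
        return False
    # ...plus intent consistency: what it does not.
    intent = token_claims.get("intent")
    return requested_action in INTENT_ALLOWED_ACTIONS.get(intent, set())

claims = {
    "scope": "files:read files:write calendar:read",
    "intent": "schedule_meeting",
}
assert authorize(claims, "calendar:read") is True
assert authorize(claims, "files:write") is False  # scope present, intent mismatch
```

Note the second assertion: the token technically carries `files:write`, but the declared intent makes the request deniable anyway. That is the extra signal scopes alone cannot provide.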
Layer 3: Workflow-Bound Session Context
Each multi-agent workflow execution should have a unique, cryptographically bound session ID that is threaded through every API call in the chain. This enables:
- Audit trails that reconstruct the full delegation graph for any given action
- Rate limiting and anomaly detection at the workflow level, not just the token level
- Session revocation: terminating the root session invalidates all in-flight delegations
Layer 4: Policy-Based Authorization at Every Hop
Every service that receives a call from an agent should evaluate not just "is this token valid?" but "is this action permitted given the full context of the delegation chain?" This is where tools like Open Policy Agent (OPA), Cedar (Amazon's policy language), or purpose-built agent authorization engines come in. The policy evaluation must have access to the full delegation chain metadata, not just the terminal token.
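In production this decision would live in a Rego policy evaluated by OPA, or a Cedar policy; the Python stand-in below just shows the shape of the decision and the chain metadata it needs as input. All field names are illustrative.

```python
def decide(request: dict) -> bool:
    """Per-hop policy: valid chain depth, monotonic scope attenuation,
    and the terminal agent actually holding the scope being exercised."""
    chain = request["delegation_chain"]  # root-first list of hops
    if len(chain) > request["max_depth"]:
        return False
    # Each hop's scopes must be a subset of its delegator's scopes.
    for parent, child in zip(chain, chain[1:]):
        if not set(child["scopes"]) <= set(parent["scopes"]):
            return False
    return request["action"] in set(chain[-1]["scopes"])

request = {
    "action": "files:read",
    "max_depth": 3,
    "delegation_chain": [
        {"agent": "orchestrator", "scopes": ["files:read", "files:write"]},
        {"agent": "research-sub-agent", "scopes": ["files:read"]},
    ],
}
assert decide(request) is True
assert decide({**request, "action": "files:write"}) is False
```

The key design point is the input document: the policy engine receives the full chain, not just the terminal token, which is exactly the metadata a bare OAuth validation step never sees.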
Q6: What about prompt injection? Is that an authorization problem or an application problem?
It is both, and treating it as purely an application problem is one of the most dangerous mistakes backend engineers are making right now.
A prompt injection attack manipulates an agent's behavior by embedding adversarial instructions in data the agent processes (a web page, a document, an API response). The agent then takes actions it was not intended to take, using whatever authorization it legitimately holds.
From an authorization perspective, the defense is least-privilege delegation: ensuring that each agent in a chain holds the minimum authority needed for its specific task, so that a compromised agent can only do limited damage. If your research sub-agent only holds a read-only token scoped to a specific data source, a prompt injection that hijacks it cannot write to your database or exfiltrate credentials, even if the injected instruction tells it to try.
Prompt injection is the attack vector. Over-privileged delegation is the vulnerability it exploits. Your authorization model is the control that limits the blast radius.
Q7: Are there any emerging standards we should be watching?
Yes. Several efforts are converging in 2026 that backend engineers should track closely:
- OAuth 2.0 Rich Authorization Requests (RAR, RFC 9396): Allows tokens to carry structured, machine-readable authorization data beyond simple scope strings. This is the most immediately useful OAuth extension for agentic use cases, and adoption is accelerating. RAR lets you embed intent claims and context directly into the token authorization request.
- Token Exchange (RFC 8693): Defines how a service can exchange one token for another with different (typically reduced) permissions. This is a foundational primitive for delegation chains, though it still requires layered policy enforcement to be safe in multi-agent contexts.
- GNAP (Grant Negotiation and Authorization Protocol): A next-generation authorization protocol designed with richer delegation semantics than OAuth 2.0. GNAP natively supports multi-party authorization and is better suited to agentic workflows, though it has not yet achieved the adoption breadth of OAuth.
- Model Context Protocol (MCP) Security Extensions: The MCP ecosystem, which standardizes how agents interact with tools and data sources, is actively developing security extensions focused on capability-scoped invocations and caller attestation.
- SPIFFE/SPIRE for Agent Identity: The SPIFFE standard for workload identity is being applied to agent identity, giving each agent instance a cryptographically verifiable identity that can anchor delegation chains.
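Of the standards above, Token Exchange (RFC 8693) is the one you can use today for delegation boundaries: an orchestrator trades its broad token for a narrower one before handing anything to a sub-agent. The parameter names below come from the RFC; the placeholder token value is illustrative, and you would POST the result as form data to your authorization server's token endpoint.

```python
def build_token_exchange(subject_token: str, narrowed_scope: str) -> dict:
    """Form parameters for an RFC 8693 token exchange request."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        # Request LESS than the subject token holds -- this is the
        # attenuation step at the delegation boundary.
        "scope": narrowed_scope,
    }

params = build_token_exchange("<orchestrator-access-token>", "files:read")
assert params["grant_type"].endswith("token-exchange")
# POST `params` to the token endpoint; the response's access_token is the
# attenuated credential the sub-agent receives instead of the original.
```

Whether the server actually enforces that the requested scope is a subset of the subject token's scope is an authorization-server policy decision, which is why the article pairs token exchange with layered policy enforcement rather than treating it as sufficient on its own.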
Q8: What should we implement right now, before standards mature?
You do not have to wait for perfect standards to build significantly better security than "OAuth scopes and hope." Here is a pragmatic priority list:
- Audit your delegation surface today. Map every place in your system where one agent passes authority (tokens, credentials, or implicit capability) to another agent or tool. Most teams have no idea how wide this surface is until they draw it out.
- Enforce scope attenuation at every delegation boundary. Sub-agents must never hold more scope than their delegating parent. Implement this as a hard constraint in your agent orchestration layer, not as a convention.
- Implement workflow session IDs and thread them everywhere. This alone dramatically improves your ability to audit and detect anomalous behavior. It costs relatively little to implement and pays dividends immediately.
- Use OAuth RAR for any new token issuance. Start embedding structured intent claims in your authorization requests. Even if your policy engine does not yet validate them, you are building the data foundation for semantic authorization.
- Apply OPA or Cedar at your API gateways. Move beyond scope validation to context-aware policy evaluation. Write policies that consider delegation depth, session context, and intent claims, not just token validity.
- Treat every external agent integration as untrusted by default. Third-party agents, plugins, and tool integrations should receive the most constrained tokens possible. Apply the same zero-trust principles you apply to third-party services, not the looser trust you might extend to internal services.
- Log the full delegation chain for every consequential action. Not just the terminal API call, but the entire chain of delegation events that led to it. This is essential for incident response and compliance.
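The last item on the list is cheap to start on. One way to sketch it: emit a structured audit record that carries the entire root-first chain with every consequential action, not just the terminal caller. Field names here are illustrative; emit to whatever structured log pipeline you already run.

```python
import json
import logging
import time

log = logging.getLogger("agent.audit")

def log_consequential_action(action: str, session_id: str, chain: list[dict]) -> str:
    """Emit one audit record carrying the full delegation chain."""
    record = {
        "ts": time.time(),
        "workflow_session": session_id,
        "action": action,
        # The entire chain, root-first -- not just the terminal caller.
        "delegation_chain": [
            {"agent": hop["agent"], "scopes": hop["scopes"]} for hop in chain
        ],
        "chain_depth": len(chain),
    }
    line = json.dumps(record)
    log.info(line)
    return line

entry = log_consequential_action(
    "files:write",
    "sess-42",
    [{"agent": "orchestrator", "scopes": ["files:read", "files:write"]},
     {"agent": "report-writer", "scopes": ["files:write"]}],
)
assert json.loads(entry)["chain_depth"] == 2
```

With records in this shape, reconstructing the delegation graph for an incident is a query over `workflow_session` rather than a forensic archaeology exercise across per-service logs.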
Q9: Is this a temporary problem that the LLM providers will eventually solve for us?
No, and this framing is itself a risk. LLM providers are responsible for the safety and alignment of their models. They are not responsible for the authorization architecture of the systems you build on top of those models. The boundary between "model behavior" and "system authorization" is your responsibility as a backend engineer.
OpenAI, Anthropic, Google, and others are building better tool-use safety features, refusal behaviors, and sandboxing into their models. Those are valuable. They do not replace a well-designed authorization layer in your infrastructure. A model that refuses to exfiltrate data is a useful defense-in-depth layer. It is not a substitute for ensuring that your agents do not hold credentials that would make exfiltration possible in the first place.
The security model of your multi-agent system is an infrastructure concern. Own it accordingly.
Conclusion: The Gap Between "Auth Works" and "Auth Is Right"
OAuth works. It is a well-designed, battle-tested protocol for a specific problem: delegating user authorization to applications. That problem is not the same problem as securing dynamic, multi-hop, runtime-composed agent delegation chains. Using OAuth for the latter is not wrong; it is insufficient.
The engineers who will build secure multi-agent systems in 2026 are the ones who understand that authorization in agentic architectures requires thinking about provenance, intent, delegation depth, scope ceilings, and session binding simultaneously. OAuth scopes are one input to that system. They are not the system.
The good news is that the primitives exist today: RFC 9396, RFC 8693, OPA, Cedar, SPIFFE, Macaroons, and emerging agent-specific frameworks all give you real tools to work with right now. The work is in composition and discipline, not in waiting for a silver-bullet standard to arrive.
Start by drawing your delegation graph. Most teams are surprised by what they find. That surprise is the beginning of a real security posture.