7 Ways Backend Engineers Are Mistakenly Treating Google's Agent2Agent Protocol as a Secure Cross-Tenant Communication Standard (And Why It's Silently Destroying Tenant Boundary Enforcement in Multi-Tenant Agentic Pipelines in 2026)
Google's Agent2Agent (A2A) protocol arrived with enormous fanfare. Positioned as the lingua franca for autonomous AI agents to discover, negotiate with, and delegate tasks to one another, it quickly became the backbone of countless multi-agent systems built in late 2025 and into 2026. Backend engineers, already under pressure to ship agentic pipelines fast, latched onto A2A as a complete communication standard, one that would handle the hard parts of agent interoperability so they didn't have to.

There is just one critical problem: A2A was never designed to be a cross-tenant security boundary. It is a coordination and capability-discovery protocol. It tells agents what other agents can do and provides a structured way to invoke those capabilities. What it does not do, by default, is enforce who is allowed to talk to whom, which tenant's data flows through which pipeline stage, or whether the agent on the other end of a task delegation actually belongs to the same security domain as the caller.

In 2026, as multi-tenant agentic platforms have scaled from proof-of-concept to production workloads serving thousands of enterprise customers simultaneously, this misunderstanding has quietly become one of the most dangerous architectural anti-patterns in the industry. Below are the seven most common mistakes backend engineers are making, and a clear-eyed look at why each one is eroding tenant boundary enforcement in ways that are often invisible until something goes catastrophically wrong.

Mistake #1: Assuming A2A's Agent Card System Implies Authorization

The A2A protocol uses Agent Cards, JSON-formatted capability manifests that agents publish to advertise what they can do. When engineers first encounter Agent Cards, the natural mental model is: "If an agent has a card and it's discoverable, it's authorized to participate in my pipeline." This is a dangerous conflation of discoverability with authorization.

Agent Cards are authorization-agnostic by design. They describe capabilities, input/output schemas, and supported task types. They say nothing about which tenants are permitted to invoke those capabilities, under what conditions, or with what data classification levels. In a multi-tenant system, Tenant A's orchestrator agent can discover and invoke Tenant B's data-processing agent if both are registered in the same A2A discovery layer, and the protocol will not raise a single objection.

The fix: Treat Agent Cards as a capability catalog, nothing more. Layer a dedicated authorization service (ideally an OPA-based policy engine or a purpose-built agent authorization gateway) in front of every A2A task invocation. That service must validate tenant context, check that the calling agent's tenant identity matches the permissioned scope of the target agent, and enforce data classification policies before a single token of task payload is transmitted.
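A minimal sketch of what such a gate could look like. Everything here is illustrative, not part of A2A: `AgentCard` is a simplified stand-in for the real card schema, and `ALLOWED_SCOPES` stands in for the policy data a real engine such as OPA would hold.

```python
# Hypothetical authorization gate layered in front of A2A task dispatch.
# AgentCard and ALLOWED_SCOPES are illustrative, not part of the A2A spec.
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCard:
    agent_id: str
    tenant_id: str            # owning tenant, tracked alongside the card
    capabilities: frozenset


# Which caller tenants may invoke which target tenants' agents.
# In production this table lives in a policy engine, not in code.
ALLOWED_SCOPES = {
    ("tenant-a", "tenant-a"),  # same-tenant calls only, by default
}


def authorize_invocation(caller_tenant: str, target: AgentCard,
                         capability: str) -> bool:
    """Deny unless BOTH the tenant scope and the capability check out."""
    if (caller_tenant, target.tenant_id) not in ALLOWED_SCOPES:
        return False           # cross-tenant call with no federation grant
    return capability in target.capabilities


card = AgentCard("summarizer-1", "tenant-a", frozenset({"summarize"}))
assert authorize_invocation("tenant-a", card, "summarize") is True
assert authorize_invocation("tenant-b", card, "summarize") is False  # blocked
```

The key design point: discoverability (the card exists) and authorization (this caller, this tenant, this capability) are evaluated by entirely separate mechanisms, and the default answer is deny.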

Mistake #2: Relying on Transport-Layer TLS as "Security" for Cross-Agent Calls

A2A is typically deployed over HTTPS, and many engineering teams check the "secure communication" box the moment they see TLS in their infrastructure. This is a classic case of conflating encryption with authorization. TLS ensures that the data in transit cannot be read by a third party on the wire. It says absolutely nothing about whether the sender is permitted to send that data, or whether the receiving agent should process it in the context of a specific tenant.

In multi-tenant agentic pipelines, the threat model is not primarily an external eavesdropper. It is a confused deputy attack in which one tenant's agent, acting with legitimate credentials at the transport layer, invokes a shared infrastructure agent and passes context that bleeds data from another tenant's workflow. TLS won't catch this. Neither will mTLS alone, because mutual certificate validation only confirms that both parties are valid participants in the system, not that this specific interaction is permitted within this specific tenant boundary.

The fix: Implement a tenant context token that is separate from and layered on top of transport security. This token, cryptographically signed by your identity provider, must be validated at every A2A task boundary. It should carry the originating tenant ID, the data classification scope, and a pipeline-run correlation ID that enables full audit tracing across agent hops.
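To make the layering concrete, here is a stdlib-only sketch of minting and validating such a token at a task boundary. A real deployment would use signed JWTs from your identity provider; the HMAC here simply illustrates the validate-before-processing pattern, and the key and claim names are assumptions.

```python
# Sketch of a signed tenant-context token checked at every A2A task
# boundary. hmac stands in for your IdP's real signing; names are
# illustrative.
import base64
import hashlib
import hmac
import json

SECRET = b"idp-signing-key"   # illustrative; held by your identity provider


def mint_context_token(tenant_id: str, run_id: str, scope: str) -> str:
    """Issue a token carrying tenant ID, classification scope, and a
    pipeline-run correlation ID for audit tracing."""
    payload = json.dumps({"tenant": tenant_id, "run": run_id,
                          "scope": scope}, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def validate_context_token(token: str) -> dict:
    """Every agent hop calls this BEFORE touching the task payload."""
    body, sig = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(body)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise PermissionError("tenant context token failed validation")
    return json.loads(payload)


tok = mint_context_token("tenant-a", "run-42", "pii:restricted")
claims = validate_context_token(tok)
assert claims["tenant"] == "tenant-a"
```

Note that the token travels alongside the A2A task, not inside TLS: transport encryption and tenant authorization remain independent layers, each of which can fail without silently disabling the other.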

Mistake #3: Treating A2A Task Context as a Trusted Propagation Channel

A2A tasks carry a context object that flows from orchestrator to sub-agent and can be enriched at each hop. Engineers building multi-tenant pipelines frequently use this context object to carry tenant metadata, user IDs, and even scoped API credentials. The assumption is that because the context was set by a trusted orchestrator at the top of the chain, it can be trusted by every downstream agent that receives it.

This assumption breaks down the moment you introduce third-party or community agents into your pipeline. A2A's open interoperability model is one of its greatest strengths for building rich, composable agent ecosystems. But it means that a downstream agent you did not write, and may not fully control, can receive, log, mutate, or re-propagate the tenant context object. If that context carries sensitive tenant identifiers or scoped credentials, you have effectively handed them to every node in the graph.

Worse, a malicious or compromised intermediary agent can perform a context injection attack, modifying the tenant ID or permission scope in the context object before passing the task downstream, causing subsequent agents to operate under a falsified tenant identity.

The fix: Never carry mutable tenant security context in the A2A task payload. Use short-lived, audience-restricted tokens (following the OAuth 2.0 Token Exchange spec, RFC 8693) that are issued per-hop by a central authorization server. Each agent must independently validate the token for its own audience before processing. The context object should carry only correlation metadata, never authorization material.
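The per-hop pattern can be sketched as follows. The "authorization server" here is a stub dictionary; in production each hop would call your real RFC 8693 token-exchange endpoint, and all names (`exchange_token`, agent IDs) are assumptions for illustration.

```python
# Sketch of per-hop, audience-restricted tokens in the spirit of
# RFC 8693 token exchange. ISSUED stands in for the auth server's state.
import secrets
import time

ISSUED = {}   # token -> claims; stub for the authorization server


def exchange_token(subject_tenant: str, audience: str, ttl: int = 60) -> str:
    """Issue a short-lived token bound to exactly one downstream agent."""
    token = secrets.token_urlsafe(16)
    ISSUED[token] = {"tenant": subject_tenant, "aud": audience,
                     "exp": time.time() + ttl}
    return token


def validate_for_audience(token: str, my_agent_id: str) -> dict:
    """Each agent validates independently, for its OWN audience only."""
    claims = ISSUED.get(token)
    if claims is None or time.time() > claims["exp"]:
        raise PermissionError("unknown or expired token")
    if claims["aud"] != my_agent_id:
        raise PermissionError("token not issued for this agent")
    return claims


hop_token = exchange_token("tenant-a", audience="enricher-agent")
assert validate_for_audience(hop_token, "enricher-agent")["tenant"] == "tenant-a"

# An intermediary replaying the token toward a different agent is rejected:
try:
    validate_for_audience(hop_token, "exfil-agent")
except PermissionError:
    pass
```

Because the token is audience-restricted and never placed in the mutable context object, a compromised intermediary can neither read credentials out of the context nor forward a captured token to an agent it was not issued for.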

Mistake #4: Using a Single A2A Discovery Endpoint Across All Tenants

For convenience and operational simplicity, many platforms stand up a single A2A discovery service that all agents, across all tenants, register with and query. This creates a flat agent namespace in which the only thing separating Tenant A's agents from Tenant B's agents is a tenant ID field in the Agent Card metadata.

The problem is that a flat namespace with metadata-based separation is not a security boundary. It is a filter. Filters can be bypassed, misconfigured, or simply omitted by a developer who is iterating quickly and forgets to include the tenant scoping parameter in a discovery query. When that happens, the discovery layer happily returns agents from every tenant in the system, and the calling orchestrator may invoke one without ever realizing the tenant boundary violation.

This is particularly acute in platforms that use agentic routing, where an orchestrator dynamically selects the best agent for a task based on capability matching. If the capability matching query is not strictly scoped to the calling tenant's agent pool, the router can silently cross tenant boundaries in pursuit of the best capability match.

The fix: Implement tenant-scoped agent namespaces at the discovery layer. Each tenant's agents should be registered in a logically (and ideally physically) isolated namespace. Cross-namespace discovery should require an explicit, audited federation agreement, not a missing query parameter. Consider deploying per-tenant A2A discovery sidecars rather than a single shared discovery service.
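The structural difference between a filter and a namespace is worth making concrete. In this sketch (registry shape and names are illustrative, not an A2A API), the tenant is a mandatory parameter of every lookup, so there is no unscoped query for a hurried developer to forget a filter on.

```python
# Sketch of a discovery layer with hard tenant namespaces.
# The registry shape is illustrative, not part of the A2A spec.
from collections import defaultdict


class ScopedDiscovery:
    def __init__(self):
        self._namespaces = defaultdict(dict)   # tenant -> {agent_id: caps}

    def register(self, tenant: str, agent_id: str, capabilities: set):
        self._namespaces[tenant][agent_id] = capabilities

    def find(self, tenant: str, capability: str) -> list:
        """Capability matching is confined to the caller's namespace.
        There is no cross-tenant query path to misconfigure."""
        return [aid for aid, caps in self._namespaces[tenant].items()
                if capability in caps]


disc = ScopedDiscovery()
disc.register("tenant-a", "ocr-a", {"ocr"})
disc.register("tenant-b", "ocr-b", {"ocr"})

# Tenant A's router can never even see tenant B's OCR agent:
assert disc.find("tenant-a", "ocr") == ["ocr-a"]
```

An agentic router built on `find()` can still pick the "best" capability match, but only from within the calling tenant's pool; reaching another namespace would require an explicit, separate federation API rather than a missing parameter.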

Mistake #5: Ignoring the Streaming Task Model's Audit Gap

A2A supports long-running, streaming tasks in which an agent sends incremental updates back to the caller over time. This is essential for complex, multi-step agentic workflows that can run for minutes or hours. However, most tenant boundary enforcement implementations are designed around the request-response model: validate at invocation, return a result, done.

In the streaming model, a task invocation is validated once at the start, but the stream of updates that follows can run for an extended period, potentially across tenant context changes, token expirations, or even tenant offboarding events. Engineers frequently assume that because the initial invocation was validated, the entire streaming session is safe. This creates an audit and enforcement gap that can last for the entire duration of a long-running agent task.

Consider a scenario where a tenant's subscription is terminated mid-task, or where a security incident triggers a tenant isolation lockdown. If your enforcement logic only fires at task initiation, the streaming agent task continues happily, processing and potentially exfiltrating tenant data long after the tenant's access should have been revoked.

The fix: Implement continuous authorization for streaming A2A tasks. Use a token introspection endpoint that streaming agents call at regular intervals (or at each meaningful state transition) to confirm that the originating tenant context is still valid and authorized. Treat the streaming session as a series of checkpointed authorization events, not a single gate at the start.
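A simplified sketch of the checkpointing pattern. The revocation store and `introspect` function are stand-ins for a real token introspection endpoint (RFC 7662-style); the point is only that the check runs at every update, not once at the start.

```python
# Sketch of checkpointed authorization for a streaming A2A task.
# ACTIVE_TENANTS and introspect() are stand-ins for a real
# introspection endpoint; names are illustrative.
ACTIVE_TENANTS = {"tenant-a"}    # revocation removes the tenant


def introspect(tenant: str) -> bool:
    """Stand-in for calling the token introspection endpoint."""
    return tenant in ACTIVE_TENANTS


def stream_task(tenant: str, updates):
    """Deliver streaming updates, re-authorizing at each checkpoint."""
    delivered = []
    for update in updates:
        if not introspect(tenant):          # checkpoint at every update
            break                           # halt mid-stream on revocation
        delivered.append(update)
        if update == "step-2":
            ACTIVE_TENANTS.discard(tenant)  # simulate mid-task offboarding
    return delivered


# The tenant is revoked after step-2, so step-3 is never delivered:
assert stream_task("tenant-a", ["step-1", "step-2", "step-3"]) == \
    ["step-1", "step-2"]
```

In a real system the checkpoint interval is a trade-off: per-update introspection maximizes enforcement at a latency cost, while interval-based or state-transition checkpoints bound the revocation lag to a known, documented window.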

Mistake #6: Misunderstanding A2A's "Push Task" Model as an Authenticated Callback

A2A supports a push-based task model in which an agent can proactively push task results or new tasks to another agent's registered endpoint, rather than waiting to be polled. Many backend engineers implement this as a webhook-style callback and assume that because the receiving endpoint is registered in the A2A directory, any push arriving at that endpoint from a known A2A participant is legitimate and tenant-appropriate.

This is a classic SSRF and confused deputy vector. An attacker who can register an agent in the A2A discovery layer (or who controls a compromised agent already registered there) can push crafted task payloads to any other agent's registered endpoint. If the receiving agent trusts the push because it came from "a valid A2A participant," it may process that payload within the context of whatever tenant session is currently active, effectively injecting cross-tenant work into a live pipeline.

The A2A specification provides hooks for signing task payloads, but as of 2026, the majority of production implementations treat payload signing as optional, either because it adds latency or because developers are not aware it needs to be explicitly enabled and enforced. The protocol does not enforce it by default.

The fix: Treat every inbound push task as untrusted by default, regardless of its apparent origin in the A2A directory. Require cryptographic payload signatures on all push tasks, validate those signatures against a key registry that is scoped to your tenant's authorized agent pool, and reject any push task whose signing key does not map to an agent explicitly authorized for the receiving tenant's pipeline.
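A sketch of that verification path, using HMAC as a stand-in for real asymmetric payload signatures. The registry shape and agent names are assumptions; what matters is that the lookup key is the tenant-scoped registry, not the global A2A directory.

```python
# Sketch of push-task verification against a tenant-scoped key registry.
# hmac stands in for real payload signatures; names are illustrative.
import hashlib
import hmac

# Keys ONLY for agents authorized in this tenant's pipeline.
TENANT_KEY_REGISTRY = {"planner-agent": b"key-planner"}


def verify_push(sender_id: str, payload: bytes, signature: str) -> bool:
    key = TENANT_KEY_REGISTRY.get(sender_id)
    if key is None:
        # Sender may be a "valid" A2A directory participant, but it is
        # not authorized for this tenant, so the push is rejected.
        return False
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)


body = b'{"task": "reindex"}'
good_sig = hmac.new(b"key-planner", body, hashlib.sha256).hexdigest()
assert verify_push("planner-agent", body, good_sig) is True
assert verify_push("rogue-agent", body, good_sig) is False
```

Rejection happens before the payload is parsed or associated with any tenant session, which closes the door on the confused deputy pattern described above.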

Mistake #7: Delegating Tenant Isolation Responsibility to the Agent Runtime

Perhaps the most pervasive and most dangerous mistake is a philosophical one: the belief that the agent runtime framework sitting on top of A2A (whether that is a custom orchestration layer, an open-source agent framework, or a managed agentic platform) is responsible for tenant isolation, and that A2A itself will surface errors if something goes wrong at the boundary level.

A2A is a coordination protocol, not a security framework. It will not throw an exception when a tenant boundary is crossed. It will not log a warning when a task payload contains data belonging to a different tenant than the invoking agent. It will not refuse a task delegation because the sub-agent is in a different security domain. It will simply execute, because that is what it was designed to do: enable agents to work together efficiently.

Runtime frameworks built on top of A2A inherit this same characteristic unless their developers have explicitly built tenant isolation into the framework's core. Many popular open-source agent orchestration frameworks in 2026 still treat multi-tenancy as a deployment concern rather than a protocol-level concern, meaning they leave it entirely to the application developer to bolt on after the fact. When engineers assume the runtime handles it, and the runtime assumes the application handles it, the result is nobody handling it.

The fix: Establish an explicit Tenant Isolation Contract for every layer of your agentic stack. Document, in writing, which layer is responsible for which aspect of tenant boundary enforcement: the protocol layer, the runtime layer, the application layer, and the infrastructure layer. Audit each layer against that contract in your security review process. If a layer's responsibility is "not applicable," document why and confirm that an adjacent layer compensates.
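The contract itself can be made machine-checkable so the audit runs in CI rather than in a document nobody rereads. The responsibility names and layer assignments below are purely illustrative examples of the pattern.

```python
# Sketch of a machine-checkable Tenant Isolation Contract: every
# enforcement responsibility must be owned by exactly one layer, and
# the audit fails on gaps. All names here are illustrative.
RESPONSIBILITIES = {
    "agent_authorization",
    "per_hop_token_validation",
    "scoped_discovery",
    "stream_reauthorization",
    "push_signature_check",
}

CONTRACT = {
    "agent_authorization": "application",
    "per_hop_token_validation": "runtime",
    "scoped_discovery": "infrastructure",
    "stream_reauthorization": "runtime",
    # "push_signature_check" left unassigned -> the audit must flag it
}


def audit_contract(contract: dict) -> set:
    """Return the set of responsibilities that nobody owns."""
    return RESPONSIBILITIES - contract.keys()


# One responsibility fell between the layers -- exactly the
# "runtime assumes application, application assumes runtime" gap:
assert audit_contract(CONTRACT) == {"push_signature_check"}
```

Wiring this check into the security review pipeline turns "nobody handling it" from a silent failure mode into a build-breaking one.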

The Deeper Problem: A2A's Design Philosophy vs. Enterprise Security Requirements

It is worth stepping back and being fair to Google's A2A team. The protocol was designed to solve a genuine and hard problem: enabling heterogeneous AI agents, built by different teams on different frameworks, to interoperate in a standardized way. It succeeded at that goal. The Agent Card system is elegant. The task lifecycle model is well-thought-out. The streaming support is genuinely useful for complex agentic workflows.

But A2A was designed with an open, collaborative agent ecosystem in mind, one closer in spirit to the open web than to an enterprise SaaS platform serving regulated industries. The web analogy is instructive: HTTP is also not a security protocol. It is a communication protocol. We built HTTPS, OAuth, CORS, CSP, and dozens of other security layers on top of HTTP because we learned, often painfully, that communication protocols and security protocols serve different purposes.

The industry is now in the early stages of that same learning curve with A2A. The engineers who will come out ahead are those who treat A2A as the HTTP of agent communication: a powerful, necessary foundation that requires deliberate, layered security engineering built on top of it, not assumed to be within it.

A Practical Checklist for Secure Multi-Tenant A2A Deployments

  • Agent Card authorization: Every A2A task invocation passes through a policy engine that validates tenant-scoped authorization before the task is dispatched.
  • Per-hop token validation: Short-lived, audience-restricted tokens issued via RFC 8693 token exchange are validated independently at every agent boundary.
  • Tenant-namespaced discovery: Agent discovery is scoped to tenant namespaces; cross-namespace federation requires explicit configuration and audit logging.
  • Streaming continuous authorization: Long-running streaming tasks perform periodic token introspection at defined checkpoints throughout the task lifecycle.
  • Push task signature enforcement: All inbound push tasks require cryptographic payload signatures validated against a tenant-scoped key registry.
  • Tenant Isolation Contract: A documented, audited contract assigns tenant boundary enforcement responsibility to specific layers of the stack with no gaps or assumed handoffs.
  • Cross-tenant blast radius testing: Regular red-team exercises specifically attempt cross-tenant A2A boundary violations as part of the security testing program.

Conclusion: The Protocol Is Not the Perimeter

In 2026, the pace of agentic AI adoption is outrunning the industry's collective security maturity around it. Google's Agent2Agent protocol is a genuine engineering achievement that is enabling a new generation of composable, interoperable AI systems. But its adoption in multi-tenant production environments has exposed a critical gap: backend engineers are treating a coordination protocol as a security standard, and tenant data is paying the price.

The seven mistakes outlined above are not theoretical. They are patterns emerging in production systems today, often invisible in normal operation and only surfaced during incidents, audits, or the kind of adversarial testing that most teams do not perform until after something goes wrong. The good news is that every one of them is fixable, not by abandoning A2A, but by building the security layers that A2A was never meant to provide on its own.

The protocol is not the perimeter. Your security architecture is. Build it accordingly.