How to Build a Zero-Trust API Gateway for AI Agent-to-Agent Communication: A Backend Engineer's Complete Guide
Here is a scenario that should keep you up at night: your carefully orchestrated multi-agent AI system is humming along in production. A planning agent delegates a subtask to a retrieval agent, which calls a code-execution agent, which writes to a database. Everything looks fine from the outside. But somewhere in that chain, one compromised or misconfigured agent is quietly exfiltrating data, escalating its own privileges, or poisoning shared context windows, and your perimeter firewall never saw a thing.
Welcome to the defining security challenge of 2026. As multi-agent architectures built on frameworks like LangGraph, AutoGen, CrewAI, and the emerging Agent Protocol standard have moved from research curiosity to production backbone, the attack surface has exploded in a direction most security teams were not watching: lateral movement between trusted agents. The threat is not always an external attacker. Often it is a prompt-injected agent, a dependency-poisoned tool, or simply an over-privileged service identity doing exactly what it was (incorrectly) authorized to do.
This guide walks you through building a Zero-Trust API Gateway specifically designed for agent-to-agent (A2A) communication. We will cover workload identity issuance, mutual TLS (mTLS) handshakes, Open Policy Agent (OPA) runtime policy evaluation, least-privilege scoping, and real-time audit logging. By the end, you will have a concrete, deployable architecture that treats every agent call as untrusted by default, regardless of where it originates inside your system.
Why Standard API Gateways Fall Short for Multi-Agent Systems
Traditional API gateways were designed for a world of human-initiated requests and relatively static service meshes. They authenticate users or services at the edge and then largely trust traffic flowing internally. Multi-agent systems break every assumption that model was built on.
- Dynamic identity proliferation: In a typical agentic workflow, dozens of ephemeral agent instances may spin up and tear down within seconds. Static API keys and long-lived service accounts cannot keep pace.
- Chained delegation: Agent A calls Agent B on behalf of a user. Agent B calls Agent C. By the time you reach Agent C, the original authorization context is often completely lost or implicitly trusted.
- Tool-call ambiguity: Agents invoke tools (web search, code execution, database reads) as first-class operations. Most gateways have no concept of a "tool call" as a distinct authorization primitive.
- Prompt injection as a privilege escalation vector: A malicious payload in retrieved content can instruct an agent to call APIs it should never touch. Without runtime policy enforcement, the agent will happily comply.
Zero Trust, formalized in NIST SP 800-207 and turned into an adoption roadmap by CISA's Zero Trust Maturity Model (version 2.0, 2023), gives us the right mental model: never trust, always verify, enforce least privilege at every hop. The challenge is applying it to a system where the "hops" are AI inference calls happening in milliseconds.
The Architecture at a Glance
Before diving into implementation, here is the high-level blueprint. Every component serves a specific security function.
- SPIFFE/SPIRE: Issues short-lived X.509 SVIDs (SPIFFE Verifiable Identity Documents) to each agent workload. This is your identity plane.
- Envoy Proxy (sidecar): Terminates and initiates mTLS on every agent-to-agent connection. No plaintext traffic, ever.
- Custom Zero-Trust Gateway (your code): A lightweight middleware layer that intercepts every A2A request, extracts the caller's SVID, and forwards a structured authorization query to OPA.
- Open Policy Agent (OPA): Evaluates Rego policies at runtime to decide allow/deny based on caller identity, target resource, action type, time of day, and behavioral anomaly signals.
- Audit Log Sink (OpenTelemetry + SIEM): Every decision, allow or deny, is emitted as a structured trace event. No silent failures.
Step 1: Issue Workload Identities with SPIFFE/SPIRE
The first principle of Zero Trust is: you cannot authorize what you cannot identify. Forget API keys for agent workloads. They get rotated poorly, leaked into logs, and shared across instances. Instead, use SPIFFE (Secure Production Identity Framework for Everyone), which was designed exactly for this problem.
Install the SPIRE server and agent on your cluster. Each agent workload receives a SPIFFE ID of the form spiffe://<trust-domain>/<workload-path>, for example spiffe://agents.example.com/retrieval-agent, and a corresponding X.509 SVID that expires in 1 hour by default. SPIRE rotates these certificates automatically, so your agents never manage secrets manually.
SPIRE Registration Entry (Kubernetes)
# Recent SPIRE releases name the X.509 TTL flag -x509SVIDTTL; older releases spell it -ttl
spire-server entry create \
    -spiffeID spiffe://agents.example.com/retrieval-agent \
    -parentID spiffe://agents.example.com/k8s-node \
    -selector k8s:pod-label:app:retrieval-agent \
    -x509SVIDTTL 3600
Do this for every agent type in your system: planner, retriever, code-executor, summarizer, and so on. Each gets a unique, verifiable identity that is cryptographically bound to its workload, not to a human who might share it.
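Because every later check keys off the SPIFFE ID string, it helps to parse it strictly wherever your own code handles it. The following is a minimal sketch using only the Python standard library; the `SpiffeId` type and `parse_spiffe_id` helper are hypothetical names for illustration, not part of SPIRE or any SPIFFE library.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    path: str


def parse_spiffe_id(raw: str) -> SpiffeId:
    """Split a workload SPIFFE ID into trust domain and path.

    Raises ValueError for anything that is not a plain spiffe:// URI
    with a trust domain and a non-empty workload path.
    """
    parts = urlsplit(raw)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {raw!r}")
    # The trust domain is a bare host name: no user info, no port.
    if not parts.netloc or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"invalid trust domain in {raw!r}")
    if parts.query or parts.fragment:
        raise ValueError(f"unexpected query or fragment in {raw!r}")
    if parts.path in ("", "/"):
        raise ValueError(f"missing workload path in {raw!r}")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


sid = parse_spiffe_id("spiffe://agents.example.com/retrieval-agent")
print(sid.trust_domain, sid.path)
```

Comparing the parsed trust domain and path, rather than doing substring checks on the raw string, avoids accepting look-alike identities such as a different trust domain that merely contains the expected name.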
Step 2: Enforce Mutual TLS at Every Agent Connection
Once every agent has an SVID, configure Envoy as a sidecar proxy to enforce mTLS on all inbound and outbound connections. The key difference from standard TLS is that both sides present certificates. The calling agent proves who it is, and the receiving agent proves who it is. Neither side can spoof the other.
Envoy Listener Configuration (Simplified)
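transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
    require_client_certificate: true
    common_tls_context:
      # This workload's own SVID, fetched from the SPIRE agent over SDS
      tls_certificate_sds_secret_configs:
        - name: "spiffe://agents.example.com/retrieval-agent"
          sds_config:
            resource_api_version: V3
            api_config_source:
              api_type: GRPC
              transport_api_version: V3
              grpc_services:
                - envoy_grpc:
                    cluster_name: spire_agent
      combined_validation_context:
        # Only accept client certificates whose URI SAN is in our trust domain
        default_validation_context:
          match_typed_subject_alt_names:
            - san_type: URI
              matcher:
                prefix: "spiffe://agents.example.com/"
        # Trust bundle for the domain, also served by the SPIRE agent
        validation_context_sds_secret_config:
          name: "spiffe://agents.example.com"
          sds_config:
            resource_api_version: V3
            api_config_source:
              api_type: GRPC
              transport_api_version: V3
              grpc_services:
                - envoy_grpc:
                    cluster_name: spire_agent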
This configuration does three things at once: it terminates incoming TLS, it validates the client's SVID against your SPIRE trust bundle, and it rejects any connection from a workload whose SPIFFE ID falls outside your trust domain. An agent that cannot present a valid SVID cannot complete the TLS handshake with another agent, so no application traffic is ever exchanged. Full stop.
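To make the SAN check concrete, here is a rough Python equivalent of what the URI matcher enforces, written against the dictionary shape that the standard library's `ssl.SSLSocket.getpeercert()` returns. It is a sketch for understanding and for services that terminate TLS themselves; `spiffe_id_from_peercert` is a hypothetical helper, and in the architecture above Envoy performs this check.

```python
def spiffe_id_from_peercert(peercert: dict, trust_domain: str) -> str:
    """Return the caller's SPIFFE ID from an already verified peer certificate.

    `peercert` has the shape returned by ssl.SSLSocket.getpeercert().
    An X.509 SVID carries exactly one URI SAN, so anything else is rejected.
    """
    uris = [value for kind, value in peercert.get("subjectAltName", ()) if kind == "URI"]
    if len(uris) != 1:
        raise PermissionError("expected exactly one URI SAN in the client certificate")
    # Same rule as the Envoy prefix matcher: the ID must sit inside our trust domain.
    if not uris[0].startswith(f"spiffe://{trust_domain}/"):
        raise PermissionError(f"caller is outside trust domain {trust_domain!r}")
    return uris[0]


cert = {"subjectAltName": (("URI", "spiffe://agents.example.com/planner-agent"),)}
print(spiffe_id_from_peercert(cert, "agents.example.com"))
```

Note the trailing slash in the prefix: without it, an ID under a trust domain such as agents.example.com.evil.test would pass the check.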
Step 3: Build the Zero-Trust Gateway Middleware
mTLS tells you who is calling. It does not tell you whether they are allowed to do what they are asking. That is the gateway's job. Build a lightweight HTTP middleware (here shown in Python using FastAPI, but the pattern applies to Go, Node.js, or Rust equally well) that sits in front of every agent's API surface.
Gateway Middleware (Python / FastAPI)
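import re

import httpx
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
OPA_URL = "http://opa:8181/v1/data/agents/authz/allow"

# Envoy's x-forwarded-client-cert (XFCC) header is a list of key=value pairs, e.g.
#   By=spiffe://...;Hash=...;URI=spiffe://agents.example.com/planner-agent
# The caller's SPIFFE ID is the URI field. This assumes Envoy overwrites the
# header itself (forward_client_cert_details: SANITIZE_SET) and includes the
# URI (set_current_client_cert_details: {uri: true}).
XFCC_URI = re.compile(r"(?:^|;)URI=(spiffe://[^;,]+)")


def deny(status_code: int, detail: str) -> JSONResponse:
    # Middleware must return a response: an HTTPException raised here would
    # bypass FastAPI's exception handlers and surface as a 500.
    return JSONResponse(status_code=status_code, content={"detail": detail})


@app.middleware("http")
async def zero_trust_policy_middleware(request: Request, call_next):
    # 1. Extract the verified SPIFFE ID from the mTLS client cert
    #    (Envoy forwards it in the XFCC header after mTLS termination)
    match = XFCC_URI.search(request.headers.get("x-forwarded-client-cert", ""))
    if not match:
        return deny(401, "No client certificate presented")
    caller_spiffe_id = match.group(1)

    # 2. Build the OPA input document
    opa_input = {
        "input": {
            "caller": caller_spiffe_id,
            "method": request.method,
            "path": str(request.url.path),
            "action": request.headers.get("x-agent-action", "unknown"),
            "target_resource": request.headers.get("x-target-resource", "unknown"),
            "timestamp": request.headers.get("x-request-timestamp"),
            # Header name is an assumption; used by the delegation policy in Step 5
            "delegation_token": request.headers.get("x-delegation-token"),
        }
    }

    # 3. Ask OPA for a policy decision; any error or timeout fails closed
    try:
        async with httpx.AsyncClient() as client:
            opa_response = await client.post(OPA_URL, json=opa_input, timeout=0.05)
        opa_response.raise_for_status()
        decision = opa_response.json().get("result") is True
    except (httpx.HTTPError, ValueError):
        decision = False

    # 4. Emit audit event regardless of decision (emit_audit_event: see Step 6)
    await emit_audit_event(opa_input["input"], decision)

    if not decision:
        return deny(403, "Policy denied by Zero Trust gateway")
    return await call_next(request)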
Notice the 50-millisecond OPA timeout. In a real-time agentic workflow, policy evaluation must be fast. OPA is designed for this; its Rego engine evaluates most policies in under 1ms when run in-process or as a local sidecar. Do not call a remote policy service over the public internet. Co-locate OPA with your gateway.
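A tight timeout only helps if a timeout can never turn into an allow. The sketch below isolates that fail-closed logic so it can be unit-tested without a running OPA; `interpret_opa_result` and `decide` are hypothetical helper names, and the `query_opa` callable stands in for whatever HTTP client the gateway uses.

```python
from typing import Callable


def interpret_opa_result(body: object) -> bool:
    """Treat only an explicit JSON `true` as allow.

    OPA omits the "result" key when the queried rule is undefined, so a
    missing key, a non-boolean value, or a malformed body all mean deny.
    """
    return isinstance(body, dict) and body.get("result") is True


def decide(query_opa: Callable[[dict], object], opa_input: dict) -> bool:
    """Ask OPA for a decision and fail closed on any error."""
    try:
        return interpret_opa_result(query_opa(opa_input))
    except Exception:
        # Timeouts, connection errors and bad JSON must never become an allow.
        return False


def timed_out(_input: dict) -> object:
    raise TimeoutError("OPA did not answer within the deadline")


print(decide(lambda _input: {"result": True}, {}))  # allow
print(decide(timed_out, {}))                        # deny: transport failure
print(decide(lambda _input: {}, {}))                # deny: undefined result
```

Failing closed means an OPA outage stops agent traffic, which is the intended trade-off here; if availability matters more for some low-risk calls, that exception belongs in policy, not in the error handler.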
Step 4: Write Least-Privilege Rego Policies in OPA
This is where the real security work happens. Your Rego policies define the exact permission matrix for your multi-agent system. The goal is to express the minimum set of capabilities each agent legitimately needs and deny everything else by default.
Base Policy: Default Deny
package agents.authz
# Default deny
default allow = false
Start here. Every permission must be explicitly granted. Nothing is allowed by default.
Permission Matrix Policy
package agents.authz

import rego.v1

# Define the allowed action matrix per agent identity
allowed_actions := {
    "spiffe://agents.example.com/planner-agent": {
        "retrieval-agent": ["search", "fetch"],
        "summarizer-agent": ["summarize"]
    },
    "spiffe://agents.example.com/retrieval-agent": {
        "database-agent": ["read"]
    },
    "spiffe://agents.example.com/code-executor-agent": {
        "sandbox-agent": ["execute"]
        # Note: code-executor can NEVER call database-agent directly
    }
}

# Rule bodies that share a name are OR'ed in Rego, so this is the only allow
# rule. Every additional restriction is added to the deny set instead, and
# any entry in that set vetoes the request.
allow if {
    input.action in allowed_actions[input.caller][input.target_resource]
    count(deny) == 0
}

# Time-based restriction: no agent calls allowed between 02:00-04:00 UTC (maintenance window)
deny contains "maintenance_window" if {
    hour := time.clock(time.now_ns())[0]
    hour >= 2
    hour < 4
}
This policy encodes a critical architectural decision: the code-executor agent can never directly call the database agent. This is your blast radius containment. Even if a prompt injection attack instructs the code-executor to exfiltrate data by calling the database directly, OPA will deny it. The attacker would need to compromise both the code-executor and the retrieval agent to reach the database, a much harder task.
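Before trusting a matrix like this in production, it is worth pinning down the decisions you expect from it as a table of cases. The sketch below mirrors the matrix in plain Python so those expectations can be checked quickly; it is a test oracle under the assumption that the identities and actions match the policy above, not a replacement for evaluating the real policy in OPA.

```python
ALLOWED_ACTIONS = {
    "spiffe://agents.example.com/planner-agent": {
        "retrieval-agent": ["search", "fetch"],
        "summarizer-agent": ["summarize"],
    },
    "spiffe://agents.example.com/retrieval-agent": {
        "database-agent": ["read"],
    },
    "spiffe://agents.example.com/code-executor-agent": {
        "sandbox-agent": ["execute"],
    },
}


def is_permitted(caller: str, target: str, action: str) -> bool:
    """Default deny: unknown callers, targets or actions all return False."""
    return action in ALLOWED_ACTIONS.get(caller, {}).get(target, ())


# (caller suffix, target, action, expected decision)
CASES = [
    ("planner-agent", "retrieval-agent", "search", True),
    ("retrieval-agent", "database-agent", "read", True),
    ("code-executor-agent", "database-agent", "read", False),  # blast radius containment
    ("planner-agent", "database-agent", "read", False),
    ("unknown-agent", "retrieval-agent", "search", False),
]

for suffix, target, action, expected in CASES:
    caller = f"spiffe://agents.example.com/{suffix}"
    assert is_permitted(caller, target, action) is expected
print("all matrix expectations hold")
```

The same case table can then be replayed against OPA itself, so that a policy change which accidentally opens the code-executor to database path fails a test instead of reaching production.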
Behavioral Anomaly Policy
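package agents.authz

import rego.v1

# Deny if call rate exceeds threshold (counters and limits are loaded into OPA as external data).
# This is a deny rule on purpose: a separate `allow { not is_rate_anomaly }`
# body would be OR'ed with the permission matrix and allow any call that is
# under the rate limit. The main allow rule must require count(deny) == 0.
deny contains "rate_anomaly" if is_rate_anomaly

is_rate_anomaly if {
    call_count := data.runtime.agent_call_counts[input.caller]
    call_count > data.policy.rate_limits[input.caller]
}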
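OPA can load external data alongside its policies. Bundles are pulled by OPA on a polling interval, which suits slowly changing data such as the rate limits; for counters that change every few seconds, push them into OPA through its REST Data API or pass them in the input document. Either way, the numbers can come from your observability stack, for example call rates aggregated in Prometheus. If a retrieval agent suddenly starts making 500 calls per minute instead of its normal 10, the policy denies further calls and your alerting fires. This is runtime behavioral enforcement, not just static rules.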
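One low-latency way to get counters into OPA is its REST Data API, where a PUT to /v1/data/<path> creates or overwrites the document at that path. The sketch below builds such a request with the standard library; the helper names, the base URL, and the idea of calling this from a small sidecar loop are assumptions, and how the counts are read from Prometheus is deliberately left out.

```python
import json
import urllib.request


def build_counter_update(opa_base_url: str, call_counts: dict) -> urllib.request.Request:
    """Build a PUT that overwrites data.runtime.agent_call_counts in OPA."""
    return urllib.request.Request(
        url=f"{opa_base_url.rstrip('/')}/v1/data/runtime/agent_call_counts",
        data=json.dumps(call_counts).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def push_counters(opa_base_url: str, call_counts: dict, timeout: float = 1.0) -> int:
    """Send the update and return the HTTP status code."""
    request = build_counter_update(opa_base_url, call_counts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_counter_update(
    "http://opa:8181",
    {"spiffe://agents.example.com/retrieval-agent": 12},
)
print(request.get_method(), request.full_url)
```

The keys of the pushed document are the same SPIFFE IDs the policy uses as input.caller, so the lookup data.runtime.agent_call_counts[input.caller] resolves without any translation step.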
Step 5: Implement Delegation Chains with Bounded Context Propagation
One of the most underappreciated attack surfaces in multi-agent systems is unbounded delegation. Agent A is authorized to call Agent B. Agent B calls Agent C. Agent C calls Agent D. By the time you reach D, no one is checking whether the original authorization scope from the user's session actually permits this chain.
The solution is to propagate a bounded delegation token through the chain and enforce scope shrinkage at every hop. Implement this as a signed JWT that travels as a request header.
Delegation Token Structure
The claims below are the decoded JWT payload. Here delegation_depth is the number of hops remaining, and exp is one hour after iat.
{
    "iss": "spiffe://agents.example.com/planner-agent",
    "sub": "user-session-abc123",
    "scope": ["search", "summarize"],
    "delegation_depth": 2,
    "max_delegation_depth": 2,
    "iat": 1740000000,
    "exp": 1740003600,
    "allowed_targets": [
        "spiffe://agents.example.com/retrieval-agent",
        "spiffe://agents.example.com/summarizer-agent"
    ]
}
Your gateway middleware validates this token at every hop. When Agent B receives a token with delegation_depth: 2, it may issue a new token to Agent C with delegation_depth: 1. Agent C cannot delegate further. The scope can only shrink, never expand. Add a Rego rule that enforces this:
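package agents.authz

import rego.v1

# io.jwt.decode does not check the signature, so verify before trusting any
# claim. decode_verify also rejects expired tokens.
# data.policy.delegation_jwks is an assumed location for the issuers' public keys.
delegation_claims := claims if {
    [valid, _, claims] := io.jwt.decode_verify(input.delegation_token, {
        "cert": data.policy.delegation_jwks
    })
    valid
}

# Depth must be positive and within bounds, and the requested action must be
# inside the delegated scope
delegation_valid if {
    delegation_claims.delegation_depth > 0
    delegation_claims.delegation_depth <= delegation_claims.max_delegation_depth
    input.action in delegation_claims.scope
}

# A deny rule, not a second allow: extra allow bodies would be OR'ed together.
# The main allow rule must require count(deny) == 0.
deny contains "invalid_delegation" if not delegation_valid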
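The policy side can only check a token it is handed; the shrinking itself happens when an agent mints the token for the next hop. The following sketch shows one way to derive the child claims so that depth decreases, scope and targets are intersected with the parent's, and expiry never outlives the parent. The function name, the `now` and `ttl_seconds` parameters, and the 5-minute default are assumptions, and signing the resulting claims is out of scope here.

```python
def derive_child_claims(parent: dict, child_issuer: str, requested_scope: list,
                        requested_targets: list, now: int, ttl_seconds: int = 300) -> dict:
    """Build claims for the next hop; every field can only narrow."""
    child_depth = parent["delegation_depth"] - 1
    if child_depth <= 0:
        # A token with depth 0 would be rejected downstream, so refuse to mint it.
        raise PermissionError("delegation depth exhausted; cannot delegate further")
    # Intersect with the parent: requested items outside the parent are dropped.
    scope = [s for s in requested_scope if s in parent["scope"]]
    targets = [t for t in requested_targets if t in parent["allowed_targets"]]
    if not scope or not targets:
        raise PermissionError("requested scope or targets fall outside the parent token")
    return {
        "iss": child_issuer,
        "sub": parent["sub"],
        "scope": scope,
        "delegation_depth": child_depth,
        "max_delegation_depth": parent["max_delegation_depth"],
        "iat": now,
        "exp": min(parent["exp"], now + ttl_seconds),
        "allowed_targets": targets,
    }


parent = {
    "iss": "spiffe://agents.example.com/planner-agent",
    "sub": "user-session-abc123",
    "scope": ["search", "summarize"],
    "delegation_depth": 2,
    "max_delegation_depth": 2,
    "iat": 1740000000,
    "exp": 1740003600,
    "allowed_targets": [
        "spiffe://agents.example.com/retrieval-agent",
        "spiffe://agents.example.com/summarizer-agent",
    ],
}
child = derive_child_claims(
    parent,
    child_issuer="spiffe://agents.example.com/retrieval-agent",
    requested_scope=["search", "write"],  # "write" is not in the parent scope
    requested_targets=["spiffe://agents.example.com/summarizer-agent"],
    now=1740000100,
)
print(child["scope"], child["delegation_depth"])
```

With the example token, the child ends up with scope ["search"] and depth 1, and any attempt to derive a further token from that child raises PermissionError, which matches the rule that the last agent in the chain cannot delegate again.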
Step 6: Build a Real-Time Audit Trail with OpenTelemetry
Zero Trust without comprehensive audit logging is just a locked door with no security camera. Every policy decision, every allowed call, every denied call, must produce a structured trace event. Use OpenTelemetry (OTel) as your instrumentation layer so the data flows into whatever SIEM or observability platform your organization uses.
Audit Event Emitter
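from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Register a provider that batches spans and exports them over OTLP/gRPC.
# Without this step the tracer produces no-op spans and nothing is shipped.
# The exporter reads its endpoint from OTEL_EXPORTER_OTLP_ENDPOINT.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("zero-trust-gateway")


async def emit_audit_event(input_context: dict, decision: bool):
    with tracer.start_as_current_span("a2a.policy.decision") as span:
        span.set_attribute("agent.caller", input_context["caller"])
        span.set_attribute("agent.target", input_context["target_resource"])
        span.set_attribute("agent.action", input_context["action"])
        span.set_attribute("policy.decision", "allow" if decision else "deny")
        span.set_attribute("request.path", input_context["path"])
        # The timestamp header is optional, and span attributes cannot be None
        if input_context.get("timestamp") is not None:
            span.set_attribute("request.timestamp", input_context["timestamp"])
        if not decision:
            span.set_attribute("security.alert", True)
            span.set_attribute("security.alert_reason", "policy_denied")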
Ship these spans to your SIEM. Configure alerts for: more than 5 consecutive denials from the same agent identity (possible compromise), any attempt by an agent to call a target not in its allowed matrix (misconfiguration or attack), and delegation tokens with tampered scope claims (active exploitation attempt).
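The first of those alerts, more than 5 consecutive denials from one identity, is simple enough to sketch directly. The class below is an illustrative in-process version; `DenialTracker` is a hypothetical name, and in most deployments the equivalent rule would live in the SIEM, where it sees events from every gateway instance rather than just one.

```python
from collections import defaultdict


class DenialTracker:
    """Track consecutive denials per caller identity."""

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self._streaks = defaultdict(int)

    def record(self, caller: str, allowed: bool) -> bool:
        """Record one decision; return True when the caller should be flagged."""
        if allowed:
            # Any allowed call breaks the streak.
            self._streaks[caller] = 0
            return False
        self._streaks[caller] += 1
        # "More than 5 consecutive denials": flag from the sixth onward.
        return self._streaks[caller] > self.threshold


tracker = DenialTracker()
caller = "spiffe://agents.example.com/code-executor-agent"
flags = [tracker.record(caller, allowed=False) for _ in range(6)]
print(flags)
```

Resetting on an allowed call keeps a noisy but healthy agent from tripping the alert, at the cost of missing an attacker who interleaves valid calls; a sliding-window denial rate is a common complement.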
Step 7: Harden the Gateway Against Prompt Injection Escalation
Prompt injection is the unique threat that makes AI agent security different from traditional service mesh security. A malicious string in a retrieved document can instruct an agent to call /admin/delete-all. Your gateway needs one more layer: action validation against a canonical action registry.
Maintain a registry of every valid action any agent is permitted to invoke, expressed as an allowlist. If the action extracted from the request does not appear in the registry, deny it unconditionally, regardless of what the policy matrix says.
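package agents.authz

import rego.v1

valid_actions := {
    "search", "fetch", "summarize", "read",
    "execute", "write", "classify", "embed"
}

# Deny any unregistered action outright. This has to be a deny rule: an extra
# `allow { input.action in valid_actions }` body would be OR'ed with the
# permission matrix and let any caller perform any registered action.
# The main allow rule must require count(deny) == 0 for this veto to apply.
deny contains "unregistered_action" if {
    not input.action in valid_actions
}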
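This rule shuts down one whole class of prompt injection outcomes: an injected instruction to call action: "drop_table" or action: "exfiltrate_to_external_url" fails at the gateway before it ever reaches any downstream agent or tool. It does not stop an injected agent from misusing an action it is legitimately allowed to perform, which is why the permission matrix and the delegation scope still apply on top of it.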
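The same allowlist can also be applied in the gateway before the request is handed to the policy engine, so that an unregistered action never becomes part of a policy query at all. This is a small sketch of that pre-check; `validated_action` is a hypothetical helper, and the set simply repeats the registry above.

```python
from typing import Optional

VALID_ACTIONS = frozenset({
    "search", "fetch", "summarize", "read",
    "execute", "write", "classify", "embed",
})


def validated_action(raw: Optional[str]) -> str:
    """Return the action unchanged if it is registered, otherwise refuse.

    The match is exact and case-sensitive on purpose: normalizing input
    ("Read ", "READ") would widen what an injected string can reach.
    """
    if raw is None or raw not in VALID_ACTIONS:
        raise PermissionError(f"unregistered action: {raw!r}")
    return raw


print(validated_action("read"))
for attempt in ("drop_table", "exfiltrate_to_external_url", "READ", None):
    try:
        validated_action(attempt)
    except PermissionError as err:
        print("rejected:", err)
```

Keeping one source of truth for the registry matters more than where the check runs; if the gateway and the policy carry separate copies, generate one from the other.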
Putting It All Together: The Request Lifecycle
Here is the complete flow for a single agent-to-agent request through your Zero Trust gateway:
- Agent A (planner) initiates a call to Agent B (retrieval). Envoy sidecar wraps the connection in mTLS using Agent A's SVID.
- Envoy on Agent B's side terminates mTLS, validates Agent A's SVID against the SPIRE trust bundle, and forwards the verified SPIFFE ID as a request header.
- The Zero-Trust Gateway middleware intercepts the request, extracts the SPIFFE ID, the delegation token, the action, and the target resource.
- OPA evaluates the policy in under 1ms: Is the caller identity valid? Is the action in the allowlist? Does the permission matrix permit this caller-target-action triple? Is the delegation depth valid? Is the call rate within bounds?
- If OPA says allow: the request proceeds to Agent B's handler. An audit trace is emitted.
- If OPA says deny: the gateway returns HTTP 403, emits a security alert trace, and Agent A's workflow is interrupted. No data moves.
Common Pitfalls to Avoid
- Skipping mTLS for internal traffic: "It's all inside the cluster" is the most dangerous assumption in distributed systems. Treat your Kubernetes pod network as hostile. It is.
- Using long-lived credentials for agent identities: A 90-day API key for a code-execution agent is a 90-day blast radius. Use SPIRE SVIDs with 1-hour TTLs.
- Writing OPA policies that default to allow: Always start with default allow = false. Explicit grants only.
- Ignoring delegation chain depth: An unbounded delegation chain is a privilege escalation waiting to happen. Enforce max_delegation_depth at the policy layer.
- Not alerting on denials: A deny that nobody sees is just a silent failure. Wire every denial to your incident response pipeline.
- Treating the action field as trusted input: If the action string comes from an agent that might be prompt-injected, validate it against a canonical registry (Step 7) before the permission matrix is ever consulted.
Conclusion: Your Multi-Agent System Is a Distributed Trust Problem
The excitement around multi-agent AI in 2026 is well deserved. These systems can accomplish genuinely remarkable things. But every new capability an agent gains is also a new capability an attacker can abuse if that agent is compromised, misconfigured, or prompt-injected. The blast radius of a single vulnerable agent in an unrestricted mesh is your entire system.
Zero Trust is not a product you buy. It is an architectural discipline you build, one verified hop at a time. By combining SPIFFE/SPIRE for workload identity, mTLS for transport security, OPA for runtime policy evaluation, bounded delegation tokens for chain integrity, and OpenTelemetry for full audit visibility, you can give your multi-agent system the security posture it deserves: one where every agent must continuously prove its right to act, and where no single compromised component can silently become an insider threat.
Start with Step 1 today. Issue identities to your agents. Everything else builds from there. The architecture described here is not theoretical; its building blocks (SPIFFE/SPIRE, Envoy, OPA, and OpenTelemetry) are open source and widely deployed in production, and the gateway middleware that ties them together is a small amount of code you own. The only thing standing between your multi-agent system and this level of security is the decision to build it.
Your agents are powerful. Make sure only the right ones can act on that power.