Security boundaries in agentic architectures

Vercel News Feb 24, 2026

Most agents today run generated code with full access to your secrets.

As more agents adopt coding agent patterns, where they read filesystems, run shell commands, and generate code, they're becoming multi-component systems that each need a different level of trust.

While most teams run all of these components in a single security context, because that's how the default tooling works, we recommend thinking about these security boundaries differently.

Below we walk through:

  • The actors in agentic systems
  • Where security boundaries should go between them
  • An architecture for running agent and generated code in separate contexts

All agents are starting to look like coding agents

More agents are adopting the coding agent architecture. These agents read and write to a filesystem. They run bash, Python, or similar programs to explore their environment. And increasingly, agents generate code to solve particular problems.

Even agents that aren't marketed as "coding agents" use code generation as their most flexible tool. A customer support agent that generates and runs SQL to look up account data is using the same pattern, just pointed at a database instead of a filesystem. An agent that can write and execute a script can solve a broader class of problems than one limited to a fixed set of tool calls.

What goes wrong without boundaries

Consider an agent debugging a production issue. The agent reads a log file containing a crafted prompt injection.

The injection tells the agent to write a script that sends the contents of ~/.ssh and ~/.aws/credentials to an external server. The agent generates the script, executes it, and the credentials are gone.

This is the core risk of the coding agent pattern. Prompt injection gives attackers influence over the agent, and code execution turns that influence into arbitrary actions on your infrastructure. The agent can be tricked into exfiltrating data from the agent's own context, generating malicious software, or both. That malicious software can steal credentials, delete data, or compromise any service reachable from the machine the agent runs on.

The attack works because the agent, the code the agent generates, and the infrastructure all share the same level of access. To draw boundaries in the right places, you need to understand what these components are and what level of trust each one deserves.

Four actors in an agentic system

An agentic system has four distinct actors, each with a different trust level.

Agent

The agent is the LLM-driven runtime defined by its context, tools, and model. The agent runs inside an agent harness, which is the orchestration software, tools, and connections to external services that you build and deploy through a standard SDLC. You can trust the harness the same way you'd trust any backend service, but the agent itself is subject to prompt injection and unpredictable behavior. Information should be revealed on a need-to-know basis, i.e. an agent doesn't need to see database credentials to use a tool that executes SQL.

Agent secrets

Agent secrets are the credentials the system needs to function, including API tokens, database credentials, and SSH keys. The harness manages these responsibly, but they become dangerous when other components can access them directly. The entire architecture discussion below comes down to which components have a path to these secrets.

Generated code execution

The programs the agent creates and executes are the wildcard. Generated code can do anything the language runtime allows, which makes it the hardest actor to reason about. These programs may need credentials to talk to outside services, but giving generated code direct access to secrets means any prompt injection or model error can lead to credential theft.

Filesystem

The filesystem and broader environment are whatever the system runs on, whether a laptop, a VM, or a Kubernetes cluster. The environment can trust the harness, but it cannot trust the agent to have full access or run arbitrary programs without a security boundary.

These four actors exist in every agentic system. The question is whether you draw security boundaries between them or let them all run in the same trust domain.

A few design principles follow from these trust levels:

  • The harness should never expose its own credentials to the agent directly
  • The agent should access capabilities through scoped tool invocations, and those tools should be as narrow as possible. An agent performing support duties for a specific customer should receive a tool scoped to that customer's data, not a tool that accepts a customer ID parameter, since that parameter is subject to prompt injection.
  • Generated programs that need their own credentials are a separate concern, which the architectures below address

With these actors and principles in mind, here are the architectures we see in practice, ordered from least to most secure.

Zero boundaries: today's default

Coding agents like Claude Code and Cursor ship with sandboxes, but these are often off by default. In practice, many developers run agents with no security boundaries.

In this architecture, there are no boundaries between any of the four actors. The agent, the agent's secrets, the filesystem, and generated code execution all share a single security context. On a developer's laptop, that means the agent can read .env files and SSH keys. On a server, it means access to environment variables, database credentials, and API tokens. Generated code can steal any of these, delete data, and reach any service the environment can reach. The harness may prompt the user for confirmation before certain actions, but there is no enforced boundary once a tool runs.

Secret injection without sandboxing

A secret injection proxy sits outside the main security boundary and intercepts outbound network traffic, injecting credentials only as requests travel to their intended endpoint. The harness configures the proxy with the credentials and the domain rules, but the generated code never sees the raw secret values.

The proxy prevents exfiltration. Secrets can't be copied out of the execution context and reused elsewhere. But the proxy doesn't prevent misuse during active runtime. Generated software can still make unexpected API calls using the injected credentials while the system is running.

Secret injection is a backward-compatible path from a zero-boundaries architecture. You can add the proxy without restructuring how components run. The tradeoff is that the agent and generated code still share the same security context for everything except the secrets themselves.

Why sandboxing everything together isn't enough

A natural instinct is to wrap the agent harness and the generated code in a shared VM or sandbox. A shared sandbox isolates both from the broader environment, and that's genuinely useful. Generated programs can't infiltrate the wider infrastructure.

But in a shared sandbox, the agent and generated program still share the same security context. The generated code can still steal the harness's credentials or, if a secret injection proxy is in place, misuse credentials through the proxy. The sandbox protects the environment from the agent, but doesn't protect the agent from the agent's own generated code.

Separating agent compute from sandbox compute

The missing piece is running the agent harness and the programs the agent generates on independent compute, in separate VMs or sandboxes with distinct security contexts. The harness and the harness's secrets live in one context. The filesystem and generated code execution live in another, with no access to the agent's secrets.

Both Claude Code and Cursor offer sandboxed execution modes today, but adoption in desktop environments has been low because sandboxing can cause compatibility issues. In the cloud, this separation is more practical. You can give the generated code a VM tailored for the type of software the agent needs to run, which can actually improve compatibility.

In practice, this separation is a straightforward change. Agents perform tool invocations through an abstraction layer, and that abstraction makes it natural to route code execution to a separate environment without rewriting the agent itself.

These two workloads have very different compute profiles, which means separating them lets you optimize each one independently. The agent harness spends most of its time waiting on LLM API responses. On Vercel, Fluid compute is a natural fit for this workload because billing pauses during I/O and only counts active CPU time, which keeps costs proportional to actual work rather than billing idle time.

Generated code has the opposite profile. Agent-created programs are short-lived, unpredictable, and untrusted. Each execution needs a clean, isolated environment so that one program can't access secrets or state left behind by another. Sandbox products like Vercel Sandbox provide this through ephemeral Linux VMs that spin up per execution and are destroyed afterward. The VM boundary is what enforces the security context separation. Generated code inside the sandbox has no network path to the harness's secrets and no access to the host environment.

The sandbox works in both directions. The sandbox shields the agent's secrets from generated code, and shields the broader environment from whatever the generated code does.

Application sandbox with secret injection

The strongest architecture combines the application sandbox with secret injection. The combination gives you two properties that neither achieves alone:

  • Full isolation between the agent harness and generated programs, each running in their own security context
  • No direct access to credentials for the generated code, which can use secrets through the injection proxy while running but can't read or exfiltrate them. Injected headers overwrite any headers the sandbox code sets with the same name, preventing credential substitution attacks.

For production agentic systems, we recommend combining both. The agent harness runs as trusted software on standard compute. Generated code runs in an isolated sandbox. Secrets are injected at the network level, never exposed where generated code could access the secrets directly.

This separation of agent compute from sandbox compute will become the standard architecture for agentic systems. Most teams haven't made this shift yet because the default tooling doesn't enforce it. The teams that draw these boundaries now will have a meaningful security advantage as agents take on more sensitive workloads.

Safe secret injection is now available on Vercel Sandbox.

Read more

link to the original content