Question 1

Why does Sandbox (Agent Execution Sandbox) matter?

Accepted Answer

**What are the four isolation dimensions of a sandbox? What does each dimension specifically restrict?**

A complete Agent sandbox isolates behavior across four dimensions:

**Dimension 1: Tool Call Restriction**
The Agent can call only explicitly whitelisted tool functions; it cannot invoke any other code. Implementation: in LangChain or Claude's Tool Use mechanism, register only the tools the Agent's business logic needs, and leave out 'debugging tools' and 'system management tools.' A DeFi yield optimization Agent needs only 'query APY' and 'execute rebalance' tools; it does not need, and should not have, 'read server files' or 'send arbitrary HTTP requests.'
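As a framework-agnostic illustration of the tool whitelist, here is a minimal sketch of a dispatcher that runs only registered tools. The tool names (`query_apy`, `execute_rebalance`), their placeholder bodies, and `dispatch_tool` are assumptions made for this example, not part of LangChain or Claude's API.

```python
from typing import Any, Callable, Dict


# Hypothetical tools for a DeFi yield-optimization Agent (placeholder logic).
def query_apy(protocol: str) -> float:
    """Return a placeholder APY; a real tool would query the protocol API."""
    return {"aave": 0.031, "compound": 0.027}.get(protocol, 0.0)


def execute_rebalance(from_protocol: str, to_protocol: str, amount: float) -> str:
    """Placeholder for the rebalance action."""
    return f"rebalanced {amount} from {from_protocol} to {to_protocol}"


# The registry is the whitelist: nothing outside it is reachable by the Agent.
ALLOWED_TOOLS: Dict[str, Callable[..., Any]] = {
    "query_apy": query_apy,
    "execute_rebalance": execute_rebalance,
}


class ToolNotAllowed(Exception):
    """Raised when the LLM requests a tool that is not registered."""


def dispatch_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Run a tool requested by the LLM only if it is in the whitelist."""
    tool = ALLOWED_TOOLS.get(name)
    if tool is None:
        raise ToolNotAllowed(f"tool {name!r} is not registered for this Agent")
    return tool(**arguments)
```

Because the lookup goes through a fixed dictionary, a request for something like `read_server_file` fails before any code runs, regardless of what the prompt says.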

**Dimension 2: Network Access Control**
Network egress from the Agent execution environment is allowed only to whitelisted domains (Aave API, Compound API, Ethereum RPC nodes). Requests to arbitrary external URLs are blocked, which prevents Prompt Injection from causing the Agent to send internal data (transaction records, key shards) to attacker-controlled servers. Implementation: enforce the egress domain whitelist at the network layer of the Docker or Cloud Run deployment, typically by routing all outbound traffic through a filtering proxy or firewall rather than trusting the Agent code to police itself.
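An application-level check can complement the network-layer control (it does not replace it). The sketch below assumes hypothetical hostnames under `.example`; a real deployment would list the actual API hosts it depends on.

```python
from urllib.parse import urlsplit

# Hypothetical hostnames used only for illustration.
ALLOWED_HOSTS = frozenset({
    "api.aave.example",
    "api.compound.example",
    "rpc.ethereum.example",
})


def is_egress_allowed(url: str) -> bool:
    """Return True only for HTTPS URLs whose exact host is whitelisted."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lowercased; excludes port and userinfo
    except ValueError:
        return False  # malformed URL
    if parts.scheme != "https":
        return False
    return host is not None and host in ALLOWED_HOSTS
```

Matching the parsed hostname exactly, rather than searching the URL string, avoids being fooled by URLs such as `https://api.aave.example@attacker.test/`, where the real host is `attacker.test`.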

**Dimension 3: File System Isolation**
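The Agent execution environment can read only the specific directories it needs for work (e.g., config files); system directories containing sensitive information (private keys, database passwords) must be unreadable. Implementation: run the Agent process as a non-root user with no read permission on those directories (or do not mount them into the container at all), and mount the filesystem read-only except for the Agent log directory. Note that a read-only mount prevents tampering, not reading, so the permission or mount restriction is what actually protects secrets.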
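Inside a file-reading tool, the same boundary can be re-checked in code as a second line of defense. This is a minimal sketch; the directory `/app/config` and the function name `resolve_readable_path` are assumptions for illustration.

```python
from pathlib import Path

# Hypothetical directory the Agent is allowed to read from.
ALLOWED_READ_DIR = Path("/app/config")


def resolve_readable_path(requested: str, base: Path = ALLOWED_READ_DIR) -> Path:
    """Resolve a requested path and reject anything outside the allowed directory."""
    base_resolved = base.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base_resolved / requested).resolve()
    try:
        candidate.relative_to(base_resolved)
    except ValueError:
        raise PermissionError(f"{requested!r} is outside {base_resolved}") from None
    return candidate
```

The check runs on the resolved path, so both `../secrets/key.pem` and an absolute path like `/etc/passwd` are rejected. OS-level permissions and mounts remain the primary control.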

**Dimension 4: Resource Quotas**
Limit the Agent process's CPU, memory, concurrent threads, and LLM API calls per minute. This prevents 'resource exhaustion attacks,' in which Prompt Injection drives the Agent into infinite reasoning loops that consume all compute resources until the service crashes.
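CPU and memory caps are usually set by the container runtime, but the per-minute LLM call limit can be enforced in the Agent loop itself. The following sliding-window limiter is one possible sketch; the class name `CallQuota` and the injectable `clock` parameter are choices made for this example.

```python
import time
from collections import deque
from typing import Callable, Deque


class CallQuota:
    """Sliding-window limit on the number of calls per time window."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock
        self._timestamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True, or return False if the quota is used up."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_calls:
            return False
        self._timestamps.append(now)
        return True
```

The Agent loop would call `try_acquire()` before each LLM request and stop (or pause) when it returns `False`, which bounds the cost of a runaway reasoning loop.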

Question 2

How does Sandbox (Agent Execution Sandbox) work?

Accepted Answer

**What is a Sandbox Escape attack? What are the known escape vectors in Agent contexts?**

A sandbox escape occurs when an attacker exploits vulnerabilities in the sandbox implementation to cause the Agent to perform operations outside the sandbox boundary. In Agent contexts, the dangerous characteristic of sandbox escape is that attackers don't need direct access to the underlying system — they manipulate the Agent's LLM reasoning, causing the LLM to 'find' sandbox vulnerabilities on its own. Primary escape vectors:

**Vector 1: Tool Description Injection**
Through Prompt Injection, attackers cause the LLM to 'misunderstand' a tool's function — for example, making the LLM believe that the 'get_market_data' tool can actually be used to 'send arbitrary HTTP requests' (by modifying tool descriptions in the Agent's Context). If a tool's security boundary is maintained only by description text (not backend code), this vector is viable. Defense: tool security boundaries must be implemented in backend code, not relying solely on Tool Description text.
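To illustrate a boundary enforced in backend code, the sketch below gives `get_market_data` a fixed endpoint and a closed set of symbols, so no rewritten description can turn it into a general HTTP client. The endpoint URL, the symbol set, and the helper name `build_market_data_request` are hypothetical.

```python
# Hypothetical endpoint and symbol set, fixed in code rather than in the prompt.
MARKET_DATA_ENDPOINT = "https://api.market-data.example/v1/price"
ALLOWED_SYMBOLS = frozenset({"USDC", "USDT"})


def build_market_data_request(symbol: str) -> str:
    """Validate the LLM-supplied symbol and return the only URL this tool may call."""
    if not isinstance(symbol, str):
        raise ValueError("symbol must be a string")
    normalized = symbol.strip().upper()
    if normalized not in ALLOWED_SYMBOLS:
        raise ValueError(f"symbol {symbol!r} is not supported")
    # The host and path are constants; the LLM controls only a validated parameter.
    return f"{MARKET_DATA_ENDPOINT}?symbol={normalized}"
```

Even if the LLM is convinced the tool accepts a URL, passing one fails validation because the tool has no code path that takes a destination from its arguments.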

**Vector 2: Indirect Tool Chaining**
Attackers have the Agent combine calls to multiple permitted tools to achieve an effect no single tool would permit. Example: `read_config_file` and `append_to_log` are both permitted tools, but the attacker has the Agent first read a sensitive config file, then append the content to a log file (which the attacker can access externally). Defense: combined tool operations require semantic validation at the backend, not just per-tool parameter validation.
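One coarse way to validate tool combinations is session-level taint tracking: once a sensitive source has been read, externally visible sinks are refused for the rest of the session. This is a deliberately simple sketch; the `SessionPolicy` class and the source/sink classification are assumptions, and a production policy would likely be finer-grained.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical classification of the two tools named in the example above.
SENSITIVE_SOURCES = frozenset({"read_config_file"})
EXTERNALLY_VISIBLE_SINKS = frozenset({"append_to_log"})


@dataclass
class SessionPolicy:
    """Tracks whether sensitive data has entered the session."""

    tainted: bool = False
    history: List[str] = field(default_factory=list)

    def authorize(self, tool_name: str) -> bool:
        """Return False if the call would move sensitive data to a visible sink."""
        if tool_name in EXTERNALLY_VISIBLE_SINKS and self.tainted:
            return False
        if tool_name in SENSITIVE_SOURCES:
            self.tainted = True
        self.history.append(tool_name)
        return True
```

Each tool remains individually permitted; only the sequence "sensitive read, then visible write" is blocked, which is the combination the per-tool checks miss.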

**Vector 3: Long-Context Memory Poisoning**
Through sustained small-step Prompt Injection over time, attackers gradually build a 'false belief system' in the Agent's Context (e.g., gradually convincing the Agent that a malicious address is 'an authorized whitelist address'), until accumulated false beliefs cause the Agent to voluntarily bypass whitelists and execute malicious operations. Defense: periodically clear and rebuild Agent Context, reloading from backend whitelist (code), not trusting 'whitelist descriptions' in Context.
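The defense can be sketched as two small functions: one that rebuilds Context from trusted sources only, and one that answers authorization questions from code without consulting Context at all. The addresses below are placeholders, and both function names are assumptions for this example.

```python
from typing import List

# Placeholder addresses; the authoritative whitelist lives in backend code.
BACKEND_WHITELIST = frozenset({
    "0x" + "a1" * 20,
    "0x" + "b2" * 20,
})


def is_authorized(address: str) -> bool:
    """Decide from the backend whitelist only; Context contents are never consulted."""
    return address.lower() in BACKEND_WHITELIST


def rebuild_context(system_prompt: str) -> List[str]:
    """Discard accumulated conversation state and reload the whitelist from code."""
    whitelist_note = "Authorized addresses (loaded from backend): " + ", ".join(
        sorted(BACKEND_WHITELIST)
    )
    return [system_prompt, whitelist_note]
```

Because `is_authorized` takes no Context argument, even a fully poisoned conversation cannot change its answer; rebuilding Context simply stops the false beliefs from steering the Agent's other reasoning.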

Question 3

How is Sandbox (Agent Execution Sandbox) applied in practice?

Accepted Answer

**In Onchain Agents, how do sandbox and whitelist divide responsibilities? Why can't they substitute for each other?**

This distinction is among the most commonly confused in Agent security design. Sandbox and whitelist are two complementary protection layers, each defending a different attack surface:

**Whitelist answers the question**: 'Which addresses, protocols, and tokens is this Agent permitted to interact with?'
- Address whitelist: Agent can only send transactions to Aave, Morpho, Compound contract addresses — no transfers to arbitrary addresses
- Protocol whitelist: Agent can only call specific functions of whitelisted protocols (not arbitrary functions of protocol contracts)
- Token whitelist: Agent can only operate with USDC, USDT — not arbitrary ERC-20 tokens

Whitelists are 'business logic layer restrictions,' defining what business operations the Agent is permitted to perform.
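The three business-layer whitelists can be combined into a single pre-signing check. The sketch below uses placeholder contract addresses and an assumed function set (`supply`, `withdraw`); none of these values come from the real protocols.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

# Placeholder addresses standing in for real protocol contracts.
AAVE_POOL = "0x" + "a1" * 20
COMPOUND_MARKET = "0x" + "c3" * 20

# Address whitelist and protocol (function) whitelist in one mapping.
FUNCTION_WHITELIST: Dict[str, FrozenSet[str]] = {
    AAVE_POOL: frozenset({"supply", "withdraw"}),
    COMPOUND_MARKET: frozenset({"supply", "withdraw"}),
}
TOKEN_WHITELIST = frozenset({"USDC", "USDT"})


@dataclass(frozen=True)
class TxRequest:
    to: str
    function: str
    token: str


def whitelist_violations(tx: TxRequest) -> List[str]:
    """Return every whitelist rule the transaction breaks (empty list means allowed)."""
    violations: List[str] = []
    to = tx.to.lower()
    if to not in FUNCTION_WHITELIST:
        violations.append("address not whitelisted")
    elif tx.function not in FUNCTION_WHITELIST[to]:
        violations.append("function not whitelisted for this protocol")
    if tx.token.upper() not in TOKEN_WHITELIST:
        violations.append("token not whitelisted")
    return violations
```

A signer service would refuse to sign any `TxRequest` for which this returns a non-empty list, independent of what the Agent's reasoning concluded.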

**Sandbox answers the question**: 'What system operations are permitted in the Agent's execution environment?'
- Network whitelist: Agent execution environment can only access whitelisted domain APIs (cannot send requests to arbitrary URLs)
- Tool whitelist: Agent can only call specified tool function sets (cannot call arbitrary code)
- File system isolation: Agent process can read only the directories it needs (cannot read private keys or database passwords)
- Resource limits: Agent's CPU, memory, and network bandwidth are capped (prevents resource exhaustion attacks)

The sandbox is a 'system-layer restriction,' defining what environment the Agent is permitted to operate in.

**Why they can't substitute for each other**: attackers can exploit system-layer vulnerabilities without violating any business whitelist (e.g., having the Agent use a legitimate 'query tool' to read sensitive config files, then use a legitimate 'log write tool' to exfiltrate the information). The sandbox blocks this class of attack at the system layer; whitelists cannot. Conversely, a sandbox alone cannot tell an authorized contract address from a malicious one, because both are reached through the same permitted tool and RPC endpoint. If either layer is missing, the defense has blind spots.