Independent Media
Not affiliated with any project
Deconstructing Autonomous Agents in Crypto
aiagent-bible.com
risk

Onchain Agent Worst-Case Defense Design: If Your Agent Is Fully Compromised, How to Keep Losses Within Acceptable Range

30-Second Version · For the impatient
'How do I prevent my Agent from being attacked?' is the wrong question. The right question is: 'If all of the Agent's defenses fail, what's the worst an attacker can do?' If the answer is 'take all my assets,' security design isn't complete. The correct answer should be: 'A few days of working capital from the operations wallet, with complete logs enabling post-incident root cause tracing.'


The security design question most people ask about crypto AI Agents is 'how do I prevent my Agent from being attacked?' That's the wrong question. The right question is: 'If all of the Agent's defenses fail — Prompt Injection succeeds, MCP Server is poisoned, LLM reasoning is fully hijacked — what's the worst an attacker can do?' If you can't answer this question clearly, your Agent security design isn't complete, no matter how well your System Prompt is written. This article starts from 'worst case' — the design goal isn't to make attacks impossible, but to keep the consequences of a successful attack within your acceptable range.

Asking the Right Question Produces Good Defense Design

Traditional security design thinking is 'lock the doors': add more security checks and make attacks harder to pull off. That isn't enough for crypto Agents, because the possibility of a successful attack never goes away: no defense against LLM Prompt Injection is 100% effective, social engineering through MCP Servers is hard to fully prevent, and you can't guarantee that every tool vendor you depend on is uncompromised. Crypto Agent security design should instead start from a 'Defense in Depth' mindset: assume some defense layers will fail, make sure another layer stands behind each one, and make sure the worst consequence of any single layer failing is acceptable. Concretely, for every Agent that handles funds, you should be able to answer five questions: (1) What is the most money the Agent operations wallet ever holds? (2) Could you accept losing all of it? (3) Is there any way the Agent can transfer to an address outside the whitelist without your knowledge? (4) If the Agent's LLM reasoning is fully compromised, what is the maximum amount it is authorized to move? (5) Can your logging system let you trace the attack path after the fact?

First Line of Defense: Minimum Necessary Operational Authorization

Minimum necessary authorization is the foundation of the entire defense system: even if every other defense fails, this layer sets the ceiling on what an attacker can get. Design principles:

Operations wallet holds only a few days of working capital: an amount whose complete loss wouldn't cause you financial stress. The Agent only needs enough 'oil' to execute a few operations; most funds stay in a primary wallet the Agent cannot directly access.

ERC-20 approvals are precisely limited: set a specific maximum approve amount per protocol and per token, and never grant an unlimited authorization. Review approvals monthly and revoke the unused ones. Even if the Agent's LLM is compromised, the amount that can be moved through an approved protocol is hard-capped by the approve limit.

Operation type whitelist: the tools the Agent can call must be limited to the minimum set its task requires. Remove unnecessary tools from the tool list; don't leave them there 'just in case.'
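The three principles above can be sketched as a small backend policy object. This is a minimal illustration, not a reference implementation: the class name `AuthorizationPolicy`, its fields, and the example tool and protocol names are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class AuthorizationPolicy:
    """Layer-1 limits, kept in backend code rather than in the prompt (hypothetical)."""

    # Cap on what the operations wallet may hold, in USD.
    max_wallet_balance_usd: Decimal
    # Maximum approve amount per (protocol, token) pair, in token units.
    approve_caps: dict[tuple[str, str], Decimal] = field(default_factory=dict)
    # The only tools the Agent is allowed to call.
    allowed_tools: frozenset[str] = frozenset()

    def check_tool(self, tool_name: str) -> None:
        """Reject any tool that is not on the whitelist."""
        if tool_name not in self.allowed_tools:
            raise PermissionError(f"tool not whitelisted: {tool_name}")

    def check_approve(self, protocol: str, token: str, amount: Decimal) -> None:
        """Reject approvals for unknown pairs or above the configured cap."""
        cap = self.approve_caps.get((protocol, token))
        if cap is None:
            raise PermissionError(f"no approval configured for {protocol}/{token}")
        if amount <= 0 or amount > cap:
            raise PermissionError(f"approve amount {amount} outside (0, {cap}]")

    def top_up_amount(self, current_balance_usd: Decimal) -> Decimal:
        """How much may be moved in from the primary wallet without exceeding the cap."""
        return max(Decimal("0"), self.max_wallet_balance_usd - current_balance_usd)
```

Because an unlimited approval is just a very large amount, it fails the same cap check as any other oversized request; no special case is needed.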

Second Line of Defense: Read/Write Isolation and Independent Confirmation Channel

Even if attackers successfully contaminate the Agent's LLM reasoning, read/write isolation ensures that contaminated reasoning cannot directly trigger fund operations.

Read tools and write tools execute in strictly isolated environments: read tools may run after contact with any external data (worst case: they read wrong information), while write tools run only in a 'clean' execution environment that never touches unvalidated external data. LangGraph's graph design makes this isolation natural, because read steps and write steps live in separate graph nodes.

All write operations pass a second layer of parameter validation in the backend: inside the tool function's implementation (not the description layer the LLM sees), hard-validate every parameter of every write operation. Check that the amount is within limits, that the target address or protocol is on the whitelist, and that the operation type is permitted. These rules live in Python/JavaScript code, not in the System Prompt. Prompt Injection can override a System Prompt; it cannot rewrite validation code that runs outside the model.

Independent confirmation channel for high-value operations: any write operation above your threshold (e.g., $100) requires confirmation through a channel completely independent of the Agent's LLM reasoning flow (a Telegram Bot notifies you, waits for your reply, then executes). Even if all of the Agent's LLM reasoning is compromised, attackers cannot bypass this step, because the confirmation request goes to your phone, not to the LLM.
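A minimal sketch of the backend validation and the confirmation gate might look like the following. The names (`WriteRequest`, `validate_write`, `execute_write`), the whitelist entries, and the $500 hard limit are illustrative assumptions; only the $100 confirmation threshold comes from the example above. The `confirm` callback stands in for whatever out-of-band channel you use (for instance a Telegram Bot) and is deliberately not tied to any real messaging API.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class WriteRequest:
    operation: str
    target: str
    amount_usd: Decimal


class WriteRejected(Exception):
    """Raised when a write fails validation or human confirmation."""


# Hypothetical limits; in practice these come from backend configuration.
ALLOWED_OPERATIONS = frozenset({"swap", "repay"})
ALLOWED_TARGETS = frozenset({"0xProtocolRouter"})  # placeholder, not a real address
MAX_AMOUNT_USD = Decimal("500")
CONFIRM_THRESHOLD_USD = Decimal("100")  # the example threshold from the text


def validate_write(req: WriteRequest) -> None:
    """Hard checks that run in code, outside anything the LLM can rewrite."""
    if req.operation not in ALLOWED_OPERATIONS:
        raise WriteRejected(f"operation not permitted: {req.operation}")
    if req.target not in ALLOWED_TARGETS:
        raise WriteRejected(f"target not whitelisted: {req.target}")
    if not (Decimal("0") < req.amount_usd <= MAX_AMOUNT_USD):
        raise WriteRejected(f"amount out of range: {req.amount_usd}")


def execute_write(
    req: WriteRequest,
    send_tx: Callable[[WriteRequest], str],
    confirm: Callable[[WriteRequest], bool],
) -> str:
    """Validate, ask a human for high-value operations, then send."""
    validate_write(req)
    if req.amount_usd > CONFIRM_THRESHOLD_USD and not confirm(req):
        raise WriteRejected("human confirmation denied or timed out")
    return send_tx(req)
```

The key design choice is that `confirm` is called by the backend, never by the model: the LLM can only propose a `WriteRequest`, it cannot answer the confirmation on your behalf.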

Third Line of Defense: Circuit-Breakers and Complete Logs

Circuit-breakers start from the assumption that the Agent is already doing something abnormal, and ask how to stop the losses from growing automatically.

Daily spend limit circuit-breaker: maintain a cumulative daily spend counter in backend logic (Gas fees + A2A payment fees + fund operation amounts). When the daily limit is exceeded, the breaker automatically pauses all write operations, sends an emergency notification, and waits for your manual reset. The counter lives in backend code, not in the LLM's Context, so the LLM cannot read or modify it.

Market anomaly circuit-breaker: define market anomaly conditions (assets drop over X% in 15 minutes, Gas exceeds 10x normal, DEX slippage exceeds a set limit). Any one of them pauses write operations automatically, which prevents the Agent from executing strategies designed for normal markets during black swan events.

Complete four-layer operation logs: LLM reasoning logs, tool call logs, decision authorization logs, and on-chain execution logs. Store them encrypted and retain them for at least 90 days.
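As a rough sketch, the two circuit-breakers could be implemented like this. `SpendCircuitBreaker`, `market_anomaly`, and their parameters are hypothetical names; the 10x Gas ratio is the figure given above, while the price-drop and slippage limits are left as required arguments because the text does not fix them. This version checks the limit before a spend is recorded, so a request that would cross the limit trips the breaker instead of being sent.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import Callable, Optional


class CircuitOpen(Exception):
    """Raised while write operations are paused."""


@dataclass
class SpendCircuitBreaker:
    daily_limit_usd: Decimal
    notify: Callable[[str], None] = print  # stand-in for an emergency alert channel
    spent_today_usd: Decimal = Decimal("0")
    day: date = field(default_factory=date.today)
    tripped: bool = False

    def reserve(self, amount_usd: Decimal, today: Optional[date] = None) -> None:
        """Call before every write; counts gas, fees and fund amounts alike."""
        today = today or date.today()
        if today != self.day:
            # A new day resets the counter but never un-trips the breaker.
            self.day, self.spent_today_usd = today, Decimal("0")
        if self.tripped:
            raise CircuitOpen("writes paused; manual reset required")
        if self.spent_today_usd + amount_usd > self.daily_limit_usd:
            self.tripped = True
            self.notify("daily spend limit hit; write operations paused")
            raise CircuitOpen("daily spend limit exceeded")
        self.spent_today_usd += amount_usd

    def manual_reset(self) -> None:
        """Only a human operator should call this."""
        self.tripped = False


def market_anomaly(
    price_drop_pct_15m: Decimal,
    gas_ratio_to_normal: Decimal,
    slippage_pct: Decimal,
    *,
    max_drop_pct: Decimal,
    max_slippage_pct: Decimal,
    max_gas_ratio: Decimal = Decimal("10"),
) -> bool:
    """True if any single anomaly condition is met."""
    return (
        price_drop_pct_15m > max_drop_pct
        or gas_ratio_to_normal > max_gas_ratio
        or slippage_pct > max_slippage_pct
    )
```

Neither object is exposed as a tool or included in the prompt, which is what keeps the counter out of the LLM's reach.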

Emergency Response Flow After Attack

Post-attack emergency response must be designed in advance. A standard emergency response flow:

Step 1 (0–5 minutes): confirm that the Agent is executing unexpected operations, then immediately stop the Agent service process. This is the fastest 'emergency brake' and prevents new transactions from being broadcast.

Step 2 (5–15 minutes): revoke all ERC-20 approvals from the Agent operations address to all protocols. Even with the service stopped, any remaining approval still lets an approved contract pull tokens directly. An allowance can only be changed by the address that granted it, so for each token contract send `approve(spender, 0)` from the operations address for every approved protocol. If your primary wallet ever granted the Agent address an allowance, also call `approve(agentAddress, 0)` from the primary wallet.

Step 3 (15–60 minutes): save all logs to isolated storage (to prevent log clearing) and begin root cause analysis: identify the Thought step at which the Agent started showing anomalies, which tool returned anomalous data, and the attack entry point.

Step 4: fix the security vulnerabilities, fully replay the attack path on testnet to confirm the fix works, then redeploy.
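For the revocation step, it helps to have the transactions prepared before an incident. Under the ERC-20 standard, an allowance is reset by the token owner calling `approve(spender, 0)`, so each revocation has to be signed by the address that granted it. The sketch below only builds the calldata; signing and broadcasting are left to whatever wallet tooling you already use. The function names and the `(token, spender)` input format are assumptions.

```python
# First 4 bytes of keccak256("approve(address,uint256)"), the standard ERC-20 selector.
APPROVE_SELECTOR = "095ea7b3"


def revoke_calldata(spender: str) -> str:
    """ABI-encode approve(spender, 0) as a hex string."""
    hex_addr = spender.lower().removeprefix("0x")
    if len(hex_addr) != 40 or any(c not in "0123456789abcdef" for c in hex_addr):
        raise ValueError(f"not a 20-byte hex address: {spender}")
    # Each argument is padded to 32 bytes: the address on the left, then amount = 0.
    return "0x" + APPROVE_SELECTOR + hex_addr.rjust(64, "0") + "0" * 64


def build_revocations(approvals: list[tuple[str, str]]) -> list[dict[str, str]]:
    """Turn (token_contract, spender) pairs into unsigned transaction fields.

    Every resulting transaction must be sent from the address that granted
    the allowance (here assumed to be the Agent operations address).
    """
    return [
        {"to": token, "data": revoke_calldata(spender)}
        for token, spender in approvals
    ]
```

Keeping an up-to-date list of `(token, spender)` pairs as part of the monthly approval review means this list is already available when Step 2 starts.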

What This Means for Your Money

The goal of defense in depth isn't to make the Agent system 'impossible to attack' — that's unrealistic. The realistic goal is: if an attack succeeds, the attacker can get at most a few days of working capital from your operations wallet (not all your assets), and you have complete logs to trace the attack path and root cause after the fact. A well-designed Onchain Agent system should let you confidently answer: 'If my Agent is fully compromised today, I can still run this business tomorrow.' If you can't answer that way, security design isn't complete.

Diagram: Three-Layer Defense in Depth: Onchain Agent Worst-Case Architecture

Layer 1 (minimum authorization + whitelist) → Layer 2 (read/write isolation + backend validation + independent confirmation channel) → Layer 3 (circuit-breakers + four-layer logs). When one layer fails, the next one still holds.

Layer 1, Min Authorization: operations wallet holds a few days of funds only (5–10% of strategy funds, isolated); ERC-20 approve limits (no unlimited approvals, monthly revoke review); operation type whitelist (only the tools the task requires). If Layer 1 fails, the attacker is limited to the operations wallet balance.

Layer 2, R/W Isolation + Confirm: read/write tool isolation (read nodes ≠ write nodes, DAG); backend parameter validation (in code, not in the System Prompt); independent confirm channel (Telegram gate, outside the LLM loop). If Layer 2 fails, high-value operations are still blocked by the human gate.

Layer 3, Circuit-Breakers + Logs: daily spend circuit-breaker (backend counter the LLM cannot see); market anomaly circuit-breaker (price drop, Gas spike, slippage); 4-layer operation logs (LLM, tools, auth, chain; 90 days). Even if breached, the logs enable a full post-incident root cause trace.
Feel free to share. Please credit the source.