
Multi-Agent System Architecture: A Complete Breakdown of the Orchestrator + Sub-agent Pattern and Security Boundary Design for Crypto Contexts

30-Second Version · For the impatient

The most common multi-Agent design mistake: the execution Agent (with on-chain signing rights) does whatever the Orchestrator says. Trust must not propagate — a compromised Orchestrator can direct the execution Agent to transfer all your funds. Execution Agents need independent verification; high-risk operations must go directly to you.

Priya Sharma · June 15, 2026


01 · Why did this happen?

What protocols should Orchestrator and Sub-agent communication use in a crypto multi-agent system? How do A2A and MCP divide responsibilities?

This is a question that's easy to confuse in design. The answer: A2A and MCP solve different layers of communication problems.

MCP (Model Context Protocol) solves communication between Agents and "tools" (external services, APIs, databases). For example, when a Data Sub-agent needs to query Aave's interest rates, it calls the Aave data API through an MCP Server.

A2A (Agent-to-Agent) solves communication between Agents. When the Orchestrator wants to assign a task to the Analysis Sub-agent, it sends task descriptions and required input data via A2A; when the Analysis Sub-agent finishes, it returns results to the Orchestrator via A2A.

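In crypto contexts, A2A communication needs specially designed security mechanisms: authorization instructions the Orchestrator sends to execution Sub-agents should include cryptographic signatures (signed with the Orchestrator's private key). The execution Sub-agent verifies that the signature is valid before executing. This way, even if an injected Agent tries to impersonate the Orchestrator, it cannot generate a valid authorization signature without the private key. Note that a signature only proves who sent an instruction, not that the instruction is sound: if the Orchestrator itself has been injected, its malicious instructions will still carry valid signatures, so signature checks complement independent verification on the execution side rather than replace it.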
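The verify-before-execute step can be sketched as follows. This is a minimal illustration, not a reference implementation: to stay runnable with the standard library it uses HMAC-SHA256 with a shared secret as a stand-in for the asymmetric signature described above (a real deployment would more likely use something like Ed25519, so the execution Sub-agent holds only a public key). The envelope fields `nonce` and `expires_at`, the key value, and the names `sign_instruction` and `ExecutionVerifier` are all assumptions made for this example.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical shared secret for this sketch. A real system would use an
# asymmetric key pair so the execution Sub-agent never holds signing material.
ORCHESTRATOR_KEY = b"demo-key-not-for-production"


def sign_instruction(payload: dict, key: bytes) -> dict:
    """Wrap a payload in an envelope carrying a MAC over its canonical JSON."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


class ExecutionVerifier:
    """Checks authenticity, freshness, and single use before anything executes."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._seen_nonces = set()

    def verify(self, envelope: dict, now: Optional[float] = None) -> dict:
        now = time.time() if now is None else now
        expected = hmac.new(
            self._key, envelope["body"].encode(), hashlib.sha256
        ).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected, envelope["tag"]):
            raise PermissionError("invalid signature")
        payload = json.loads(envelope["body"])
        if payload["expires_at"] < now:
            raise PermissionError("instruction expired")
        if payload["nonce"] in self._seen_nonces:
            raise PermissionError("replayed instruction")
        self._seen_nonces.add(payload["nonce"])
        return payload
```

The nonce and expiry are there because a valid signature alone would let anyone who captured an old instruction resend it later.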

Currently A2A as a standard is still maturing. Different frameworks (LangChain, AutoGen, ElizaOS) have varying multi-agent communication implementations, and in cross-framework multi-agent systems, communication format compatibility is an additional engineering problem that needs to be handled.

02 · What is the mechanism?

How does an Orchestrator decide which Sub-agent to assign tasks to? How much intelligence does this decision itself require?

Task routing is one of the Orchestrator's core capabilities. Designs fall into three levels of complexity:

The simplest is rules-driven routing: the Orchestrator maps task types to Sub-agents based on keywords or categories. For example "any task containing 'analyze' → Analysis Sub-agent," "any task containing 'execute' → Execution Sub-agent." This approach has low computational overhead and high predictability, but poor flexibility — it breaks down with ambiguous task descriptions.

Medium complexity is LLM-driven routing: the Orchestrator uses LLM reasoning to determine which Sub-agent to assign a task to, and can split one task into multiple sub-tasks for parallel assignment. This is flexible, but it adds latency and token costs, and the Orchestrator's LLM is itself exposed to Prompt Injection, which can skew routing decisions.

High complexity is planning generation: the Orchestrator first generates a complete multi-step plan (like a longer-horizon ReAct Thought), then progressively assigns and tracks tasks according to the plan. This suits complex multi-step tasks, but it has the highest design difficulty and computational cost.

In crypto contexts, the recommended approach is rules-driven routing for "high-risk operation tasks" (anything involving fund movement), because it is deterministic and auditable, and LLM-driven routing for "analytical tasks," where flexibility matters and a wrong routing decision has low consequences. This controls risk while maintaining flexibility.
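A hybrid router along these lines might look like the sketch below. The keyword list, the agent names, and the `llm_router` callable are hypothetical placeholders; the point being illustrated is only that the LLM's answer is constrained to analytical Sub-agents, so an injected prompt cannot route a task to execution.

```python
from typing import Callable

# Hypothetical keyword set marking anything that moves funds.
FUND_MOVEMENT_KEYWORDS = ("transfer", "swap", "withdraw", "deposit", "execute")
# The only destinations the LLM router is allowed to choose.
ANALYTICAL_AGENTS = ("data", "analysis", "risk")


def route(task: str, llm_router: Callable[[str], str]) -> str:
    """Return the name of the Sub-agent that should receive the task."""
    text = task.lower()
    # Rule-driven path: deterministic and auditable, checked first.
    if any(word in text for word in FUND_MOVEMENT_KEYWORDS):
        return "execution"
    # LLM-driven path: flexible, but its output is treated as untrusted.
    choice = llm_router(task)
    if choice not in ANALYTICAL_AGENTS:
        return "analysis"  # fall back to a Sub-agent with no write permissions
    return choice
```

For example, `route("Withdraw 100 USDC from Aave", some_llm)` never consults the LLM at all, while a task such as "Compare lending rates" is routed by whatever `some_llm` returns, provided it names an analytical Sub-agent.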

03 · How does it affect me?

How does failure propagate in a multi-agent system? How can one Sub-agent's error affect the entire system?

Error propagation is one of the hardest parts to design in multi-agent systems. Unlike single-agent systems, errors in multi-agent systems can spread and amplify in counterintuitive ways.

Common error propagation patterns:

First, data contamination propagation: the Data Sub-agent returns incorrect data (possibly due to MCP Server poisoning or API failure); the Analysis Sub-agent generates a seemingly reasonable analysis based on bad data; the Orchestrator trusts this analysis and authorizes the Execution Sub-agent to act on incorrect information. Every link individually "functions correctly" but the overall result is wrong.

Second, overconfidence propagation: the Analysis Sub-agent generates a recommendation with highly confident language ("strongly recommend immediate rebalancing, 95% confidence"). The Orchestrator's routing logic skips the risk evaluation step due to the high confidence and directly authorizes execution. In reality that "95% confidence" is just the LLM's rhetorical style, not a genuine statistical confidence level.

Third, repeated retry propagation: the Execution Sub-agent tries to execute an operation, but it fails because gas fees are insufficient or contract conditions aren't met. The Orchestrator keeps retrying, consuming large amounts of tokens and gas fees, until the entire task times out.

Defense design: each Sub-agent's returned results should include data sources and confidence levels, but the Orchestrator should treat self-reported confidence as a hint and never as grounds for skipping risk evaluation; the Orchestrator should have failure counters and circuit-breaker mechanisms (stop after N failures and notify the user); data on critical paths must be cross-validated from at least two independent sources.
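Two of these defenses, the failure counter with a circuit breaker and the two-source cross-check, can be sketched in a few lines. The class and function names, the 1% tolerance, and the choice to reset the counter on success are assumptions for illustration, not prescribed values.

```python
from typing import Callable


class CircuitOpen(Exception):
    """Raised once the failure budget is spent; the caller should notify the user."""


class CircuitBreaker:
    def __init__(self, max_failures: int) -> None:
        self.max_failures = max_failures
        self.failures = 0

    def call(self, operation: Callable[[], object]) -> object:
        # Refuse to retry once N consecutive failures have accumulated.
        if self.failures >= self.max_failures:
            raise CircuitOpen(f"stopped after {self.failures} failures")
        try:
            result = operation()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # a success resets the consecutive-failure count
        return result


def cross_validate(a: float, b: float, max_relative_gap: float = 0.01) -> float:
    """Return the mean of two independent readings, or raise if they disagree."""
    reference = max(abs(a), abs(b))
    if reference == 0:
        return 0.0
    if abs(a - b) / reference > max_relative_gap:
        raise ValueError(f"sources disagree: {a} vs {b}")
    return (a + b) / 2
```

With `CircuitBreaker(max_failures=3)`, the fourth attempt after three consecutive failures raises `CircuitOpen` without touching the chain, which is the point at which the Orchestrator would stop spending gas and alert the user.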

04 · What should I do?

Are there mature open-source frameworks supporting the construction of multi-agent crypto systems? What are the pros and cons of each?

Several major multi-agent frameworks and their suitability for crypto contexts:

LangGraph (LangChain's multi-agent extension): defines inter-agent workflows as directed graphs, which may contain cycles, with each node being an Agent or tool call and edges defining dependencies and trigger conditions. Advantage: excellent process visualization and controllability, ideal for scenarios requiring precise definition of Agent interaction flows. Disadvantage: relatively lower flexibility, since complex dynamic task assignment needs additional design. Crypto multi-agent suitability: high, especially for strategy systems requiring clearly defined execution paths.

AutoGen (Microsoft): dialogue-based multi-agent framework where Agents communicate via natural language messages. Advantage: low design barrier, suitable for rapid prototyping. Disadvantage: natural-language-based communication is more vulnerable to Prompt Injection; the security requirements for crypto asset management need additional strengthening.

CrewAI: high-level multi-agent framework emphasizing Role and Goal definitions. Advantage: intuitive concepts. Disadvantage: limited support for fine-grained security controls — in crypto contexts requiring strict minimum-permission design, heavy customization is needed.

Most practical recommendation for crypto multi-agent systems: use LangGraph to define high-security-requirement execution paths (Execution Agent trigger conditions and authorization flows), and AutoGen or custom A2A for analytical Agent communication. Combine both rather than using a single framework for everything.


A single Agent has inherent limits — bounded context window, limited tool complexity, and degraded reasoning quality when overloaded with too many tasks. Multi-Agent Systems address this: decompose complex tasks into sub-tasks, assign them to specialized Sub-agents, and have an Orchestrator coordinate the overall flow. In crypto, this pattern has unique design challenges — Sub-agents may have on-chain operation permissions, and poorly coordinated handoffs can mean direct asset loss.

Why Multi-Agent Instead of One Stronger Agent

Intuitively, "have one smarter Agent do everything" sounds simpler. But this direction hits three fundamental limits in real deployment.

Context Window limits: every model's context window is finite, and even a window of several hundred thousand tokens fills up quickly. An Agent simultaneously analyzing real-time data from 10 DeFi protocols, querying 30 days of transaction history, evaluating 5 candidate strategies, and making decisions can overflow it. Splitting tasks across Sub-agents, each responsible for a small portion, keeps every individual context small.

Specialization quality gains: an Agent asked to simultaneously do on-chain data analysis, community sentiment evaluation, and trade execution performs worse than three Sub-agents each fully focused on one domain. Specialization lets each Sub-agent's System Prompt be very precise, raising reasoning quality.

Parallel execution: in single-Agent architecture, tasks are serial, so A must complete before B can start. Multi-Agent architecture lets the Orchestrator dispatch multiple Sub-agents simultaneously, reducing total wall-clock time for complex tasks to roughly that of the slowest independent sub-task rather than the sum of all of them.
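The parallel-dispatch point can be illustrated with a small `asyncio` sketch. The Sub-agent names and delays are made up, and `run_sub_agent` is a stand-in for a real LLM or API call; the example assumes the sub-tasks are independent of each other.

```python
import asyncio
from typing import List


async def run_sub_agent(name: str, delay: float) -> str:
    """Stand-in for a Sub-agent call; the sleep simulates LLM and API latency."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def dispatch_all() -> List[str]:
    # gather() runs the three coroutines concurrently and returns results
    # in argument order, regardless of which one finishes first.
    return await asyncio.gather(
        run_sub_agent("data", 0.03),
        run_sub_agent("sentiment", 0.02),
        run_sub_agent("history", 0.01),
    )
```

Calling `asyncio.run(dispatch_all())` takes about as long as the slowest Sub-agent (0.03 s here) instead of the 0.06 s a serial run would need.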

Core Orchestrator + Sub-agent Design

The basic architecture splits into two layers.

Orchestrator (coordination Agent): receives the user's goal or high-level instruction, decomposes tasks into sub-tasks, decides which Sub-agent each goes to, collects returned results, and integrates them to make final decisions or report to the user. The Orchestrator typically executes nothing directly; it is the "commander," not the "executor."

Sub-agents (specialized Agents): each Sub-agent focuses on one specific responsibility domain, equipped with only the tool set needed for that domain. Examples: a data collection Sub-agent (read-only, no write permissions), a strategy analysis Sub-agent (reasoning and recommendations only, no execution), and an execution Sub-agent (on-chain permissions, but only executes operations the Orchestrator explicitly authorizes).

The critical design principle: each Sub-agent's tool set must carry the minimum necessary permissions. A Sub-agent that only needs to read on-chain data should never have transaction signing capability. Then, even if an attacker successfully injects it, they can only read data, not move funds.
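One way to make minimum necessary permissions concrete is to give each Sub-agent an explicit tool registry and refuse anything outside it. The sketch below uses hypothetical tools (`read_rate`, `sign_transaction`) with hard-coded behavior; it shows the shape of the check, not a production permission system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SubAgent:
    name: str
    # The registry is the complete set of capabilities this Sub-agent holds.
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)

    def use(self, tool_name: str, *args: object) -> object:
        if tool_name not in self.tools:
            raise PermissionError(f"{self.name} has no tool named {tool_name!r}")
        return self.tools[tool_name](*args)


def read_rate(protocol: str) -> float:
    """Hypothetical read-only tool returning a hard-coded rate."""
    return {"aave": 0.042}.get(protocol, 0.0)


def sign_transaction(tx: str) -> str:
    """Hypothetical signing tool; only the execution Sub-agent receives it."""
    return f"signed({tx})"


data_agent = SubAgent("data", {"read_rate": read_rate})
execution_agent = SubAgent("execution", {"sign_transaction": sign_transaction})
```

Here `data_agent.use("sign_transaction", "tx")` raises `PermissionError` no matter what text the data Sub-agent's LLM was tricked into producing, because the signing function was never handed to it.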

Crypto-Specific Design Challenges

In crypto Agent systems, multi-Agent architecture faces special challenges absent from ordinary Web2 scenarios.

Trust propagation problem: the Orchestrator sends instructions to Sub-agents, and Sub-agents execute. But if the Orchestrator itself is hit by a Prompt Injection attack, the "execute transaction" instructions it issues may already have been tampered with. How does an execution Sub-agent verify that instructions it receives are genuinely from an authorized Orchestrator, not malicious post-injection commands? One possible solution: design execution Sub-agents (with on-chain signing authority) to only accept instructions from specific addresses or with specific cryptographic signatures, not from anything merely claiming to be the Orchestrator.

Loop attack problem: a compromised Sub-agent A may attempt to influence Sub-agent B via A2A communication, getting B to perform actions that A was never permitted to perform itself. For example: a data Sub-agent injected with malicious instructions impersonates the Orchestrator and sends the execution Sub-agent "transfer all funds immediately." Defense: execution Sub-agents need independent verification mechanisms and can't fully trust any other Agent, including the Orchestrator. Operations above a threshold must have an independent confirmation channel that notifies the real user directly rather than relaying through other Agents.

State consistency problem: multiple Sub-agents operating in parallel may encounter resource contention. For example, A and B simultaneously try to use USDC from the Agent wallet, but the balance only supports one operation. Clear state management mechanisms are needed to ensure Sub-agents don't issue conflicting operations.
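For the state consistency problem, one possible mechanism is a reservation ledger: a Sub-agent must reserve funds before building a transaction, and the reservation is granted atomically. The sketch below is an in-process illustration with a hypothetical `WalletLedger` class; a system spread across processes or machines would need a shared store with equivalent atomicity.

```python
import threading
from decimal import Decimal


class WalletLedger:
    """Tracks spendable balance so parallel Sub-agents cannot double-spend it."""

    def __init__(self, balance: Decimal) -> None:
        self._available = balance
        self._reserved = {}
        self._lock = threading.Lock()

    def reserve(self, agent: str, amount: Decimal) -> bool:
        # The check and the deduction happen under one lock, so two agents
        # can never both pass the balance check for the same funds.
        with self._lock:
            if agent in self._reserved or amount > self._available:
                return False
            self._available -= amount
            self._reserved[agent] = amount
            return True

    def release(self, agent: str) -> None:
        """Return an unused reservation to the pool, e.g. after a cancellation."""
        with self._lock:
            self._available += self._reserved.pop(agent, Decimal("0"))

    @property
    def available(self) -> Decimal:
        with self._lock:
            return self._available
```

With a balance of 100 USDC, if A reserves 80 first, B's request for 80 is refused immediately instead of failing later on-chain after gas has been spent.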

Real Architecture Example: Crypto DeFi Management Multi-Agent System

A complete DeFi management multi-Agent system might include the following Sub-agents.

A data collection Agent (read-only) continuously scans interest rates, liquidity depth, and Gas fees across DeFi protocols. It has no write or signing permissions.

An analysis Agent (reasoning only) receives the data Agent's reports and uses an LLM to reason about and evaluate strategy opportunities. Its outputs are recommendation text only; it executes no operations.

A risk evaluation Agent (read + reasoning) evaluates the risk of operations proposed by the analysis Agent: is the position size too large, is the Gas cost worth it, is a liquidation line close? When necessary, it can alert the user directly without going through the Orchestrator.

An execution Agent (limited write) only executes on-chain operations when it receives an authorized Orchestrator instruction and the risk Agent hasn't flagged high risk. It is further constrained by a single-operation amount cap and whitelist-only contract interactions.

An Orchestrator (coordination) combines the analysis Agent's recommendations with the risk Agent's assessment to decide whether to authorize the execution Agent. When high-risk or above-threshold situations arise, it escalates to human confirmation rather than deciding autonomously.
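The gating logic around the execution Agent in this example can be written as a small pure function. Every constant below (the whitelist entry, the cap, the confirmation threshold) is a hypothetical policy value chosen for illustration, and the contract address is a placeholder string rather than a real address.

```python
from dataclasses import dataclass
from decimal import Decimal

# All values below are hypothetical policy settings for this sketch.
CONTRACT_WHITELIST = {"0xAavePoolPlaceholder"}
MAX_SINGLE_OPERATION = Decimal("1000")
HUMAN_CONFIRM_THRESHOLD = Decimal("500")


@dataclass
class Operation:
    contract: str
    amount: Decimal
    orchestrator_authorized: bool
    risk_flagged: bool


def decide(op: Operation) -> str:
    """Return 'reject', 'ask_user', or 'execute' for a proposed operation."""
    # Hard limits first: these hold even if every other Agent is compromised.
    if not op.orchestrator_authorized:
        return "reject"
    if op.contract not in CONTRACT_WHITELIST:
        return "reject"
    if op.amount > MAX_SINGLE_OPERATION:
        return "reject"
    # High-risk or above-threshold operations go to the human, not to an Agent.
    if op.risk_flagged or op.amount > HUMAN_CONFIRM_THRESHOLD:
        return "ask_user"
    return "execute"
```

Keeping this check deterministic and free of LLM calls means its behavior can be audited and unit-tested independently of any prompt.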

Audit Design for Multi-Agent Systems

In a multi-agent system, the root cause of a problem can be buried deep: did the Orchestrator decide wrong? Did the data Agent return bad information? Did the execution Agent's tool fail? Good audit design must record every Agent's Thought/Action/Observation at every point in time, all inter-agent communication messages, and the trigger chain for every on-chain operation, meaning which Thought from which Agent ultimately caused this transaction. Without this audit trail, post-incident investigation is nearly impossible. On-chain operation logs should ideally be stored in a tamper-evident format (on-chain records, or hash-chained or signed logs) as the authoritative source for later tracing. Encryption alone protects confidentiality; it does not reveal whether a log was altered.
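A hash-chained log is one simple way to make an audit trail tamper-evident. In the sketch below each entry's hash covers the previous entry's hash, so editing any earlier record breaks verification of everything after it. The `AuditLog` class and its field names are assumptions for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event's canonical JSON."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self.entries = []

    def append(self, agent: str, kind: str, detail: str) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        event = {"agent": agent, "kind": kind, "detail": detail}
        entry_hash = _digest(prev_hash, event)
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain from the start; any edited entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if _digest(prev_hash, entry["event"]) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

A chain like this only detects edits if the latest hash is also stored somewhere the attacker cannot rewrite, for instance by periodically anchoring it in an on-chain record; otherwise the whole chain could be recomputed after tampering.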

What This Means for Your Money

If you're a developer planning to build multi-Agent crypto systems, remember three principles. First, minimum necessary permissions: each Sub-agent holds only the smallest tool set needed to fulfill its responsibility — minimizing the consequence of attacking any single Sub-agent. Second, trust doesn't propagate: execution Agents can't blindly trust the Orchestrator's word for high-risk operations — they need an independent human confirmation channel that goes directly to the real user, not through other Agents. Third, complete audit trail: every Agent's every decision must have a traceable record. Without audit, post-incident investigation is impossible — and in on-chain contexts where transactions are irreversible, audit is not optional.
