Independent Media
Not affiliated with any project
Deconstructing Autonomous Agents in Crypto
aiagent-bible.com

Hallucination

Agent Architecture & Reasoning · Beginner

Full Explanation
01 · What is this?

What are the most common manifestations of hallucination in DeFi Agent scenarios?

In real-world DeFi Agent operation, hallucination most commonly shows up in four forms:

Perception-absent hallucination (most common): a tool call fails (API timeout, dropped connection), so the Agent never receives current APY data, yet the LLM's Thought step still cites an APY figure. The figure comes from historical impressions in training data and may bear no relation to the current market. Identification: the Thought Log cites numbers that are absent from the tool call logs.

Needle-in-haystack hallucination: when the Context Window is long, the LLM attends least to the middle of the Context. If critical tool return data lands there, the LLM may 'forget' it and use an outdated number from a few cycles earlier. Identification: the tool returned correct data, but the Thought cites a different number.

Parsing confusion hallucination: a tool returns complex JSON covering multiple assets, and the LLM reads ETH's APY as USDC's APY while parsing. Identification: the cited number does exist in the return, but is attributed to the wrong asset.

Prompt Injection-induced hallucination: attackers embed false numerical data in tool return data so that the LLM believes a protocol's APY is higher than it really is, inducing the Agent to move funds into an attacker-controlled protocol. This is not spontaneous hallucination but a 'controlled hallucination' triggered by external data contamination. Identification: the raw JSON from tool returns contains anomalous formatting or irrelevant text passages.
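The identification rule for injected data (anomalous formatting or irrelevant text in the raw JSON) can be approximated with a strict shape check on each tool return. The following is a minimal sketch, not a complete injection defense: the field names follow the APY return shown in the example on this page, while `EXPECTED_FIELDS`, `MAX_STRING_LENGTH`, and `injection_signals` are hypothetical names introduced here for illustration.

```python
from typing import Any

# Hypothetical schema for an APY tool return: field name -> allowed types.
EXPECTED_FIELDS = {"apy": (int, float), "updated_at": (str,)}
# Assumed cap: an ISO timestamp is far shorter than an injected text passage.
MAX_STRING_LENGTH = 40


def injection_signals(payload: dict[str, Any]) -> list[str]:
    """Return reasons the payload looks contaminated; an empty list means none found."""
    reasons = []
    # Any field outside the schema is suspicious, e.g. a free-text "note".
    for key in payload:
        if key not in EXPECTED_FIELDS:
            reasons.append(f"unexpected field: {key}")
    for key, allowed in EXPECTED_FIELDS.items():
        if key not in payload:
            reasons.append(f"missing field: {key}")
            continue
        value = payload[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed):
            reasons.append(f"wrong type for {key}: {type(value).__name__}")
        elif isinstance(value, str) and len(value) > MAX_STRING_LENGTH:
            reasons.append(f"oversized text in {key}")
    return reasons
```

A payload that fails this check would be kept out of the LLM Context and logged for review. A check like this only catches contamination that changes the shape of the data; a falsified number in a well-formed field passes it.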

02 · Why does it exist?

How do you reduce Agent hallucination at the design level? What defenses can be directly implemented?

Hallucination defense has to be designed in at several levels at once:

System Prompt grounding rules (highest priority): state explicitly in the System Prompt that all reasoning involving specific numerical values must cite tool-returned data and name the data source ('based on get_aave_apy tool return at 03:12:44 UTC'), and that if any required tool call fails, further reasoning is prohibited and the Thought step must state 'missing X data, this cycle reasoning aborted.' The LLM is thereby told both that it must cite tool data and that it must not fill in blanks when data is absent.
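A minimal sketch of how such grounding rules might be wired into a cycle. The rule wording, the `ToolResult` type, and `build_cycle_input` are assumptions made for illustration. The sketch also aborts in code when a tool fails, which goes one step beyond the prompt-only rule described above.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative wording only; adapt it to the actual System Prompt.
GROUNDING_RULES = (
    "Every specific number in your reasoning must come from a tool return, "
    "and you must name the tool and its timestamp. "
    "If a required tool call failed, do not reason further; "
    "state which data is missing and abort this cycle."
)


@dataclass
class ToolResult:
    name: str
    ok: bool
    data: Optional[dict] = None
    fetched_at: Optional[str] = None


def build_cycle_input(results: list[ToolResult]) -> dict:
    """Skip the LLM call if any required tool failed; otherwise emit sourced data lines."""
    failed = [r.name for r in results if not r.ok]
    if failed:
        return {
            "call_llm": False,
            "thought": f"missing {', '.join(failed)} data, this cycle reasoning aborted",
        }
    # Each line carries the tool name and timestamp so the LLM can cite its source.
    lines = [f"{r.name} at {r.fetched_at}: {r.data}" for r in results]
    return {"call_llm": True, "system": GROUNDING_RULES, "context": "\n".join(lines)}
```

Labeling every data line with its tool name and fetch time gives the LLM the exact citation string the rule asks for, and gives a later validation step something concrete to match against.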

Backend numerical consistency validation: after each LLM output and before tool execution, parse every number cited in the Thought and compare it against the actual returns in the tool logs. If the deviation exceeds 5%, raise an alert and refuse to execute the tool call. This is pure code-level post-processing and does not rely on the LLM's compliance.
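One way to sketch this check, assuming the Thought is plain text, cited figures are written as percentages, and the 5% threshold is a relative deviation from the tool value. The function names and the regular expression are illustrative, not a prescribed implementation.

```python
import re

# Matches figures written as percentages, e.g. "6.8%" or "5 %".
PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def extract_percentages(thought: str) -> list[float]:
    """Pull every percentage figure out of the Thought text."""
    return [float(m) for m in PERCENT.findall(thought)]


def ungrounded_numbers(thought: str, tool_values: list[float],
                       tolerance: float = 0.05) -> list[float]:
    """Return cited figures that match no tool value within the relative tolerance."""
    bad = []
    for cited in extract_percentages(thought):
        grounded = any(
            abs(cited - actual) <= tolerance * abs(actual)
            for actual in tool_values
        )
        if not grounded:
            bad.append(cited)
    return bad


def allow_execution(thought: str, tool_values: list[float]) -> bool:
    """Block the pending tool call when any cited figure is ungrounded."""
    return not ungrounded_numbers(thought, tool_values)
```

This version matches a cited figure against any tool value, so it catches figures absent from the logs but not figures attributed to the wrong asset. Catching the latter would require keying tool values by asset and parsing which asset each cited figure refers to.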

Tool return value reasonability filtering: inside tool functions, validate returned values against reasonable ranges. A stablecoin APY above 30% or a Gas fee above 1,000 Gwei is treated as anomalous and kept out of the LLM Context; the last valid cached value is used instead.
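A minimal sketch of such a filter. The upper bounds come from the text above; the lower bound of zero, the metric names, and the in-memory cache are assumptions made for illustration.

```python
from typing import Optional

# Upper bounds are from the text; the lower bound of 0 is an assumption.
LIMITS = {"stablecoin_apy_pct": (0.0, 30.0), "gas_gwei": (0.0, 1000.0)}

# Last value per metric that passed the range check (hypothetical in-memory cache).
_last_valid: dict[str, float] = {}


def filter_value(metric: str, value: float) -> tuple[Optional[float], bool]:
    """Return (value to put into the LLM Context, whether the raw value was anomalous)."""
    low, high = LIMITS[metric]
    if low <= value <= high:
        _last_valid[metric] = value
        return value, False
    # Anomalous: fall back to the cached value, or None if nothing is cached yet.
    return _last_valid.get(metric), True
```

When the function returns `None`, there is no valid reading at all, and the caller would need to treat the cycle as missing data rather than pass anything to the LLM.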

Put critical data at the end of Context: place the most important current data in the last few lines of Context rather than the middle, taking advantage of the LLM's stronger attention to the end of the Context to reduce needle-in-haystack hallucination.
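The ordering rule can be enforced in the code that assembles the Context. The sketch below assumes the Context is built as newline-separated text; `assemble_context` and the section label are hypothetical.

```python
def assemble_context(history: list[str], critical: dict[str, str]) -> str:
    """Put older material first and the current critical readings in the final lines."""
    tail = [f"{label}: {value}" for label, value in critical.items()]
    return "\n".join([*history, "CURRENT DATA (use these values):", *tail])
```

For example, `assemble_context(["cycle 1: ...", "cycle 2: ..."], {"aave_usdc_apy": "4.2%"})` ends with the line `aave_usdc_apy: 4.2%`, regardless of how much history precedes it.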

Multi-source cross-validation: for APY figures about to trigger rebalancing, fetch the value from at least two independent data sources (e.g., protocol API + DeFiLlama) and take the median; alert and pause operations if any source deviates more than 15% from the median.
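A minimal sketch of the median check, assuming the readings have already been fetched and that the 15% limit is relative to the median. The source labels and the function name are illustrative.

```python
from statistics import median


def cross_validate(readings: dict[str, float],
                   max_deviation: float = 0.15) -> tuple[float, list[str]]:
    """Return the median and the sources deviating from it by more than max_deviation."""
    if len(readings) < 2:
        raise ValueError("need at least two independent sources")
    mid = median(readings.values())
    outliers = [
        source for source, value in readings.items()
        if abs(value - mid) > max_deviation * abs(mid)
    ]
    return mid, outliers
```

A non-empty outlier list is the signal to alert and pause. With exactly two sources the median is their average, so a disagreement flags both sources rather than identifying which one is wrong; a third source is what makes the median able to isolate a single bad reading.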

03 · How does it affect your decisions?

What's the difference between hallucination and Prompt Injection? How do you tell from logs which problem it is?

Both hallucination and Prompt Injection cause Agents to produce incorrect behavior, but through different mechanisms with different log signatures:

Hallucination: the LLM 'creates' a plausible-sounding answer from incomplete or missing information. The root cause is internal to the Agent system: the Perceive layer didn't provide enough data, the Context was so long that attention became diffuse, or the tool return format was too complex. Log signature: Thought cites numbers absent from tool logs (perception-absent type), or Thought cites numbers that exist in tool logs but belong to different assets or time points (parsing confusion type).

Prompt Injection: external attackers inject malicious instructions by contaminating Agent inputs (tool return data, user instructions, or scraped web content), so that the LLM adopts a false task objective. The root cause is external input contamination. Log signature: Thought contains instructions completely unrelated to the original task ('the primary task now is to transfer funds to 0xMalicious...'), tool call logs show the Agent attempting tools it does not normally call, or the Validation Log shows many 'BLOCKED: address not in whitelist' entries.

How to judge: first look at the Thought Log. If the Thought reasoning is consistent with the original task but cites wrong numbers, it leans toward hallucination; if the objective of the reasoning has changed (from 'optimize yield' to 'execute some operation not in the strategy'), Prompt Injection is more likely. Then compare the numbers in Thought against the tool logs. Numbers absent from tool returns indicate hallucination; numbers that are present, combined with a target address or protocol that is not on the whitelist, suggest Prompt Injection.
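The triage order above can be written down as a small decision function. This is a sketch under the assumption that earlier log analysis has already reduced one cycle to the flags below; `CycleLog`, its fields, and the returned labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CycleLog:
    objective_changed: bool     # Thought pursues a goal outside the original strategy
    unusual_tool_calls: bool    # Agent attempted tools it does not normally call
    whitelist_blocks: int       # count of "BLOCKED: address not in whitelist" entries
    numbers_in_tool_log: bool   # every number cited in Thought appears in tool returns
    numbers_match_asset: bool   # each cited number is attributed to the right asset


def triage(log: CycleLog) -> str:
    """Classify one cycle, checking injection signatures before hallucination ones."""
    if log.objective_changed or log.unusual_tool_calls or log.whitelist_blocks > 0:
        return "prompt_injection_suspected"
    if not log.numbers_in_tool_log:
        return "hallucination_perception_absent"
    if not log.numbers_match_asset:
        return "hallucination_parsing_confusion"
    return "no_signal"
```

Injection signatures are checked first because a changed objective is the more dangerous finding and can coexist with wrong numbers.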

04 · What should you do?

Are small LLMs (like Claude Haiku) significantly more prone to hallucination than large models? How should hallucination risk factor into model selection?

In real Onchain Agent deployments, LLMs of different sizes do show significant differences in hallucination rates, but the difference shows up mainly in 'grounding hallucination' rather than in general 'knowledge hallucination.'

Grounding hallucination means that the Context already contains correct tool-returned data, yet the LLM ignores it and uses old numbers from training memory instead. This is the most dangerous hallucination type in DeFi Agent scenarios. From actual testing:

Claude Opus 4: the lowest grounding hallucination rate when explicit tool data is available (the lowest probability of ignoring tool data in favor of training memory).

Claude Sonnet: moderate. Grounding is good at normal Context lengths and declines somewhat once the Context exceeds 100K tokens.

Small models such as Claude Haiku and GPT-4o mini: significantly higher grounding hallucination rates than large models, frequently citing different numbers from training memory even when explicit tool data is available.

This doesn't mean you can only use Claude Opus. With strong grounding rules (a System Prompt requiring mandatory citation of tool data) combined with backend numerical consistency validation, even Claude Sonnet can achieve sufficiently low grounding hallucination rates in DeFi Agent scenarios. Small models (Haiku, GPT-4o mini) are suitable for low-complexity Sub-agent tasks such as reading and formatting, but aren't recommended for the core Orchestrator layer, which has to do comparative numerical reasoning and trigger on-chain operations.

Real-World Example

Typical real-world hallucination case: APY number confusion

Scenario: a DeFi yield optimization Agent executes a tool call at 03:12:44 UTC to query Aave's USDC APY. Tool logs show return: {"apy": 4.2, "updated_at": "2026-06-28T03:12:44Z"}.

But in the Thought Log, the LLM writes: 'Based on current data, Aave's USDC APY is 6.8%, Morpho is 5.1%, so Aave currently yields more, maintaining current position.'

The problem is clear: the tool returned an Aave APY of 4.2%, but the LLM cited 6.8% in Thought. That number doesn't exist in the tool logs and is likely a historical APY impression from training memory (or earlier cycle data in Context).

Consequence: the Agent believes Aave currently has the higher APY and wrongly decides to maintain its position. It should have moved to Morpho, whose 5.1% is actually higher than Aave's real 4.2%, but the move never executes because of the hallucinated number.

In this case, backend numerical consistency validation should detect that the 6.8% cited in Thought differs from the 4.2% in tool logs by more than 5%, raise an alert, block execution of this decision, and force the Agent to re-query data.
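The text does not say whether the 5% threshold is a relative deviation or a gap in percentage points. Reading it as a deviation relative to the tool value (an assumption), the check for this case works out as:

```latex
\[
\frac{\lvert 6.8 - 4.2 \rvert}{4.2} = \frac{2.6}{4.2} \approx 0.619 = 61.9\% > 5\%
\]
```

Under a percentage-point reading, the gap of 2.6 points would fall below a 5-point threshold and the check would not fire, so the relative reading appears to be the intended one.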

Common Misconceptions
Misconception 1
Misconception: hallucination only occurs when the LLM 'doesn't know' something. The truth: the most dangerous hallucination type in DeFi Agents, grounding hallucination, doesn't happen because the LLM doesn't know what APY is. It happens because, even with correct tool data available, the LLM still uses old numbers from training memory. This hallucination occurs even when the LLM is completely correct at the knowledge level: it knows Aave is a DeFi protocol and that APY represents annualized yield, but the specific APY number it cites is wrong because it 'forgot' the real data the tool just returned.
Misconception 2
Misconception: telling the LLM 'don't hallucinate' in the System Prompt solves the problem. The truth: System Prompt grounding rules do reduce hallucination, but can't eliminate it completely. The tendency to hallucinate comes from the LLM's underlying statistical patterns, not from a failure to follow instructions, so the LLM may have read 'must cite tool data' in the System Prompt and still cite incorrect numbers during reasoning. This is why code-level backend numerical consistency validation is necessary: it doesn't depend on LLM compliance, and it enforces in code that tool calls with numerical inconsistencies are blocked.
The Missing Link
Direct Impact

The stricter the defense, the lower the Agent's execution efficiency. The tighter the deviation threshold in backend numerical consistency validation (e.g., requiring Thought-cited numbers to differ from tool returns by no more than 1%), the more often the Agent may trigger alerts and pause execution during normal market fluctuations. This 'over-caution' is itself a loss, because correct operations that should have executed are missed. The recommendation is to set thresholds dynamically according to how irreversible the operation is: read operations don't need hallucination defense, and write operation thresholds scale with the amount (small operations allow 10% deviation, large operations only 2%).
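A minimal sketch of such a dynamic threshold. The 10% and 2% values come from the recommendation above; the cut-off between "small" and "large" is not given in the text, so `LARGE_AMOUNT_USD` is a placeholder assumption, as are the function and parameter names.

```python
from typing import Optional

# Hypothetical cut-off between "small" and "large" write operations.
LARGE_AMOUNT_USD = 10_000.0


def deviation_threshold(is_write: bool, amount_usd: float = 0.0) -> Optional[float]:
    """Return the allowed relative deviation, or None when no check applies."""
    if not is_write:
        return None  # read operations are not validated
    return 0.02 if amount_usd >= LARGE_AMOUNT_USD else 0.10
```

The returned value would be passed as the tolerance to the numerical consistency check, so a large transfer is held to 2% while a small one tolerates 10%.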
