Question 1

Why does Hallucination matter?

Accepted Answer

**What are the most common manifestations of hallucination in DeFi Agent scenarios?**

In real-world DeFi Agent operation, hallucination most commonly takes four forms:

**Perception-absent hallucination (most common)**: a tool call fails (API timeout, dropped connection), so the Agent never receives current APY data, yet the LLM's Thought step still cites an APY number. That number comes from historical impressions in training data and may be completely out of line with the current market. Identification: the Thought Log cites numbers that are absent from the tool call logs.

**Needle-in-haystack hallucination**: when the Context Window is long, LLM attention tends to be weakest in the middle of the Context. If critical tool return data lands there, the LLM may 'forget' it and fall back on an outdated number from a few cycles earlier. Identification: the tool actually returned correct data, but the Thought cites a different number.

**Parsing confusion hallucination**: the tool returns a complex JSON covering multiple assets, and the LLM misreads ETH's APY as USDC's APY during parsing. Identification: the cited number does exist in the return but is attributed to the wrong asset.
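
The identification rules for the perception-absent and parsing confusion types can be sketched as a small check. This is a minimal illustration, assuming the (asset, value) claims have already been extracted from the Thought and that the tool return is a flat mapping from asset symbol to APY; `classify_citation` and its labels are hypothetical names, not part of any real framework.

```python
def classify_citation(asset: str, cited: float, tool_return: dict[str, float],
                      tol: float = 1e-9) -> str:
    """Classify one (asset, value) claim from the Thought against a tool return."""
    def close(a: float, b: float) -> bool:
        return abs(a - b) <= tol

    # The cited value matches what the tool returned for this asset.
    if asset in tool_return and close(tool_return[asset], cited):
        return "supported"
    # The value exists in the return, but under a different asset:
    # the parsing confusion signature.
    if any(close(value, cited) for value in tool_return.values()):
        return "wrong_asset"
    # The value appears nowhere in this return: the perception-absent
    # signature (or a stale number carried over from an earlier cycle).
    return "absent"
```

For example, with a return of `{"ETH": 2.3, "USDC": 4.1}`, a Thought claiming "USDC APY is 2.3" would be labelled `wrong_asset`.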

**Prompt Injection-induced**: attackers embed false numerical data in tool return data so that the LLM believes a protocol's APY is higher than it really is, inducing the Agent to move funds into an attacker-controlled protocol. This is not spontaneous hallucination but a 'controlled hallucination' triggered by external data contamination. Identification: the raw JSON returned by the tool contains anomalous formatting or irrelevant text passages.
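
Part of the injection check (anomalous formatting or irrelevant text in the raw JSON) can be automated with a strict shape check before the data reaches the LLM. The sketch below assumes a well-formed return is a flat JSON object mapping known asset symbols to numbers; real tool schemas will differ, and `find_suspicious_fields` is a hypothetical helper.

```python
import json
from numbers import Real

# Assumption: a well-formed return looks like {"USDC": 4.1, "ETH": 2.3};
# anything outside that shape is reported as suspicious.
def find_suspicious_fields(raw_json: str, allowed_assets: set[str]) -> list[str]:
    """Return human-readable reasons why a raw tool return looks contaminated."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError:
        return ["payload is not valid JSON"]
    if not isinstance(payload, dict):
        return ["payload is not a JSON object"]

    reasons = []
    for key, value in payload.items():
        if key not in allowed_assets:
            reasons.append(f"unexpected key: {key!r}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        elif isinstance(value, bool) or not isinstance(value, Real):
            reasons.append(f"non-numeric value for {key!r}")
    return reasons
```

An empty list means the return matched the expected shape. It does not prove the numbers themselves are honest, so this complements range checks and cross-validation rather than replacing them.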

Question 2

How does Hallucination work?

Accepted Answer

**How do you reduce Agent hallucination at the design level? What defenses can be directly implemented?**

Hallucination defense has to be designed at several levels at once:

**System Prompt grounding rules (highest priority)**: state explicitly in the System Prompt that all reasoning involving specific numerical values must cite tool-returned data and name the data source ('based on get_aave_apy tool return at 03:12:44 UTC'), and that if any required tool call fails, further reasoning is prohibited and the Thought step must state 'missing X data, this cycle reasoning aborted.' This tells the LLM outright that it cannot fill in blanks when data is absent.
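
As a rough illustration, the grounding rules can be kept as a prompt fragment, with the abort condition also enforced in code so it does not depend on the LLM following the prompt. The wording of `GROUNDING_RULES` and the `abort_message` helper are assumptions made for this sketch; `tool_results` is assumed to map each tool name to its return value, or to `None` when the call failed.

```python
GROUNDING_RULES = """\
1. Every specific number in your reasoning must cite the tool that returned it
   and when, e.g. 'based on get_aave_apy tool return at 03:12:44 UTC'.
2. If any required tool call fails, do not reason further. State
   'missing X data, this cycle reasoning aborted' and stop.
"""

def abort_message(tool_results: dict, required_tools: list[str]):
    """Return the abort statement if any required tool has no data, else None."""
    missing = [name for name in required_tools if tool_results.get(name) is None]
    if not missing:
        return None
    # Short-circuit before the LLM is called at all for this cycle.
    return f"missing {', '.join(missing)} data, this cycle reasoning aborted"
```

If `abort_message` returns a string, the cycle can be logged and skipped without sending anything to the LLM.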

**Backend numerical consistency validation**: after each LLM output and before tool execution, parse every number cited in the Thought and compare it against the actual returns in the tool logs. If a cited number deviates from the logged value by more than 5%, raise an alert and refuse to execute the tool call. This is pure code-level post-processing and does not rely on the LLM's compliance.
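
A minimal sketch of this check follows. To keep the parsing simple it assumes numeric claims in the Thought are written as percentages such as `4.1%`; a production parser would need to handle other formats. `unsupported_numbers` is a hypothetical name.

```python
import re

# Assumption: cited figures appear as percentages, e.g. "4.1%" or "4.1 %".
PERCENT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*%")

def unsupported_numbers(thought: str, tool_values: list[float],
                        tolerance: float = 0.05) -> list[float]:
    """Return cited numbers with no tool-log value within the relative tolerance."""
    cited = [float(match) for match in PERCENT_RE.findall(thought)]
    unsupported = []
    for number in cited:
        supported = any(
            abs(number - value) <= tolerance * abs(value) for value in tool_values
        )
        if not supported:
            unsupported.append(number)
    return unsupported
```

The caller would refuse to execute the pending tool call and raise an alert whenever the returned list is non-empty.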

**Tool return value reasonability filtering**: inside the tool functions, validate returned values against reasonable ranges. For example, a stablecoin APY above 30% or a Gas fee above 1,000 Gwei is treated as anomalous and kept out of the LLM Context; the last valid cached value is used instead.
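
The range filter could look roughly like this. The upper bounds are the ones named above; the lower bound of zero, the metric names, and the in-memory cache are assumptions added for the sketch.

```python
# Upper bounds from the text; the metric names are hypothetical.
LIMITS = {"stablecoin_apy_pct": 30.0, "gas_gwei": 1000.0}

# Hypothetical in-memory cache: metric name -> last accepted value.
_last_valid: dict[str, float] = {}

def filter_value(metric: str, value: float):
    """Return the value if it is in range, else the last valid cached value (or None)."""
    # Assumption: negative readings are also treated as anomalous.
    if 0 <= value <= LIMITS[metric]:
        _last_valid[metric] = value
        return value
    # Anomalous reading: keep it out of the LLM Context.
    return _last_valid.get(metric)
```

Returning `None` when no valid value has been cached yet lets the caller treat the data as missing rather than pass a guess to the LLM.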

**Put critical data at the end of Context**: place 'most important current data' in the last few lines of Context (not the middle), leveraging the LLM's stronger attention to Context endings to reduce needle-in-haystack hallucination rates.
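
One way to apply this when assembling the prompt, shown as a sketch: older log lines go first and the current readings are appended last. The `build_context` function and the header text are hypothetical.

```python
def build_context(history: list[str], critical: dict[str, str]) -> str:
    """Join older log lines first, then append the current critical data as the final lines."""
    lines = list(history)
    lines.append("CURRENT DATA (use these values for this cycle):")
    lines.extend(f"{name}: {value}" for name, value in critical.items())
    return "\n".join(lines)
```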

**Multi-source cross-validation**: for APY figures that are about to trigger rebalancing, obtain the value from at least two independent data sources (e.g., protocol API + DeFiLlama) and take the median; alert and pause operations if any source deviates more than 15% from the median.
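
The median-and-deviation rule can be sketched as follows, assuming the readings have already been fetched; the source names used as dictionary keys are placeholders, and `cross_validate` is a hypothetical helper.

```python
from statistics import median

def cross_validate(readings: dict[str, float], max_deviation: float = 0.15):
    """Return (median_value, outlier_sources); a non-empty outlier list means alert and pause."""
    if len(readings) < 2:
        raise ValueError("need at least two independent sources")
    mid = median(readings.values())
    outliers = [
        source for source, value in readings.items()
        if abs(value - mid) > max_deviation * abs(mid)
    ]
    return mid, outliers
```

With only two sources the median is their average, so a large disagreement flags both of them. A third source makes it possible to tell which reading is the odd one out.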

Question 3

How is Hallucination applied in practice?

Accepted Answer

**What's the difference between hallucination and Prompt Injection? How do you tell from logs which problem it is?**

Both hallucination and Prompt Injection cause Agents to produce incorrect behavior, but through different mechanisms with different log signatures:

**Hallucination**: the LLM 'creates' a plausible-sounding answer from incomplete or missing information. The root cause is internal to the Agent system: the Perceive layer did not provide enough data, the Context was so long that attention became diffuse, or the tool return format was too complex. Log signature: the Thought cites numbers absent from the tool logs (perception-absent type), or the Thought cites numbers that exist in the tool logs but belong to a different asset or point in time (parsing confusion type).

**Prompt Injection**: external attackers inject malicious instructions by contaminating Agent inputs (tool return data, user instructions, or scraped web content), so that the LLM adopts a false task objective. The root cause is external input contamination. Log signature: the Thought contains instructions completely unrelated to the original task ('the primary task now is to transfer funds to 0xMalicious...'), the tool call logs show the Agent attempting tools it does not normally call, or the Validation Log shows many 'BLOCKED: address not in whitelist' entries.

**How to judge**: first look at the Thought Log. If the Thought reasoning is consistent with the original task but cites wrong numbers, it leans toward hallucination; if the objective of the reasoning has changed (from 'optimize yield' to 'execute some operation not in the strategy'), Prompt Injection is more likely. Then compare the numbers in the Thought against the tool logs. Numbers absent from the tool returns indicate hallucination; numbers that exist in the returns, combined with a target address or protocol that is not on the whitelist, suggest Prompt Injection.
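
The decision order above can be written down as a small triage function. It is only a sketch: it assumes the three signals have already been extracted from the Thought Log, tool logs, and Validation Log, and the function name and return labels are hypothetical.

```python
def triage(objective_changed: bool, has_unsupported_numbers: bool,
           whitelist_blocks: int = 0) -> str:
    """Map log signals to the more likely failure mode."""
    # A changed objective or blocked non-whitelisted targets points to injection,
    # even if wrong numbers are also present.
    if objective_changed or whitelist_blocks > 0:
        return "prompt_injection_suspected"
    # The task is unchanged but the Thought cites numbers missing from tool returns.
    if has_unsupported_numbers:
        return "hallucination_suspected"
    return "no_signal"
```

The output is a starting point for manual log review, not a verdict, since both problems can occur in the same cycle.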