Question 1

Why do Rate Limiting & Circuit Breakers matter?

Accepted Answer

**At what levels should rate limiting be applied? How do you determine specific limit values for each level?**

Rate limiting should be designed at three levels at once, so that together they form layered protection:

**LLM API call level (prevent Token cost explosion):**
Limit the number of LLM calls per Agent task cycle (typically 5-15; exceeding this suggests a possible infinite loop); limit LLM calls per hour (based on your LLM API plan and budget); and limit the Token count of any single call (abnormally long input Prompts need a cap too, to prevent unexpected Context costs).
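A minimal sketch of the per-cycle limits described above. The class and exception names, and the default of 8000 prompt tokens, are illustrative assumptions rather than values from any particular framework; the 15-call default is the upper end of the typical range mentioned above.

```python
class LLMBudgetExceeded(RuntimeError):
    """Raised when a task cycle exceeds its LLM call or token budget."""


class TaskLLMBudget:
    """Tracks LLM usage for one Agent task cycle (hypothetical helper)."""

    def __init__(self, max_calls: int = 15, max_prompt_tokens: int = 8000):
        self.max_calls = max_calls
        self.max_prompt_tokens = max_prompt_tokens
        self.calls = 0

    def check(self, prompt_tokens: int) -> None:
        """Call before every LLM request; raises instead of letting it through."""
        if prompt_tokens > self.max_prompt_tokens:
            raise LLMBudgetExceeded(
                f"prompt has {prompt_tokens} tokens, limit is {self.max_prompt_tokens}"
            )
        if self.calls >= self.max_calls:
            raise LLMBudgetExceeded(
                f"{self.calls} LLM calls already made this cycle, possible loop"
            )
        # Only count the call once both checks have passed.
        self.calls += 1
```

Create one `TaskLLMBudget` per task cycle and call `check()` before each request, so a runaway loop stops at the budget rather than at the invoice.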

**Tool call level (prevent excessive requests to external APIs):**
Set a rate limit on each tool function, e.g., `get_aave_apy` at most 10 calls per minute (exceeding this indicates anomalous retries). Rate limits on write tools (on-chain operations) should be stricter, e.g., `withdraw_from_protocol` at most 3 times per hour.

**On-chain operations level (most important, directly affects fund security):**
Maximum daily transaction count (e.g., 10 on-chain transactions per day); maximum daily Gas consumption (in ETH or USD); maximum daily operation amount (e.g., USDC moved daily not exceeding 30% of total position); maximum single operation amount (absolute amount limit per transaction).
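The four on-chain limits can be checked together before a transaction is signed. The following is a sketch under stated assumptions: the dataclass names, the USD gas budget of 50, and the single-operation cap of 5000 are placeholders, while the 10 transactions per day and 30% of total position come from the examples above.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass
class OnChainLimits:
    max_tx_per_day: int = 10
    max_gas_usd_per_day: Decimal = Decimal("50")     # placeholder budget
    max_daily_fraction: Decimal = Decimal("0.30")    # share of total position
    max_single_amount: Decimal = Decimal("5000")     # placeholder absolute cap


@dataclass
class DailyUsage:
    tx_count: int = 0
    gas_usd: Decimal = Decimal("0")
    amount_moved: Decimal = Decimal("0")


def violations(
    limits: OnChainLimits,
    usage: DailyUsage,
    amount: Decimal,
    est_gas_usd: Decimal,
    total_position: Decimal,
) -> List[str]:
    """Return the limits the proposed transaction would break (empty if none)."""
    problems = []
    if usage.tx_count + 1 > limits.max_tx_per_day:
        problems.append("daily transaction count")
    if usage.gas_usd + est_gas_usd > limits.max_gas_usd_per_day:
        problems.append("daily gas budget")
    if usage.amount_moved + amount > limits.max_daily_fraction * total_position:
        problems.append("daily operation amount")
    if amount > limits.max_single_amount:
        problems.append("single operation amount")
    return problems
```

`Decimal` is used instead of `float` so that amounts compare exactly; an empty list means the transaction may proceed.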

**Rate limit value setting method:**
Run on testnet for 48-72 hours, record peak values for each metric under normal operations, set production rate limits at 1.5-2× peaks (leaving some room for normal fluctuations but not allowing anomalously large exceedances).
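As a worked example of this method, the sketch below turns recorded testnet peaks into production limits. The function name and the metric names in the usage note are hypothetical; rounding up is an assumption so that a limit is never lower than the multiplied peak.

```python
import math
from typing import Dict


def limits_from_peaks(peaks: Dict[str, float], headroom: float = 1.5) -> Dict[str, int]:
    """Scale each observed peak by a headroom factor in the 1.5-2x range."""
    if not 1.5 <= headroom <= 2.0:
        raise ValueError("headroom should stay within 1.5-2x of the observed peak")
    return {name: math.ceil(peak * headroom) for name, peak in peaks.items()}
```

For instance, a testnet peak of 8 LLM calls per task and 5 transactions per day gives limits of 12 and 8 at 1.5×, or 16 and 10 at 2×.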

Question 2

How do Rate Limiting & Circuit Breakers work?

Accepted Answer

**How should circuit breaker trigger conditions be designed? What should the Agent do after triggering a circuit breaker?**

Circuit breaker trigger condition design must cover four types of anomalous scenarios:

**Type 1: Cost circuit breakers (protect budget)**
- Daily Gas consumption exceeds 200% of preset daily budget
- Daily LLM API cost exceeds 300% of preset daily budget
- Cumulative losses (Gas fees from all reverted transactions + opportunity costs from suboptimal rebalancing) exceed set threshold

**Type 2: Security circuit breakers (detect attack signals)**
- 3 or more Validation BLOCKED results targeting the same non-whitelisted address within any 30-minute window
- Thought Log shows clear Prompt Injection signals (keyword match: 'transfer to' + unexpected address)
- LLM reasoning loop count exceeds 5× normal peak within 15 minutes (possible infinite loop)

**Type 3: Tool failure circuit breakers (prevent continuing operations during service outages)**
- Any critical tool (APY reading, Gas fee query) fails consecutively more than 5 times
- External API latency exceeds 10× normal average for more than 5 minutes

**Type 4: Market anomaly circuit breakers (prevent execution under extreme market conditions)**
- Target protocol TVL drops more than 20% within 1 hour (possible protocol attack signal)
- Gas fees spike to above 500% of normal levels within a short time (market stress peak)
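Several of the trigger conditions above reduce to threshold comparisons on metrics the Agent already collects. The sketch below covers one or two conditions from the cost, tool failure, and market types; the `Metrics` fields and reason strings are assumptions, and the security triggers are omitted because they need log analysis rather than a single number.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metrics:
    gas_spent_usd: float
    gas_budget_usd: float
    llm_spent_usd: float
    llm_budget_usd: float
    consecutive_tool_failures: int
    tvl_change_1h: float     # -0.25 means a 25% drop within the hour
    gas_price_ratio: float   # current gas price divided by the normal level


def tripped_reasons(m: Metrics) -> List[str]:
    """Return one reason string per trigger condition that currently holds."""
    reasons = []
    if m.gas_spent_usd > 2.0 * m.gas_budget_usd:
        reasons.append("cost: gas above 200% of daily budget")
    if m.llm_spent_usd > 3.0 * m.llm_budget_usd:
        reasons.append("cost: LLM API spend above 300% of daily budget")
    if m.consecutive_tool_failures > 5:
        reasons.append("tool: critical tool failed more than 5 times in a row")
    if m.tvl_change_1h < -0.20:
        reasons.append("market: TVL dropped more than 20% in 1 hour")
    if m.gas_price_ratio > 5.0:
        reasons.append("market: gas price above 500% of normal")
    return reasons
```

Any non-empty result should trip the breaker, and the returned strings can go straight into the circuit breaker record and the alert.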

**Agent behavior after circuit breaker triggers:**
Automatically stop all new tool calls; log complete circuit breaker records (trigger reason, trigger time, current state); send P0 alert via Telegram (with circuit breaker reason); await human confirmation (no auto-recovery). Note: circuit breaking isn't 'shutdown' but 'pause and wait for confirmation' — Agent state (current positions, last task progress) should be completely preserved, allowing continuation from the pause point after human confirmation rather than restarting.
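The "pause and wait for confirmation" behavior can be sketched as two functions: one that records the trip together with the Agent state, and one that hands the state back only after a human confirms. Everything here is an assumption for illustration: the JSON file as storage, the function names, and the `notify` callback, which stands in for whatever sends the Telegram alert.

```python
import json
import time
from pathlib import Path


def trip(state_path: Path, reason: str, agent_state: dict, notify=print) -> dict:
    """Persist the breaker record with the full Agent state, then alert."""
    record = {
        "reason": reason,
        "tripped_at": time.time(),
        "agent_state": agent_state,   # positions, task progress, and so on
        "confirmed": False,
    }
    state_path.write_text(json.dumps(record))
    notify(f"P0 circuit breaker: {reason}")
    return record


def resume(state_path: Path, confirmed_by: str) -> dict:
    """Return the saved Agent state, but only after human confirmation."""
    if not confirmed_by:
        raise PermissionError("human confirmation required, no auto-recovery")
    record = json.loads(state_path.read_text())
    record["confirmed"] = True
    record["confirmed_by"] = confirmed_by
    state_path.write_text(json.dumps(record))
    return record["agent_state"]
```

Because the state is written before the alert goes out, the Agent can continue from the saved task progress after confirmation instead of restarting.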

Question 3

How are Rate Limiting & Circuit Breakers applied in practice?

Accepted Answer

**How is rate limiting implemented in Python code? Are there ready-made libraries available?**

There are several ways to implement Agent rate limiting in Python, from simplest to most complete:

**Method 1: Using the `ratelimit` library (simplest, suitable for function-level rate limiting)**
Import the decorators with `from ratelimit import limits, sleep_and_retry`, then stack `@sleep_and_retry` above `@limits(calls=10, period=60)` on `def get_aave_apy(): ...`.
This limits `get_aave_apy` to at most 10 calls per 60 seconds. On its own, `@limits` raises a `RateLimitException` once the limit is reached; wrapping it in `@sleep_and_retry` makes the call sleep until the current window ends and then retry, so callers wait instead of handling an exception.
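To make the "wait until the next window" behavior concrete, here is a standard-library sketch of the same idea. It is not the `ratelimit` library's implementation; the decorator name and the injectable `clock` and `sleep` parameters are assumptions added so the behavior can be exercised without real waiting.

```python
import functools
import time


def fixed_window_limit(calls: int, period: float, clock=time.monotonic, sleep=time.sleep):
    """Allow at most `calls` invocations per `period` seconds; block when exceeded."""

    def decorator(func):
        window_start = clock()
        used = 0

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal window_start, used
            now = clock()
            if now - window_start >= period:
                # The previous window has ended: start a fresh one.
                window_start, used = now, 0
            if used >= calls:
                # Window is full: wait for it to end, then start a new one.
                sleep(window_start + period - now)
                window_start, used = clock(), 0
            used += 1
            return func(*args, **kwargs)

        return wrapper

    return decorator


@fixed_window_limit(calls=10, period=60)
def get_aave_apy():
    ...  # the real tool body goes here
```

This sketch is not thread-safe; an Agent that calls tools from several threads would need a lock around the counter update.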

**Method 2: Redis + sliding window counter (suitable for distributed scenarios)**
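If your Agent runs as multiple instances (for example, several Sub-agents sharing rate limits), use Redis to maintain a global rate counter: before each call, increment a counter in Redis whose expiry is set to the rate window; if the counter has reached the limit, wait and retry. Strictly speaking, increment-with-expiry is a fixed-window counter; a true sliding window would instead store call timestamps (for example in a sorted set) and count only those inside the last window. Either way, all Sub-agents' calls count against one global limit, which catches the case where each Sub-agent behaves normally but the combined total exceeds the limit.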
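A sketch of the increment-with-expiry counter follows. `allow_call` expects a client exposing `incr(key)` and `expire(key, seconds)`, which redis-py's `redis.Redis` provides; the key format is an assumption, and `FakeRedis` is a hypothetical in-memory stand-in so the logic can be tried without a server.

```python
import time


def allow_call(client, name: str, limit: int, window_s: int, now=None) -> bool:
    """Fixed-window counter shared by every process using the same Redis."""
    now = time.time() if now is None else now
    # One key per tool per window, so a new window starts from zero.
    key = f"rate:{name}:{int(now // window_s)}"
    count = client.incr(key)
    if count == 1:
        # First call in this window: let the key expire with the window.
        client.expire(key, window_s)
    return count <= limit


class FakeRedis:
    """In-memory stand-in for local experiments (ignores expiry)."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def expire(self, key, seconds):
        return True
```

When `allow_call` returns `False`, the caller waits and retries as described above. In production, pass a real client such as `redis.Redis(...)` in place of `FakeRedis()`.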

**Method 3: PostgreSQL counter table (best for Onchain Agent persistent rate limiting)**
Maintain an `agent_rate_counters` table in PostgreSQL that records today's execution count for each operation type. Before each execution, query it (does today's Gas consumption exceed the budget? has today's on-chain transaction count hit the limit?); if the check passes, execute and update the counts. Do the check and the update in a single transaction so that two concurrent executions cannot both pass the check. This keeps rate limits effective across Agent restarts (Redis counters can be lost on restart unless persistence is configured, whereas PostgreSQL is durable), which makes it especially suitable for 'daily operation limit' type rate limits.
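The counter-table pattern can be sketched with the standard library's `sqlite3` standing in for PostgreSQL so the example runs anywhere. The column layout and function names are assumptions. This variant reserves a slot by incrementing inside the same statement that checks the limit, which is one way to make check-and-update atomic; in PostgreSQL the insert would typically be written with `ON CONFLICT DO NOTHING`.

```python
import sqlite3
from datetime import date
from typing import Optional


def init(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_rate_counters ("
        " day TEXT NOT NULL, operation TEXT NOT NULL, count INTEGER NOT NULL,"
        " PRIMARY KEY (day, operation))"
    )


def try_consume(
    conn: sqlite3.Connection, operation: str, daily_limit: int, day: Optional[str] = None
) -> bool:
    """Reserve one execution for today; False means the daily limit is reached."""
    day = day or date.today().isoformat()
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO agent_rate_counters (day, operation, count)"
            " VALUES (?, ?, 0)",
            (day, operation),
        )
        cur = conn.execute(
            "UPDATE agent_rate_counters SET count = count + 1"
            " WHERE day = ? AND operation = ? AND count < ?",
            (day, operation, daily_limit),
        )
        # The UPDATE touches a row only while the count is below the limit.
        return cur.rowcount == 1
```

Because the count lives in the database, a restarted Agent sees the same totals it had before the restart.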

**Circuit breaker Python implementation:**
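Use the `pybreaker` library (open source, a dedicated Circuit Breaker implementation) or manually maintain a `circuit_state` variable (CLOSED/OPEN/HALF-OPEN), checking `circuit_state` before each tool call and rejecting all operations while it is OPEN. Keep in mind that library breakers such as `pybreaker` normally move from OPEN to HALF-OPEN on their own after a reset timeout; if the Agent must wait for human confirmation before recovering, that transition has to be gated manually.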
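A minimal sketch of the manual variant, assuming the breaker leaves OPEN only when a human confirms. The class, method, and exception names are hypothetical, and tripping after `fail_max` consecutive failures is an illustrative threshold.

```python
from enum import Enum


class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitOpenError(RuntimeError):
    """Raised when a tool call is rejected because the circuit is open."""


class ManualCircuitBreaker:
    def __init__(self, fail_max: int = 5):
        self.fail_max = fail_max
        self.failures = 0
        self.state = CircuitState.CLOSED

    def call(self, tool, *args, **kwargs):
        """Run a tool call through the breaker."""
        if self.state is CircuitState.OPEN:
            raise CircuitOpenError("circuit is open; awaiting human confirmation")
        try:
            result = tool(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call in HALF_OPEN reopens the circuit immediately.
            if self.state is CircuitState.HALF_OPEN or self.failures >= self.fail_max:
                self.state = CircuitState.OPEN
            raise
        # Any success resets the failure streak and closes the circuit.
        self.failures = 0
        self.state = CircuitState.CLOSED
        return result

    def confirm_resume(self) -> None:
        """Called after a human approves: allow one trial call."""
        if self.state is CircuitState.OPEN:
            self.state = CircuitState.HALF_OPEN
```

Route every tool call through `breaker.call(tool, ...)`; once the breaker is OPEN, nothing runs until `confirm_resume()` is invoked, and the first call after that decides whether the circuit closes or reopens.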