How Maev Works
Overview
Your Agent
│
▼
maev.run(agent, gateway=True)
│
├── Load strategies (background thread, zero latency at call time)
├── Route LLM client via proxy (gateway.maev.dev)
│
▼
Agent runs, every LLM call intercepted
│
├── Retry failed or empty responses automatically
├── Check cost, call count, and duration circuit breakers
├── Detect prompt loops
├── Apply fallback models if primary fails
└── Apply learned strategies from previous runs
│
▼ async, non-blocking
Maev Ingest API
│
├── Normalize events
├── Classify failures (10 categories)
├── Store session timeline
├── Record call data for learning loop
└── Send alerts
│
▼
Maev Dashboard (Autopilot tab)
What gets captured
When you call maev.run(agent, gateway=True), Maev patches your LLM clients in-process and routes calls through gateway.maev.dev. When called without the gateway flag, Maev only patches in-process and sends telemetry directly to the Maev ingest endpoint. Either way, every LLM call is intercepted automatically:
- Input prompts and output completions
- Model name and version
- Token counts and estimated cost
- Latency per call
- Tool and function calls, their arguments, and results
- Errors, retries, and fallbacks
- Autopilot interventions
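As a rough illustration of what in-process interception means, the sketch below wraps a callable so that every invocation is recorded. It does not use Maev's own code: `CallRecord`, `intercept`, and `fake_llm` are hypothetical names, and the record holds only a few of the fields listed above.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CallRecord:
    """One intercepted LLM call (hypothetical schema, for illustration only)."""
    model: str
    prompt: str
    completion: Optional[str] = None
    latency_ms: float = 0.0
    error: Optional[str] = None


def intercept(llm_call: Callable[[str, str], str],
              sink: List[CallRecord]) -> Callable[[str, str], str]:
    """Wrap an LLM call so every invocation appends a CallRecord to `sink`."""
    def wrapper(model: str, prompt: str) -> str:
        record = CallRecord(model=model, prompt=prompt)
        start = time.perf_counter()
        try:
            record.completion = llm_call(model, prompt)
            return record.completion
        except Exception as exc:
            # Record the failure, then let the caller see it unchanged.
            record.error = repr(exc)
            raise
        finally:
            record.latency_ms = (time.perf_counter() - start) * 1000
            sink.append(record)
    return wrapper


def fake_llm(model: str, prompt: str) -> str:
    """Stand-in for a real client call; simply upper-cases the prompt."""
    return prompt.upper()


records: List[CallRecord] = []
patched = intercept(fake_llm, records)
patched("example-model", "hello")
```

The wrapper returns and raises exactly what the wrapped call does, which is why the calling code needs no changes.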
Non-blocking by design
All telemetry is sent asynchronously. Maev adds under 5ms of overhead per LLM call. Your agent never waits for Maev. Circuit breakers and retries fire synchronously when needed, but observability never does.
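One common way to keep telemetry off the hot path is a bounded queue drained by a background thread. The sketch below shows that general pattern under stated assumptions; it is not Maev's implementation. `TelemetrySender` is a hypothetical name, and the `delivered` list stands in for a network call to an ingest endpoint.

```python
import queue
import threading
from typing import List, Optional


class TelemetrySender:
    """Hypothetical fire-and-forget sender: emit() never blocks the caller."""

    def __init__(self, maxsize: int = 1000) -> None:
        self._queue: "queue.Queue[Optional[dict]]" = queue.Queue(maxsize=maxsize)
        self.delivered: List[dict] = []  # stands in for the ingest endpoint
        self.dropped = 0
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def emit(self, event: dict) -> None:
        """Called on the agent's hot path; drops the event if the queue is full."""
        try:
            self._queue.put_nowait(event)
        except queue.Full:
            self.dropped += 1

    def _drain(self) -> None:
        """Background loop; a real sender would do network I/O here."""
        while True:
            event = self._queue.get()
            if event is None:  # sentinel pushed by close()
                return
            self.delivered.append(event)

    def close(self) -> None:
        """Flush remaining events and stop the worker."""
        self._queue.put(None)
        self._worker.join()


sender = TelemetrySender()
sender.emit({"type": "llm_call", "latency_ms": 412})
sender.close()
```

Dropping events when the queue is full is a design choice in this sketch: it trades completeness of telemetry for a guarantee that the agent never waits.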
The Hybrid Layer
Maev combines a proxy gateway with client-side SDK integration.
Pure proxy solutions cannot touch your local code, and pure SDK solutions suffer from version-patching drift. Maev does both:
- Client-Side Interception: maev.run() wraps your code natively. This allows Maev to stop infinite Python loops, enforce local budget constraints, and intercept local tool exceptions directly.
- Gateway Proxying: LLM calls are transparently pushed to gateway.maev.dev. This enables the backend to run prompt optimization cycles asynchronously, without burning CPU cycles on your servers.
Sessions
A session is one complete run of your agent from start to finish. Maev automatically:
- Creates a session on the first LLM call of a run, keyed by your API key and agent name
- Groups all subsequent LLM calls within a 5-minute idle window into that session
- Closes the session via a background cron job after the agent finishes or times out
- Runs failure classification on the captured events
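The grouping rule above can be sketched as follows. This is a minimal illustration, assuming the idle window is measured from the previous call with the same API key and agent name, and that a gap of exactly five minutes still counts as the same session. `group_into_sessions` and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

IDLE_WINDOW_SECONDS = 5 * 60

# (api_key, agent_name, unix timestamp in seconds) -- hypothetical layout
Call = Tuple[str, str, float]


def group_into_sessions(calls: List[Call]) -> List[List[Call]]:
    """Group calls into sessions keyed by (api_key, agent_name)."""
    sessions: List[List[Call]] = []
    open_session: Dict[Tuple[str, str], List[Call]] = {}
    for call in sorted(calls, key=lambda c: c[2]):
        key = (call[0], call[1])
        current = open_session.get(key)
        if current is not None and call[2] - current[-1][2] <= IDLE_WINDOW_SECONDS:
            current.append(call)  # still inside the idle window
        else:
            current = [call]      # first call, or the window has elapsed
            sessions.append(current)
            open_session[key] = current
    return sessions


calls = [
    ("key-1", "support-bot", 0.0),
    ("key-1", "support-bot", 120.0),
    ("key-1", "billing-bot", 130.0),
    ("key-1", "support-bot", 500.0),  # 380 s after the previous support-bot call
]
sessions = group_into_sessions(calls)
```

Here the two early `support-bot` calls share a session, `billing-bot` gets its own, and the call at 500 seconds opens a new `support-bot` session because more than five minutes have passed.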
Failure detection
After a session closes, Maev runs its classification engine against all events. The engine checks for 10 failure categories using rules-based pattern matching. If a failure is found, it is stored with the session and an alert is sent.
See Failure Classification for the full list.
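To make "rules-based pattern matching" concrete, here is a small sketch of how such a classifier could be structured. The three category names and their rules are invented for illustration and are not Maev's actual ten categories. The event fields (`prompt`, `completion`, `error`) are also assumptions.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
Rule = Callable[[List[Event]], bool]


def repeated_prompt(events: List[Event], threshold: int = 3) -> bool:
    """True if the same prompt was sent at least `threshold` times."""
    counts = Counter(e["prompt"] for e in events if e.get("prompt"))
    return any(n >= threshold for n in counts.values())


def has_error(events: List[Event]) -> bool:
    """True if any event carries an error."""
    return any(e.get("error") for e in events)


def empty_final_output(events: List[Event]) -> bool:
    """True if the last event has no completion text."""
    return bool(events) and not events[-1].get("completion")


# Hypothetical categories; each rule is checked independently.
RULES: List[Tuple[str, Rule]] = [
    ("prompt_loop", repeated_prompt),
    ("unhandled_error", has_error),
    ("empty_output", empty_final_output),
]


def classify(events: List[Event]) -> List[str]:
    """Return the name of every rule that matches the session's events."""
    return [name for name, rule in RULES if rule(events)]


looping_session = [{"prompt": "retry step 2", "completion": "ok"}] * 3
failures = classify(looping_session)
```

Because each rule is a plain function over the full event list, adding a category in this design means adding one entry to `RULES`.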
Autopilot: self-healing and self-improving
Autopilot is Maev's active intervention layer. It operates at three levels:
Level 1 (Immediate): Mechanical protections active on every run from day one: retries, circuit breakers, loop detection, fallback models.
Level 2 (Pattern learning): After 20+ runs for an agent, Maev analyzes which interventions worked and starts applying winning strategies automatically.
Level 3 (Cross-agent): Strategies proven across many agents in your organization become global strategies applied to new agents before they have built up their own history.
See Autopilot for a full breakdown.
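The Level 1 protections can be pictured as a guard around each LLM call. The sketch below combines retries, fallback models, and a call-count circuit breaker. It is an assumption-laden illustration, not Maev's code: `guarded_call`, `BudgetExceeded`, the default limits, and the model names are all hypothetical.

```python
from typing import Callable, List, Optional


class BudgetExceeded(RuntimeError):
    """Raised when the call-count circuit breaker trips."""


def guarded_call(call_model: Callable[[str, str], str],
                 prompt: str,
                 models: List[str],
                 max_retries: int = 2,
                 max_calls: int = 10) -> str:
    """Try each model in order, retrying failed or empty responses."""
    calls_made = 0
    last_error: Optional[Exception] = None
    for model in models:  # primary first, then fallbacks
        for _ in range(max_retries + 1):
            if calls_made >= max_calls:
                raise BudgetExceeded(f"call limit of {max_calls} reached")
            calls_made += 1
            try:
                response = call_model(model, prompt)
            except Exception as exc:
                last_error = exc
                continue  # retry the same model
            if response.strip():
                return response
            # Empty response: fall through and retry.
    raise RuntimeError("all models failed or returned empty responses") from last_error


attempts: List[str] = []


def flaky(model: str, prompt: str) -> str:
    """Stand-in client: the primary model always times out."""
    attempts.append(model)
    if model == "primary-model":
        raise TimeoutError("simulated outage")
    return f"{model}: ok"


result = guarded_call(flaky, "hi", ["primary-model", "fallback-model"])
```

In this run the primary model is tried three times (one attempt plus two retries) before the fallback answers. Lowering `max_calls` below that number would trip the breaker instead.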