Agent Architecture

WASP is an event-driven, multi-layer autonomous agent. Understanding the architecture helps you configure it correctly and extend it effectively.

Request Lifecycle

Every interaction — whether from Telegram or the dashboard — follows this pipeline:

```
User message
    │
    ▼
Redis Stream (events:incoming)
    │
    ▼
EventHandler.handle()
    ├─ Memory lookup (relevant context)
    ├─ Context builder (build_context)
    │   ├─ System prompt + identity
    │   ├─ Knowledge Graph block
    │   ├─ Self-Model block
    │   ├─ Epistemic State block
    │   ├─ Temporal Insights block
    │   ├─ Procedural Memory hints
    │   └─ Behavioral Rules (corrections)
    │
    ▼
ModelManager.generate() → LLM response
    ├─ Parse skill calls (<skill>...</skill>)
    │
    ▼ (if skill calls present)
SkillExecutor.execute()
    ├─ Anticipatory simulation (RESTRICTED/PRIVILEGED skills)
    ├─ Capability check + audit log
    ├─ Skill.execute(**params)
    └─ Result appended to context
    │
    ▼ (repeat up to MAX_SKILL_ROUNDS=12)
Final response assembled
    │
    ▼
Redis Stream (events:outgoing) → Telegram/Dashboard
```

Core Components

EventHandler (src/events/handlers.py)

The central dispatcher. It:

  • Reads messages from the events:incoming Redis stream
  • Builds the full LLM context via build_context()
  • Runs the skill execution loop (up to 12 rounds in Sovereign Mode)
  • Fires post-processing tasks (KG extraction, temporal update, epistemic update, procedure abstraction)
  • Writes the response to events:outgoing
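The dispatcher's read-handle-write cycle can be sketched as follows. This is a minimal illustration, not WASP's actual implementation: plain asyncio queues stand in for the events:incoming and events:outgoing Redis streams, and handle_message() is a hypothetical placeholder for the context build plus skill loop.

```python
import asyncio

async def handle_message(text: str) -> str:
    # Placeholder for build_context() + LLM generation + skill execution.
    return f"echo: {text}"

async def event_handler(incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
    while True:
        event = await incoming.get()   # read from events:incoming
        if event is None:              # sentinel: shut down cleanly
            break
        response = await handle_message(event)
        await outgoing.put(response)   # write to events:outgoing

async def main() -> list[str]:
    incoming, outgoing = asyncio.Queue(), asyncio.Queue()
    await incoming.put("hola")
    await incoming.put(None)
    await event_handler(incoming, outgoing)
    return [outgoing.get_nowait()]
```

In the real system the loop blocks on a Redis stream read rather than an in-process queue, so multiple worker processes can consume the same stream.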

ModelManager (src/models/manager.py)

Manages all AI providers and model selection:

  • Supports 12 providers: OpenAI, Anthropic, Google, xAI, Mistral, DeepSeek, OpenRouter, Perplexity, HuggingFace, Moonshot, Ollama, LM Studio
  • Auto-detects configured providers at startup
  • Handles context overflow recovery (progressive truncation)
  • Tracks token usage for economics monitoring

SkillRegistry (src/skills/registry.py)

Maintains the catalogue of available skills:

  • Registers built-in and custom skills at startup
  • Enables/disables skills at runtime
  • Provides skill discovery for the planner
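The registry's three roles map naturally onto a name-to-entry table with an enabled flag. The sketch below is illustrative only; the class and method names are assumptions, not WASP's actual API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class SkillEntry:
    func: Callable
    enabled: bool = True

class SkillRegistry:
    def __init__(self) -> None:
        self._skills: dict[str, SkillEntry] = {}

    def register(self, name: str, func: Callable) -> None:
        # Called at startup for built-in and custom skills alike.
        self._skills[name] = SkillEntry(func)

    def set_enabled(self, name: str, enabled: bool) -> None:
        # Runtime enable/disable without re-registering.
        self._skills[name].enabled = enabled

    def discover(self) -> list[str]:
        # What the planner sees: only currently-enabled skills.
        return [name for name, entry in self._skills.items() if entry.enabled]

registry = SkillRegistry()
registry.register("web_search", lambda q: f"results for {q}")
registry.register("shell", lambda cmd: "...")
registry.set_enabled("shell", False)
```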

SkillExecutor (src/skills/executor.py)

Executes skill calls from LLM responses:

  • Parses <skill>name(param="value")</skill> syntax
  • Checks capability level and policy
  • Runs anticipatory simulation for RESTRICTED/PRIVILEGED skills
  • Writes to audit log
  • Handles parallel execution groups
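A regex is enough to sketch the call syntax. The real parser in src/skills/executor.py may accept a richer grammar; this hedged version handles only the simple quoted-keyword form shown above.

```python
import re

# Matches <skill>name(param="value", ...)</skill> blocks in an LLM response.
SKILL_RE = re.compile(r"<skill>(\w+)\((.*?)\)</skill>", re.DOTALL)
PARAM_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')

def parse_skill_calls(response: str) -> list[tuple[str, dict[str, str]]]:
    calls = []
    for name, arg_str in SKILL_RE.findall(response):
        params = dict(PARAM_RE.findall(arg_str))
        calls.append((name, params))
    return calls
```

Each parsed (name, params) pair would then go through the capability check and, for RESTRICTED/PRIVILEGED skills, the anticipatory simulation before execution.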

MemoryManager (src/memory/manager.py)

Manages 8 memory systems:

  1. Episodic memory (conversation history)
  2. Semantic memory (facts, preferences)
  3. Working memory (session context)
  4. Procedural memory (learned procedures)
  5. Visual memory (screenshot index)
  6. Vector memory (semantic search, optional)
  7. Knowledge Graph (entity relationships)
  8. Self-Model (agent self-knowledge)

The Skill Loop

The skill loop is the heart of WASP's execution model:

```python
for round_num in range(MAX_SKILL_ROUNDS):  # 12 in Sovereign Mode
    response = await model_manager.generate(context)

    skill_calls = parse_skill_calls(response)
    if not skill_calls:
        break  # No more skills needed — final answer ready

    results = await skill_executor.execute_batch(skill_calls)
    context.append(results)  # Feed results back to LLM

    # Check for loop termination conditions
    if _ALREADY_GONE_RE.search(response):
        break  # Response indicates task complete
```

Each round, the LLM can call multiple skills. Results are appended to context, and the LLM continues until it produces a final answer without skill calls.

Sovereign Mode

When SOVEREIGN_MODE=true (default):

  • MAX_SKILL_ROUNDS raised from 8 to 12
  • All autonomy limits relaxed
  • ⚡ SOVEREIGN MODE ACTIVE block injected into system prompt
  • goal_budget_max_replans effectively doubled

Identity and Persona

WASP's identity is defined in src/agent/context.py:

  • Name: Agent Wasp
  • Telegram handle: @lundclawbot
  • Default language: Spanish (matches user language dynamically)
  • Persona: proactive, action-first, concise

The operator can override identity and behavior via /data/config/prime.md — a high-priority operator override injected at the top of every system prompt.
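The injection order can be sketched as below: if the override file exists, its contents are prepended to the base system prompt. The function name and prompt layout are illustrative assumptions; only the /data/config/prime.md path comes from the docs.

```python
from pathlib import Path

def build_system_prompt(base_prompt: str,
                        prime_path: str = "/data/config/prime.md") -> str:
    # Operator override wins: prime.md content goes above everything else.
    path = Path(prime_path)
    if path.exists():
        return path.read_text().strip() + "\n\n" + base_prompt
    return base_prompt
```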

Post-Message Processing

After every message, these tasks run asynchronously (fire-and-forget):

  1. Knowledge Graph extraction — entities and relationships extracted from conversation
  2. Temporal extraction — prices, events, state changes recorded in world timeline
  3. Epistemic update — domain confidence updated based on skill success/failure
  4. Procedure abstraction — multi-step solutions abstracted into reusable procedures
  5. Self-model update — agent self-knowledge updated with skill statistics
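The fire-and-forget pattern can be sketched with asyncio.create_task: the five tasks are scheduled but never awaited on the reply path, so the user's response is not blocked on them. Task names follow the list above; the bodies are placeholders, not WASP's real implementations.

```python
import asyncio

completed: list[str] = []

async def run_task(name: str) -> None:
    await asyncio.sleep(0)  # placeholder for the real extraction/update work
    completed.append(name)

async def post_process(message: str) -> None:
    names = ["kg_extraction", "temporal_extraction", "epistemic_update",
             "procedure_abstraction", "self_model_update"]
    for name in names:
        # Scheduled, not awaited: the reply path returns immediately.
        asyncio.create_task(run_task(name))

async def main() -> None:
    await post_process("hola")
    await asyncio.sleep(0.01)  # let the background tasks finish in this demo

asyncio.run(main())
```

In production code a reference to each task is usually kept (or an error callback attached) so failures in background tasks are not silently dropped.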

Error Handling

  • Context overflow: Automatically detected and progressively truncated (8→4→2→1 exchanges)
  • Skill failures: Results include error details; LLM can retry or use alternative approach
  • Model unavailability: ModelManager falls back to alternative providers
  • Circuit breakers: Integration connectors have circuit breakers (5 failures → 60s cooldown)