Agent Architecture
WASP is an event-driven, multi-layer autonomous agent. Understanding the architecture helps you configure it correctly and extend it effectively.
Request Lifecycle
Every interaction — whether from Telegram or the dashboard — follows this pipeline:
```
User message
    │
    ▼
Redis Stream (events:incoming)
    │
    ▼
EventHandler.handle()
    │
    ├─ Memory lookup (relevant context)
    ├─ Context builder (build_context)
    │    ├─ System prompt + identity
    │    ├─ Knowledge Graph block
    │    ├─ Self-Model block
    │    ├─ Epistemic State block
    │    ├─ Temporal Insights block
    │    ├─ Procedural Memory hints
    │    └─ Behavioral Rules (corrections)
    │
    ▼
ModelManager.generate() → LLM response
    │
    ├─ Parse skill calls (<skill>...</skill>)
    │
    ▼ (if skill calls present)
SkillExecutor.execute()
    │
    ├─ Anticipatory simulation (RESTRICTED/PRIVILEGED skills)
    ├─ Capability check + audit log
    ├─ Skill.execute(**params)
    └─ Result appended to context
    │
    ▼ (repeat up to MAX_SKILL_ROUNDS=12)
Final response assembled
    │
    ▼
Redis Stream (events:outgoing) → Telegram/Dashboard
```
Core Components
EventHandler (src/events/handlers.py)
The central dispatcher. It:
- Reads messages from the `events:incoming` Redis stream
- Builds the full LLM context via `build_context()`
- Runs the skill execution loop (up to 12 rounds in Sovereign Mode)
- Fires post-processing tasks (KG extraction, temporal update, epistemic update, procedure abstraction)
- Writes the response to `events:outgoing`
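The dispatch flow above can be sketched with an `asyncio.Queue` standing in for the two Redis streams. Everything here is illustrative: the function names and the string stand-ins for `build_context()` and the skill loop are assumptions, not WASP's actual API.

```python
import asyncio

async def handle_events(incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
    """Hypothetical EventHandler loop: read, build a reply, write back."""
    while True:
        message = await incoming.get()
        if message is None:  # sentinel: shut down the loop
            break
        # Stands in for build_context() + the skill execution loop.
        response = f"reply to: {message}"
        await outgoing.put(response)

async def demo() -> list[str]:
    incoming, outgoing = asyncio.Queue(), asyncio.Queue()
    await incoming.put("hola")
    await incoming.put(None)  # stop after one message
    await handle_events(incoming, outgoing)
    return [outgoing.get_nowait()]
```

In the real system the reads would be blocking `XREAD` calls against `events:incoming`, but the shape of the loop is the same.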
ModelManager (src/models/manager.py)
Manages all AI providers and model selection:
- Supports 12 providers: OpenAI, Anthropic, Google, xAI, Mistral, DeepSeek, OpenRouter, Perplexity, HuggingFace, Moonshot, Ollama, LM Studio
- Auto-detects configured providers at startup
- Handles context overflow recovery (progressive truncation)
- Tracks token usage for economics monitoring
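The progressive truncation mentioned above can be sketched as a pure function. This is a hedged illustration, assuming recovery keeps the most recent exchanges and halves the window until the context fits; `truncate_context` and its `fits` predicate are hypothetical names.

```python
from typing import Callable

def truncate_context(
    exchanges: list[str], fits: Callable[[list[str]], bool]
) -> list[str]:
    """Keep the most recent N exchanges, trying N = 8, 4, 2, 1 in turn."""
    for keep in (8, 4, 2, 1):
        candidate = exchanges[-keep:]
        if fits(candidate):  # e.g. token count under the model's limit
            return candidate
    raise RuntimeError("context cannot be reduced enough to fit")
```

In practice `fits` would be a token-count check against the active model's context window.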
SkillRegistry (src/skills/registry.py)
Maintains the catalogue of available skills:
- Registers built-in and custom skills at startup
- Enables/disables skills at runtime
- Provides skill discovery for the planner
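A minimal registry with runtime enable/disable might look like the sketch below. The class and method names are illustrative, chosen to mirror the responsibilities listed above rather than WASP's actual interface.

```python
class SkillRegistry:
    """Hypothetical skill catalogue with runtime enable/disable."""

    def __init__(self) -> None:
        self._skills: dict[str, object] = {}
        self._disabled: set[str] = set()

    def register(self, name: str, skill: object) -> None:
        self._skills[name] = skill

    def disable(self, name: str) -> None:
        self._disabled.add(name)

    def enable(self, name: str) -> None:
        self._disabled.discard(name)

    def available(self) -> list[str]:
        # Skill discovery for the planner: enabled skills only.
        return [n for n in self._skills if n not in self._disabled]
```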
SkillExecutor (src/skills/executor.py)
Executes skill calls from LLM responses:
- Parses `<skill>name(param="value")</skill>` syntax
- Checks capability level and policy
- Runs anticipatory simulation for RESTRICTED/PRIVILEGED skills
- Writes to audit log
- Handles parallel execution groups
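Parsing the `<skill>name(param="value")</skill>` syntax can be sketched with two regular expressions. The real parser presumably handles more edge cases (escaped quotes, multi-line arguments, parallel groups); this only shows the basic shape, and the function name is an assumption.

```python
import re

SKILL_CALL_RE = re.compile(
    r"<skill>\s*(?P<name>\w+)\((?P<args>.*?)\)\s*</skill>", re.DOTALL
)
ARG_RE = re.compile(r'(?P<key>\w+)\s*=\s*"(?P<value>[^"]*)"')

def parse_skill_calls(response: str) -> list[tuple[str, dict[str, str]]]:
    """Extract (name, params) pairs from skill tags in an LLM response."""
    calls = []
    for match in SKILL_CALL_RE.finditer(response):
        params = {
            arg.group("key"): arg.group("value")
            for arg in ARG_RE.finditer(match.group("args"))
        }
        calls.append((match.group("name"), params))
    return calls
```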
MemoryManager (src/memory/manager.py)
Manages 8 memory systems:
- Episodic memory (conversation history)
- Semantic memory (facts, preferences)
- Working memory (session context)
- Procedural memory (learned procedures)
- Visual memory (screenshot index)
- Vector memory (semantic search, optional)
- Knowledge Graph (entity relationships)
- Self-Model (agent self-knowledge)
The Skill Loop
The skill loop is the heart of WASP's execution model:
```python
for round_num in range(MAX_SKILL_ROUNDS):  # 12 in Sovereign Mode
    response = await model_manager.generate(context)
    skill_calls = parse_skill_calls(response)
    if not skill_calls:
        break  # No more skills needed — final answer ready
    results = await skill_executor.execute_batch(skill_calls)
    context.append(results)  # Feed results back to LLM
    # Check for loop termination conditions
    if _ALREADY_GONE_RE.search(response):
        break  # Response indicates task complete
```
Each round, the LLM can call multiple skills. Results are appended to context, and the LLM continues until it produces a final answer without skill calls.
Sovereign Mode
When `SOVEREIGN_MODE=true` (the default):

- `MAX_SKILL_ROUNDS` raised from 8 to 12
- All autonomy limits relaxed
- `⚡ SOVEREIGN MODE ACTIVE` block injected into the system prompt
- `goal_budget_max_replans` effectively doubled
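The round-limit switch can be sketched as a small helper reading the environment. `SOVEREIGN_MODE` and the 8/12 values come from the text; the helper itself is hypothetical.

```python
import os

def max_skill_rounds(env=os.environ) -> int:
    """Return the skill-loop round limit: 12 in Sovereign Mode, else 8."""
    sovereign = env.get("SOVEREIGN_MODE", "true").lower() == "true"
    return 12 if sovereign else 8
```

Passing the environment mapping in explicitly keeps the helper testable without mutating `os.environ`.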
Identity and Persona
WASP's identity is defined in `src/agent/context.py`:
- Name: Agent Wasp
- Telegram handle: `@lundclawbot`
- Default language: Spanish (matches the user's language dynamically)
- Persona: proactive, action-first, concise
The operator can override identity and behavior via `/data/config/prime.md`, a high-priority operator override injected at the top of every system prompt.
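Injecting the operator override could look like the sketch below. The `/data/config/prime.md` path is from the text; the function name and prompt layout are assumptions made for illustration.

```python
from pathlib import Path

def build_system_prompt(base_prompt: str,
                        prime_path: str = "/data/config/prime.md") -> str:
    """Prepend the operator override, if present, to the system prompt."""
    prime = Path(prime_path)
    if prime.exists():
        # Operator directives take priority, so they go first.
        return prime.read_text().strip() + "\n\n" + base_prompt
    return base_prompt
```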
Post-Message Processing
After every message, these tasks run asynchronously (fire-and-forget):
- Knowledge Graph extraction — entities and relationships extracted from conversation
- Temporal extraction — prices, events, state changes recorded in world timeline
- Epistemic update — domain confidence updated based on skill success/failure
- Procedure abstraction — multi-step solutions abstracted into reusable procedures
- Self-model update — agent self-knowledge updated with skill statistics
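The fire-and-forget scheduling above can be sketched with `asyncio.create_task`. The helper is illustrative, not WASP's actual code; the only subtlety worth showing is that a strong reference must be held, or the task can be garbage-collected mid-flight.

```python
import asyncio

_background: set[asyncio.Task] = set()

def fire_and_forget(coro) -> asyncio.Task:
    """Launch a post-processing task without blocking the response path."""
    task = asyncio.create_task(coro)
    _background.add(task)                        # strong ref until completion
    task.add_done_callback(_background.discard)  # then drop it
    return task
```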
Error Handling
- Context overflow: Automatically detected and progressively truncated (8→4→2→1 exchanges)
- Skill failures: Results include error details; LLM can retry or use alternative approach
- Model unavailability: ModelManager falls back to alternative providers
- Circuit breakers: Integration connectors have circuit breakers (5 failures → 60s cooldown)
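The connector circuit breaker described above (5 failures, 60 s cooldown) can be sketched as follows. The class is hypothetical; the clock is injectable so the cooldown can be tested without waiting.

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; retry after `cooldown`."""

    def __init__(self, threshold: int = 5, cooldown: float = 60.0,
                 clock=time.monotonic):
        self.threshold, self.cooldown, self.clock = threshold, cooldown, clock
        self.failures = 0
        self.opened_at: float | None = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            self.opened_at, self.failures = None, 0  # cooldown over: reset
            return True
        return False  # circuit open: reject the call

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()

    def record_success(self) -> None:
        self.failures = 0
```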