# WASP Documentation
Current version: v2.6 — see Changelog for full history.
WASP is a self-hosted, single-operator autonomous AI agent. It runs as a Docker Compose stack on a VPS, accepts natural-language instructions through Telegram or a web dashboard, and executes them using a built-in skill library, a goal/plan engine, and a long-lived memory system.
It is designed for one operator. It is not a multi-tenant SaaS platform.
## What WASP is
- An event-driven Python service that consumes user messages from Redis Streams, plans and executes work, and writes results to PostgreSQL plus a memory tree.
- A skill executor with 35+ built-in skills (browser, shell, Python, email, scheduler, sub-agents, self-modification) and a custom-skill loader.
- A goal orchestrator that turns objectives into validated TaskGraphs, runs them step by step, and replans on failure.
- A scheduler with 17 registered background jobs and additional opt-in jobs.
- A policy layer that gates side effects, validates response grounding, and enforces schedule honesty before any output reaches the user.
- A web dashboard with 35 routes covering chat, traces, memory, goals, agents, integrations, models, audit, and configuration.
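The event-driven core described above can be sketched as a consumer-group loop. This is a minimal illustration, not WASP's actual code: the stream name (`wasp:inbox`), group name, and field names (`chat_id`, `text`, `meta`) are all assumptions.

```python
import json


def parse_stream_entry(fields: dict) -> dict:
    """Decode one Redis Stream entry into a task request.

    Field names ("chat_id", "text", "meta") are illustrative; WASP's
    actual schema may differ.
    """
    return {
        "chat_id": fields["chat_id"],
        "text": fields["text"],
        "meta": json.loads(fields.get("meta", "{}")),
    }


def consume_forever(stream: str = "wasp:inbox", group: str = "agent") -> None:
    """Blocking consumer-group loop (requires a running Redis server)."""
    import redis  # local import so the pure helper above stays importable

    r = redis.Redis(decode_responses=True)
    while True:
        # Read one new entry for this consumer group, blocking up to 5 s.
        for _stream, entries in r.xreadgroup(group, "worker-1", {stream: ">"},
                                             count=1, block=5000):
            for entry_id, fields in entries:
                task = parse_stream_entry(fields)
                # ... plan, execute, write results to PostgreSQL ...
                r.xack(stream, group, entry_id)
```

Acknowledging each entry only after the work completes (the `xack` at the end) is what lets a crashed worker's messages be re-delivered to the group.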
## What WASP is not
- It is not a chatbot framework. It is an operator-controlled agent that performs side effects when explicitly asked.
- It is not a managed cloud service. You run it on your own VPS.
- It is not multi-tenant out of the box.
- It is not a guarantee of correctness. The policy layer reduces hallucinations and unsafe actions, but the underlying LLM is probabilistic. See Known Limitations.
## Architecture summary
Six core services run as Docker containers; a seventh (agent-ollama) is optional for fully local LLM operation.
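A compose file for such a stack might look like the fragment below. This is an illustrative sketch only: every service name except `agent-ollama` is a placeholder assumption, and the real topology lives in the repository's `docker-compose.yml`.

```yaml
# Illustrative fragment — service names other than agent-ollama are
# placeholders, not WASP's actual six-service topology.
services:
  agent-core:            # hypothetical: the Python agent service
    build: .
    depends_on: [postgres, redis]
  postgres:
    image: postgres:16
  redis:
    image: redis:7
  agent-ollama:          # optional: fully local LLM backend
    image: ollama/ollama
    profiles: [local-llm]
```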
## Key Systems
| System | Description |
|---|---|
| Goal Engine | Decomposes objectives into TaskGraphs (≤ 8 steps), executes with Plan Critic validation, replans on failure |
| Skills | 35 built-in skills across 5 capability levels; custom Python skills supported |
| Memory | 10 persistent memory layers: episodic, semantic/vector, knowledge graph, procedural, behavioral, learning examples, visual, goal-scoped, temporal world model |
| Scheduler | 17 registered jobs (health, reflection, reminders, monitors, custom tasks, dream, autonomous goals, audit retention, weekly DB maintenance) plus opt-in jobs |
| Policy Layer | Intent Gate, Action Announcer, Response Guard, Response Validator, Decision Trace — deterministic post-LLM guards |
| Resource Governor | Per-user rate limiting: goal slots, LLM budget, API call caps |
| Decision Layer | Pre-LLM heuristic classifier with 13 fast-paths, routes requests to 5 strategies before the LLM is invoked |
| Active Flow Lock | Per-chat Redis state (TTL 15 min) anchors follow-up messages to the correct domain |
| Multi-Agent | Spawn sub-agents with their own goal queues; Meta-Agent Supervisor decomposes into teams |
| Integrations | 44 connectors: Slack, Discord, GitHub, Notion, Telegram, Gmail, smart home, exchange APIs |
| Self-modification | self_improve skill with syntax validation, timestamped backups, soft safety gate, persistent patches |
| Panic Reset | Single-operation hard reset wipes all 17 cognitive tables + Redis state + runs VACUUM FULL |
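The Goal Engine's step cap can be illustrated with a small validation sketch. Only the ≤ 8-step limit comes from the table above; the class shape, field names, and error strings are assumptions for illustration.

```python
from dataclasses import dataclass, field

MAX_STEPS = 8  # matches the Goal Engine's TaskGraph cap


@dataclass
class TaskGraph:
    """Toy stand-in for a planned goal; not WASP's actual data model."""
    objective: str
    steps: list = field(default_factory=list)  # each step: {"skill": ..., "args": ...}

    def validate(self) -> list:
        """Return a list of validation errors (empty means the graph is runnable)."""
        errors = []
        if not self.steps:
            errors.append("graph has no steps")
        if len(self.steps) > MAX_STEPS:
            errors.append(f"graph exceeds {MAX_STEPS}-step cap")
        for i, step in enumerate(self.steps):
            if "skill" not in step:
                errors.append(f"step {i} names no skill")
        return errors
```

A Plan-Critic-style check of this kind runs before execution, so malformed or oversized graphs are rejected rather than partially executed.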
## Safety model
Five enforcement points run after the LLM produces a candidate response. Each check is deterministic and none calls the LLM to reach its verdict:
- Intent Gate — blocks side-effect skills without explicit user intent.
- Action Announcer — strips claims of actions the agent did not actually execute.
- Response Guard — schedule honesty (bidirectional), factual grounding (entity-proximity), markdown sanitizer.
- Response Validator — deterministic grounding/incomplete/drift check; triggers one corrective LLM round if it fails.
- Decision Trace — every response emits a forensic record (visible at /traces).
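A pipeline of deterministic guards like the one above can be sketched as pass-or-reject functions applied in order. This is a toy illustration: the guard signatures, context keys, and rewrite logic are assumptions, not WASP's implementation.

```python
from typing import Callable, Optional

# A guard inspects a candidate response and either passes it through
# (possibly rewritten) or rejects it by returning None.
Guard = Callable[[str, dict], Optional[str]]


def intent_gate(text: str, ctx: dict) -> Optional[str]:
    """Reject side-effect responses unless the user explicitly asked for one."""
    if ctx.get("side_effect") and not ctx.get("explicit_intent"):
        return None
    return text


def action_announcer(text: str, ctx: dict) -> Optional[str]:
    """Strip claims about actions that were never executed (toy version)."""
    for claim in ctx.get("unexecuted_claims", []):
        text = text.replace(claim, "")
    return text.strip()


def run_guards(text: str, ctx: dict, guards: list) -> Optional[str]:
    """Apply each guard in order; a None from any guard blocks the response."""
    for guard in guards:
        result = guard(text, ctx)
        if result is None:
            return None
        text = result
    return text
```

Because every guard is a pure function of the response and its context, the chain is fully reproducible, which is what makes build-time regression testing of the policy layer possible.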
Regression tests (`tests/test_policy_regressions.py`) run at Docker image build time and block the build on failure.
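A build-blocking policy regression might look like the pytest-style sketch below. Only the file path comes from the source; the test names and the `schedule_is_honest` helper are hypothetical stand-ins for the real checks.

```python
# Hypothetical shape of a deterministic policy regression; the real suite
# lives in tests/test_policy_regressions.py and runs during the image build.


def schedule_is_honest(response: str, scheduled: bool) -> bool:
    """Toy bidirectional check: the response claims a schedule iff one was created."""
    claims = "scheduled" in response.lower()
    return claims == scheduled


def test_no_phantom_schedule():
    # The agent must not claim a reminder it never created.
    assert not schedule_is_honest("Done — I scheduled your reminder.", False)


def test_real_schedule_announced():
    # A created schedule must be announced to the user.
    assert schedule_is_honest("Scheduled for 9am.", True)
```

Because the assertions are deterministic (no LLM in the loop), a failure reliably reproduces and can safely abort the Docker build.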
## Quick start
```shell
git clone <repo> /home/agent
cd /home/agent
bash scripts/init.sh
nano .env            # set TELEGRAM_BOT_TOKEN, TELEGRAM_ALLOWED_USERS, model API key
docker compose up -d --build
docker compose ps    # all services healthy
curl http://localhost:8080/health
```
Then send a message via Telegram or open the dashboard at `http://<host>:8080`.
See Installation for full prerequisites and walkthrough.
## Reading order
| If you want to... | Start with |
|---|---|
| Install WASP | Installation |
| Use WASP day to day | Operator Commands |
| Understand the architecture | Agent Architecture |
| Understand the safety model | Skill Safety |
| Understand memory and learning | Memory |
| Understand the scheduler | Scheduler |
| Add a new skill | Creating Skills |
| Diagnose a problem | Common Errors |
| Run audits | Testing and Audit |
| Know the limits | Known Limitations |
## Current readiness
Ready for single-operator production use, verified by:
- A regression suite (50+ deterministic policy assertions) that runs in CI at image build time.
- Multiple internal forensic audits covering mandatory tests, edge validations, adversarial prompts, state consistency, cross-layer integrity, observability, and stress.
- A Panic Reset workflow that wipes all cognitive state with operator confirmation.
WASP is not certified for multi-tenant deployment, regulated industries, or unattended operation without monitoring. See Known Limitations.