Capability Evolution Engine
The Capability Evolution Engine (CEE) extends WASP's skill set automatically. When the agent repeatedly fails a task, or when the Self-Reflection Engine identifies a missing ability, the CEE detects the gap, generates candidate skill code via an LLM, validates it in a sandbox, and registers the new skill, all without human involvement.
Flow
Goal reaches FAILED state
│
▼
GoalOrchestrator (fire-and-forget, non-blocking)
│
▼
CapabilityEvolutionEngine.analyze_gap()
├── Rate-limit check (Redis counters)
├── Compute gap_score (3 signals → threshold)
├── LLM: generate candidate skill code
├── Validate: AST + security blocklist + structural checks
├── Register: write file → dynamic load → SkillRegistry
├── Store: MemoryType.META (for future context injection)
└── Log: /data/logs/capability_evolution.log
The engine also runs as a scheduled job every 3600 seconds to catch gaps from accumulated failures.
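The fire-and-forget hand-off in the flow above can be sketched with `asyncio`. This is a minimal illustration, not the orchestrator's actual code: `on_goal_failed`, `_safe_analyze`, and the single `goal` argument passed to `analyze_gap` are assumptions made for the example.

```python
import asyncio
import logging

logger = logging.getLogger("cee")

# Keep strong references so pending tasks are not garbage-collected.
_background_tasks = set()


async def _safe_analyze(engine, goal) -> None:
    """Run gap analysis and swallow every error so the main loop never crashes."""
    try:
        await engine.analyze_gap(goal)  # assumed signature: one goal argument
    except Exception:
        logger.exception("capability evolution failed")


def on_goal_failed(engine, goal) -> asyncio.Task:
    """Schedule gap analysis without awaiting it (must be called inside a running loop)."""
    task = asyncio.create_task(_safe_analyze(engine, goal))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task
```

The caller returns immediately after scheduling the task, so a slow LLM call or a validation failure inside the engine does not delay goal processing.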
Gap Score
Three independent signals are combined:
| Signal | Weight | Source |
|---|---|---|
| Repeated failures | 0.6 | goal.stability.consecutive_failures |
| Error keywords | 0.2 | Error text patterns (ImportError, "not found", etc.) |
| Reflection gaps | 0.2 | ReflectionEngine keywords ("no tool", "missing skill", etc.) |
gap_score = 0.6 × failures_signal + 0.2 × error_signal + 0.2 × reflection_signal
Evolution is attempted only when gap_score > 0.50, which keeps isolated failures from triggering it.
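As a worked illustration of the formula, the sketch below computes the weighted sum and applies the threshold. It assumes each signal is already normalized to the range [0, 1]; how the engine derives those values (for example, how consecutive failures map to `failures_signal`) is not specified here.

```python
# Weights and threshold taken from the table and formula above.
WEIGHTS = {"failures": 0.6, "error": 0.2, "reflection": 0.2}
GAP_THRESHOLD = 0.50


def gap_score(failures_signal: float, error_signal: float, reflection_signal: float) -> float:
    """Weighted sum of three signals, each assumed to lie in [0, 1]."""
    return (
        WEIGHTS["failures"] * failures_signal
        + WEIGHTS["error"] * error_signal
        + WEIGHTS["reflection"] * reflection_signal
    )


def should_evolve(score: float) -> bool:
    """Evolution is attempted only when the score is strictly above the threshold."""
    return score > GAP_THRESHOLD
```

Under that normalization assumption, the error and reflection signals together contribute at most 0.4, so the threshold cannot be crossed without some contribution from repeated failures.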
Generated Skill Structure
All generated skills follow the SkillBase interface exactly:
from src.skills.base import SkillBase
from src.skills.types import SkillDefinition, SkillResult


class GeneratedSkill(SkillBase):
    name = "my_capability"

    def definition(self) -> SkillDefinition:
        return SkillDefinition(
            name=self.name,
            description="Auto-generated: what it does",
            params=[],
            category="generated",
            capability_level="safe",  # Always SAFE (least privilege)
            timeout_seconds=10.0,
        )

    async def execute(self, **kwargs) -> SkillResult:
        try:
            result = "computed output"
            return SkillResult(skill_name=self.name, success=True, output=result)
        except Exception as e:
            return SkillResult(skill_name=self.name, success=False, output="", error=str(e))
Generated skills are written to /data/skills/generated/{capability_name}/skill.py and loaded dynamically via importlib.util.spec_from_file_location.
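Loading a module from an explicit file path with `importlib.util.spec_from_file_location` typically looks like the sketch below. The helper name `load_generated_skill` and the `module_name` argument are hypothetical; the engine's own loader may differ.

```python
import importlib.util
from pathlib import Path


def load_generated_skill(skill_path: Path, module_name: str):
    """Import a skill file by path and return its GeneratedSkill class."""
    spec = importlib.util.spec_from_file_location(module_name, str(skill_path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {skill_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the file's top-level code
    return module.GeneratedSkill     # AttributeError if the class is missing
```

Because `exec_module` runs the file's top-level statements, loading should happen only after the code has passed validation.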
Sandbox Validation
Before any skill is registered, it passes four validation gates:
| Check | Description |
|---|---|
| AST parse | ast.parse(code) — syntax must be valid Python |
| Security blocklist | No os.system, subprocess, eval, exec, open, socket, requests, etc. |
| Structural check | Must contain class GeneratedSkill, def definition, async def execute, SkillResult, SkillDefinition |
| Name check | Skill name must appear as a string literal in the code |
If any check fails, the code is discarded and the evolution event is logged as rejected.
Safety Constraints
| Constraint | Value |
|---|---|
| Max evolutions per day | 5 |
| Max evolutions per hour | 2 |
| Generated capability level | Always SAFE |
| Overwrites existing skills | Never |
| Blocks on Redis failure | Never (fail-open) |
| Crashes main loop | Never (all errors are caught) |
Rate Limiting
Redis keys with TTL-based expiry:
cee:daily:{unix_day} → daily evolution count (TTL 86400s)
cee:hourly:{unix_hour} → hourly evolution count (TTL 3600s)
Both counters degrade to "allow" if Redis is unavailable.
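A fail-open check against these keys might look like the sketch below. It assumes `unix_day` and `unix_hour` are the epoch time divided by 86400 and 3600, and that `client` exposes `get`, `incr`, and `expire` as a redis-py client does; the function names are hypothetical.

```python
import time
from typing import Optional, Tuple

MAX_PER_DAY = 5
MAX_PER_HOUR = 2


def rate_limit_keys(now: float) -> Tuple[str, str]:
    """Build the daily and hourly keys (assumes epoch-based bucket numbers)."""
    return f"cee:daily:{int(now // 86400)}", f"cee:hourly:{int(now // 3600)}"


def evolution_allowed(client, now: Optional[float] = None) -> bool:
    """Return True if both counters are under their limits, or if Redis fails."""
    now = time.time() if now is None else now
    daily_key, hourly_key = rate_limit_keys(now)
    try:
        daily = int(client.get(daily_key) or 0)
        hourly = int(client.get(hourly_key) or 0)
    except Exception:
        return True  # fail-open: a Redis outage must not block evolution
    return daily < MAX_PER_DAY and hourly < MAX_PER_HOUR


def record_evolution(client, now: Optional[float] = None) -> None:
    """Increment both counters and set their TTLs; errors are ignored."""
    now = time.time() if now is None else now
    try:
        for key, ttl in zip(rate_limit_keys(now), (86400, 3600)):
            client.incr(key)
            client.expire(key, ttl)
    except Exception:
        pass  # fail-open on the write path as well
```

Passing the client in as a parameter keeps the logic testable with an in-memory stand-in.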
Memory Integration
After a successful evolution, the capability is stored in MemoryType.META:
{
  "capability_name": "my_capability",
  "gap_score": 0.72,
  "task_context": "original objective that triggered evolution...",
  "timestamp": "2026-03-11T20:00:00+00:00",
  "source": "capability_evolution_engine"
}
Tags: ["generated_skill", "capability_evolution", "{capability_name}"]
This allows the agent to know about its own evolved capabilities in future context.
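For illustration, the record and its tags could be assembled as follows. `build_meta_record` is a hypothetical helper; the call that actually persists the record through MemoryManager is omitted because its signature is not shown here.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple


def build_meta_record(
    capability_name: str,
    gap_score: float,
    task_context: str,
    now: Optional[datetime] = None,
) -> Tuple[dict, List[str]]:
    """Return (payload, tags) in the shape shown above."""
    now = now or datetime.now(timezone.utc)
    payload = {
        "capability_name": capability_name,
        "gap_score": gap_score,
        "task_context": task_context,
        "timestamp": now.isoformat(timespec="seconds"),
        "source": "capability_evolution_engine",
    }
    tags = ["generated_skill", "capability_evolution", capability_name]
    return payload, tags
```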
Audit Log
Every evolution event (successful or not) is appended to:
/data/logs/capability_evolution.log
Each line is a single JSON object (pretty-printed here for readability):
{
"ts": "2026-03-11T20:00:00+00:00",
"capability_name": "parse_json_schema",
"trigger_reason": "gap_score=0.72,goal=abc12345",
"phase": "complete",
"result": "success"
}
Possible result values: success, skipped, rejected, failed.
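Since the log is one JSON object per line, it can be written and summarized with the standard library alone. The sketch below is an assumption about how such a log could be appended to and tallied, using the field names from the example above; `append_event` and `count_results` are hypothetical helpers.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def append_event(log_path: Path, capability_name: str, trigger_reason: str,
                 phase: str, result: str) -> None:
    """Append one evolution event as a single JSON line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "capability_name": capability_name,
        "trigger_reason": trigger_reason,
        "phase": phase,
        "result": result,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def count_results(log_path: Path) -> Counter:
    """Tally events by their result field, skipping blank lines."""
    counts = Counter()
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                counts[json.loads(line)["result"]] += 1
    return counts
```

Calling `count_results(Path("/data/logs/capability_evolution.log"))` gives a quick view of how many attempts were rejected versus registered.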
Integration Points
| Component | How |
|---|---|
| GoalOrchestrator | capability_evolution_engine attribute (late-wired); fire-and-forget task on GoalState.FAILED |
| ReflectionEngine | engine.reflection_engine attribute; reads reflections for gap signals |
| SkillRegistry | skill_registry.register(instance) after successful validation |
| MemoryManager | store_memory(MemoryType.META) after successful registration |
| Scheduler | CapabilityEvolutionJob runs every 3600s |
Source
src/capability_evolution_engine.py — CapabilityEvolutionEngine class
src/scheduler/capability_evolution.py — CapabilityEvolutionJob
Inspecting Generated Skills
# List all generated skills
ls /data/skills/generated/
# View a specific generated skill
cat /data/skills/generated/<capability_name>/skill.py
# View evolution log
tail -f /data/logs/capability_evolution.log
# Check rate-limit counters
docker exec agent-redis redis-cli KEYS "cee:*"
Disabling
The engine is initialized only when skill_registry is available. To disable it without modifying code, set skills_enabled=false in the config. Note that this also disables all skills globally.
To disable just the periodic job while keeping the fire-and-forget hook, remove the capability_evolution scheduler registration in main.py.