Capability Evolution Engine

The Capability Evolution Engine (CEE) extends WASP's skill set automatically. When the agent repeatedly fails a task or when the Self-Reflection Engine identifies a missing ability, the CEE detects the gap, generates candidate skill code via LLM, validates it in a sandbox, and registers the new skill — without any human involvement.

Flow

Goal reaches FAILED state
        │
        ▼
GoalOrchestrator (fire-and-forget, non-blocking)
        │
        ▼
CapabilityEvolutionEngine.analyze_gap()
├── Rate-limit check (Redis counters)
├── Compute gap_score (3 signals → threshold)
├── LLM: generate candidate skill code
├── Validate: AST + security blocklist + structural checks
├── Register: write file → dynamic load → SkillRegistry
├── Store: MemoryType.META (for future context injection)
└── Log: /data/logs/capability_evolution.log

The engine also runs as a scheduled job every 3600 seconds to catch gaps from accumulated failures.

Gap Score

Three independent signals are combined:

| Signal | Weight | Source |
| --- | --- | --- |
| Repeated failures | 0.6 | goal.stability.consecutive_failures |
| Error keywords | 0.2 | Error text patterns (ImportError, "not found", etc.) |
| Reflection gaps | 0.2 | ReflectionEngine keywords ("no tool", "missing skill", etc.) |

gap_score = 0.6 × failures_signal + 0.2 × error_signal + 0.2 × reflection_signal

Evolution is only attempted when gap_score > 0.50, preventing noise from isolated failures.
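
The weighted sum and threshold can be written out as a short sketch. It assumes each signal is already normalized to the range [0, 1]; how the engine derives the signals is not shown here, and the function names are illustrative.

```python
# Weights and threshold as documented above.
WEIGHTS = {"failures": 0.6, "errors": 0.2, "reflection": 0.2}
GAP_THRESHOLD = 0.50


def _clamp(x: float) -> float:
    """Keep a signal inside [0, 1] (normalization is an assumption of this sketch)."""
    return max(0.0, min(1.0, x))


def gap_score(failures_signal: float, error_signal: float, reflection_signal: float) -> float:
    """Weighted sum of the three signals."""
    return (
        WEIGHTS["failures"] * _clamp(failures_signal)
        + WEIGHTS["errors"] * _clamp(error_signal)
        + WEIGHTS["reflection"] * _clamp(reflection_signal)
    )


def should_evolve(score: float) -> bool:
    """Evolution is attempted only when the score is strictly above the threshold."""
    return score > GAP_THRESHOLD
```

Under that normalization assumption, error keywords and reflection gaps together contribute at most 0.4, so evolution cannot trigger without some contribution from the repeated-failures signal.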

Generated Skill Structure

All generated skills follow the SkillBase interface exactly:

from src.skills.base import SkillBase
from src.skills.types import SkillDefinition, SkillResult


class GeneratedSkill(SkillBase):
    name = "my_capability"

    def definition(self) -> SkillDefinition:
        return SkillDefinition(
            name=self.name,
            description="Auto-generated: what it does",
            params=[],
            category="generated",
            capability_level="safe",  # Always SAFE: least privilege
            timeout_seconds=10.0,
        )

    async def execute(self, **kwargs) -> SkillResult:
        try:
            result = "computed output"
            return SkillResult(skill_name=self.name, success=True, output=result)
        except Exception as e:
            return SkillResult(skill_name=self.name, success=False, output="", error=str(e))

Generated skills are written to /data/skills/generated/{capability_name}/skill.py and loaded dynamically via importlib.util.spec_from_file_location.
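
Loading a skill file by path with importlib might look like the sketch below. The helper name `load_generated_skill` and the module naming scheme are assumptions for illustration; only the use of `importlib.util.spec_from_file_location` comes from the description above.

```python
import importlib.util
from pathlib import Path


def load_generated_skill(skill_path: Path, class_name: str = "GeneratedSkill"):
    """Load skill.py from disk and return the named class (not an instance)."""
    # Hypothetical naming scheme: derive a module name from the capability directory.
    module_name = f"generated_skill_{skill_path.parent.name}"
    spec = importlib.util.spec_from_file_location(module_name, skill_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build import spec for {skill_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the file's top-level code
    return getattr(module, class_name)
```

Because `exec_module` runs the file's top-level code, loading should only happen after the code has passed validation.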

Sandbox Validation

Before any skill is registered, it passes four validation gates:

| Check | Description |
| --- | --- |
| AST parse | ast.parse(code); syntax must be valid Python |
| Security blocklist | No os.system, subprocess, eval, exec, open, socket, requests, etc. |
| Structural check | Must contain class GeneratedSkill, def definition, async def execute, SkillResult, SkillDefinition |
| Name check | Skill name must appear as a string literal in the code |

If any check fails, the code is discarded and the evolution event is logged as rejected.

Safety Constraints

| Constraint | Value |
| --- | --- |
| Max evolutions per day | 5 |
| Max evolutions per hour | 2 |
| Generated capability level | Always SAFE |
| Overwrites existing skills | Never |
| Blocks on Redis failure | Never (fail-open) |
| Crashes main loop | Never (all errors are caught) |

Rate Limiting

Redis keys with TTL-based expiry:

cee:daily:{unix_day}    → daily evolution count  (TTL 86400s)
cee:hourly:{unix_hour} → hourly evolution count (TTL 3600s)

Both counters degrade to "allow" if Redis is unavailable.

Memory Integration

After a successful evolution, the capability is stored in MemoryType.META:

{
    "capability_name": "my_capability",
    "gap_score": 0.72,
    "task_context": "original objective that triggered evolution...",
    "timestamp": "2026-03-11T20:00:00+00:00",
    "source": "capability_evolution_engine",
}

Tags: ["generated_skill", "capability_evolution", "{capability_name}"]

This allows the agent to know about its own evolved capabilities in future context.

Audit Log

Every evolution event (successful or not) is appended to:

/data/logs/capability_evolution.log

Each line is a single JSON object (pretty-printed here for readability):

{
    "ts": "2026-03-11T20:00:00+00:00",
    "capability_name": "parse_json_schema",
    "trigger_reason": "gap_score=0.72,goal=abc12345",
    "phase": "complete",
    "result": "success"
}

Possible result values: success, skipped, rejected, failed.
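
Since the log holds one JSON object per line, it can be summarized with a few lines of Python. The helper below is an illustrative sketch, not part of the engine; it relies only on the `result` field shown above.

```python
import json
from collections import Counter


def summarize_evolution_log(lines) -> Counter:
    """Count events per `result` value, skipping blank or malformed lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a partially written or corrupted line
        if isinstance(event, dict):
            counts[event.get("result", "unknown")] += 1
    return counts
```

Any iterable of strings works as input, so an open file handle on /data/logs/capability_evolution.log can be passed directly.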

Integration Points

| Component | How |
| --- | --- |
| GoalOrchestrator | capability_evolution_engine attribute (late-wired); fire-and-forget task on GoalState.FAILED |
| ReflectionEngine | engine.reflection_engine attribute; reads reflections for gap signals |
| SkillRegistry | skill_registry.register(instance) after successful validation |
| MemoryManager | store_memory(MemoryType.META) after successful registration |
| Scheduler | CapabilityEvolutionJob runs every 3600s |

Source

src/capability_evolution_engine.py: CapabilityEvolutionEngine class
src/scheduler/capability_evolution.py: CapabilityEvolutionJob

Inspecting Generated Skills

# List all generated skills
ls /data/skills/generated/

# View a specific generated skill
cat /data/skills/generated/<capability_name>/skill.py

# View evolution log
tail -f /data/logs/capability_evolution.log

# Check rate-limit counters
docker exec agent-redis redis-cli KEYS "cee:*"

Disabling

The engine is initialized only when skill_registry is available. To disable it without modifying code, set skills_enabled=false in config. Note that this also disables all skills globally.

To disable just the periodic job while keeping the fire-and-forget hook, remove the capability_evolution scheduler registration in main.py.