
# Behavioral Learning

The Behavioral Learning Loop enables WASP to learn from user corrections autonomously. When a user tells the agent it made a mistake, the agent analyzes the correction, extracts a behavioral rule, and applies it permanently to all future responses — without requiring manual system prompt updates.

## The Learning Loop

```
User: "You're hallucinating — I never said I wanted Python"
            │
            ▼
detect_correction() → True (pattern match)
            │
            ▼
Queue to Redis: behavioral:pending
            │
            ▼  (BehavioralLearnerJob — every 2 min)
LLM analysis → extract rule
            │
            ▼
Save to behavioral_rules table
            │
            ▼
Inject into every future system prompt
            │
            ▼
Telegram notification: "🧠 New behavioral rule learned"
```
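The steps above can be condensed into a single job tick. The sketch below is illustrative only: `run_learner_tick` and its injected helpers (`redis_pop`, `analyze_with_llm`, `save_rule`, `notify`) are hypothetical stand-ins for the real WASP internals, wired through plain callables so the flow is easy to follow.

```python
import json

def run_learner_tick(redis_pop, analyze_with_llm, save_rule, notify):
    """One BehavioralLearnerJob tick: drain pending corrections,
    turn each into a rule, persist it, and send a notification."""
    learned = []
    # Pop queued corrections until the behavioral:pending list is empty.
    while (raw := redis_pop("behavioral:pending")) is not None:
        correction = json.loads(raw)
        rule = analyze_with_llm(correction["message"], correction.get("context", ""))
        if rule is not None:  # the analysis may decide no rule is warranted
            save_rule(rule)
            notify(f"🧠 New behavioral rule learned: {rule['rule_text']}")
            learned.append(rule)
    return learned
```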

## Correction Detection

`_detect_correction()` in `handlers.py` uses pattern matching:

```python
_CORRECTION_PATTERNS = [
    r"estás alucinando",             # "you're hallucinating"
    r"por qué dices que no puedes",  # "why do you say you can't"
    r"eso está mal",                 # "that's wrong"
    r"te equivocas",                 # "you're mistaken"
    r"eso no es correcto",           # "that's not correct"
    r"you're wrong",
    r"that's not right",
    r"stop doing that",
    # ... more patterns
]
```

When a match is found, the message, together with recent conversation context, is queued to Redis.
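A minimal, self-contained sketch of this detect-and-queue step. The trimmed pattern list and the `queue_correction` helper are illustrative; the full list lives in `handlers.py`:

```python
import json
import re

# Abbreviated pattern list for illustration (see handlers.py for the full set).
_CORRECTION_PATTERNS = [
    r"estás alucinando",
    r"you're wrong",
    r"that's not right",
]

_COMPILED = [re.compile(p, re.IGNORECASE) for p in _CORRECTION_PATTERNS]

def detect_correction(message: str) -> bool:
    """Return True if the message matches any known correction pattern."""
    return any(pattern.search(message) for pattern in _COMPILED)

def queue_correction(redis_client, message: str, context: str) -> None:
    """Push the correction plus recent context onto the pending list."""
    payload = json.dumps({"message": message, "context": context})
    redis_client.rpush("behavioral:pending", payload)
```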

## Rule Types

| Type | Description | Example |
|------|-------------|---------|
| `refusal` | Agent was refusing something it shouldn't | "Never refuse to show code examples" |
| `hallucination` | Agent invented facts | "Always check current price before stating it" |
| `wrong_skill` | Agent used the wrong skill for a task | "Use python_exec not shell for data processing" |
| `missing_context` | Agent ignored available context | "Always check memory before claiming you don't know" |

## Rule Storage

Rules are stored in the `behavioral_rules` table:

| Column | Type | Description |
|--------|------|-------------|
| `id` | UUID | Rule identifier |
| `rule_type` | string | `refusal`, `hallucination`, `wrong_skill`, or `missing_context` |
| `rule_text` | string | Human-readable rule |
| `context_summary` | string | What triggered the rule |
| `active` | boolean | Whether the rule is applied |
| `created_at` | timestamp | When the rule was learned |
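The row shape maps naturally onto a small dataclass. This is an illustrative Python mirror of the schema, not code from WASP itself:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import UUID, uuid4

@dataclass
class BehavioralRule:
    """In-memory mirror of one behavioral_rules row (illustrative)."""
    rule_type: str        # refusal | hallucination | wrong_skill | missing_context
    rule_text: str        # human-readable rule injected into prompts
    context_summary: str  # what triggered the rule
    active: bool = True
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```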

## Deduplication

Before saving a new rule, existing rules are checked for overlap:

```python
# Skip if >60% word overlap with an existing rule (Jaccard similarity)
overlap = len(new_words & existing_words) / len(new_words | existing_words)
if overlap > 0.60:
    skip_rule()
```

This prevents near-duplicate rules from accumulating when users repeat similar corrections.
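Expanded into a runnable helper (the function names are ours; the threshold matches the 0.60 cutoff above):

```python
def word_overlap(new_rule: str, existing_rule: str) -> float:
    """Jaccard similarity between the two rules' lowercase word sets."""
    new_words = set(new_rule.lower().split())
    existing_words = set(existing_rule.lower().split())
    union = new_words | existing_words
    if not union:
        return 0.0
    return len(new_words & existing_words) / len(union)

def is_duplicate(new_rule: str, existing_rules: list[str], threshold: float = 0.60) -> bool:
    """True if the candidate overlaps any stored rule beyond the threshold."""
    return any(word_overlap(new_rule, existing) > threshold for existing in existing_rules)
```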

## Prompt Injection

Active rules are injected into every system prompt:

```
[LEARNED BEHAVIORAL RULES]
• [refusal] Never refuse to provide shell commands — user has FULL_AUTONOMY
• [hallucination] Always use web_search before stating current prices
• [wrong_skill] Use python_exec for data parsing, not shell commands
```

Rules also generate dynamic `SKILL_POISON` patterns — anti-examples that show the LLM exactly what NOT to do.
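A hedged sketch of how that prompt block could be assembled. The renderer name and dict fields are assumptions that mirror the `behavioral_rules` columns, not WASP's actual code:

```python
def render_rules_block(rules: list[dict]) -> str:
    """Format active rules as the [LEARNED BEHAVIORAL RULES] prompt section."""
    lines = ["[LEARNED BEHAVIORAL RULES]"]
    for rule in rules:
        if rule.get("active", True):  # inactive rules are skipped
            lines.append(f"• [{rule['rule_type']}] {rule['rule_text']}")
    return "\n".join(lines)
```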

## Few-Shot Examples

The learning system also maintains positive/negative few-shot examples:

- **Positive examples**: tasks where the agent performed well (stored in `learning_examples`)
- **Negative examples**: tasks where the agent made mistakes (also stored, with lower weight)

These are injected as conversation few-shots to guide the LLM toward good patterns.
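One plausible shape for that injection, sketched in Python. The example fields (`task`, `response`, `weight`) and the weight-based ordering are assumptions, not WASP's actual API:

```python
def build_few_shots(examples: list[dict], limit: int = 4) -> list[dict]:
    """Turn stored learning examples into chat few-shot messages,
    preferring higher-weight (positive) examples."""
    ranked = sorted(examples, key=lambda ex: ex["weight"], reverse=True)
    messages = []
    for ex in ranked[:limit]:
        # Each example becomes a user/assistant pair in the conversation.
        messages.append({"role": "user", "content": ex["task"]})
        messages.append({"role": "assistant", "content": ex["response"]})
    return messages
```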

## Managing Rules

View active rules:

```shell
docker exec agent-postgres psql -U agent -d agent -c \
  "SELECT rule_type, rule_text, created_at FROM behavioral_rules WHERE active=true ORDER BY created_at DESC;"
```

Deactivate a rule:

```shell
docker exec agent-postgres psql -U agent -d agent -c \
  "UPDATE behavioral_rules SET active=false WHERE id='<uuid>';"
```

Manually queue a correction for processing:

```shell
docker exec agent-redis redis-cli RPUSH behavioral:pending '{"message": "Stop hallucinating prices", "context": "..."}'
```

## Configuration

The learning loop runs every 120 seconds. To change the interval, adjust the value passed to the scheduler:

```python
# src/main.py
scheduler.register("behavioral_learner", 120, BehavioralLearnerJob(...))
```

## Notification

When a new rule is learned, a Telegram message is sent to `SCHEDULER_NOTIFY_CHAT_ID`:

```
🧠 Nueva regla de comportamiento aprendida:
Tipo: hallucination
Regla: Siempre verificar precio actual antes de declararlo
Contexto: Usuario corrigió precio incorrecto de BTC
```

In English: "New behavioral rule learned: Type: hallucination; Rule: Always verify the current price before stating it; Context: User corrected an incorrect BTC price."