Monitoring

WASP has multiple built-in monitoring systems that track health, cognitive load, and system integrity.

Health Monitor (src/health/monitor.py)

The HealthMonitor runs checks every 5 minutes via HealthCheckJob:

Checks Performed

| Check | Method | Threshold |
|---|---|---|
| Redis | PING command | Must respond |
| PostgreSQL | Simple SELECT | Must respond |
| Ollama | GET /api/tags | Must respond (if configured) |
| Disk | shutil.disk_usage() | Warning >80%, Critical >95% |
| RAM | psutil.virtual_memory() | Warning >80%, Critical >95% |
| CPU | psutil.cpu_percent() | Warning >85% |
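
The disk, RAM, and CPU rows each compare a usage percentage against a warning cutoff and, where listed, a critical cutoff. The following is a minimal sketch of that comparison, not the code in src/health/monitor.py. The `classify` and `disk_percent` helpers and the `THRESHOLDS` mapping are hypothetical names, and only the disk reading is shown because `shutil` ships with the standard library.

```python
import shutil
from typing import Optional

# Cutoffs copied from the table above; CPU has no critical level listed.
THRESHOLDS = {
    "disk": (80.0, 95.0),
    "ram": (80.0, 95.0),
    "cpu": (85.0, None),
}


def classify(percent: float, warning: float, critical: Optional[float]) -> str:
    """Return "critical", "warning", or "ok" for a usage percentage."""
    if critical is not None and percent > critical:
        return "critical"
    if percent > warning:
        return "warning"
    return "ok"


def disk_percent(path: str = "/") -> float:
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100.0


if __name__ == "__main__":
    print("disk:", classify(disk_percent(), *THRESHOLDS["disk"]))
```

Both cutoffs are treated as strict (`>`), matching the notation in the table, so a reading of exactly 80% is still "ok".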

Health Score

The health score (0-100) is calculated as:

  • Start at 100
  • -30 for critical issues (Redis down, Postgres down)
  • -15 for warnings (high disk/memory)
  • -5 for minor issues (Ollama down, high CPU)

The result is stored in Redis at health:last_check as JSON.
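
The scoring rules can be sketched as a small function. This is an illustration under two assumptions that the text does not confirm: each detected issue deducts its penalty once, and the result is clamped at 0 so it stays within the 0-100 range. The names `PENALTIES` and `health_score` are hypothetical.

```python
# Deductions copied from the list above.
PENALTIES = {"critical": 30, "warning": 15, "minor": 5}


def health_score(issues: list[str]) -> int:
    """Compute a 0-100 score from a list of issue severities.

    Assumption: every issue deducts its penalty once, and the total
    is clamped at 0 so the score stays in the documented range.
    """
    score = 100
    for severity in issues:
        score -= PENALTIES[severity]
    return max(score, 0)
```

For example, `health_score(["critical", "minor"])` (say, Redis down and high CPU) returns 65.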

Self-Healer (src/health/repair.py)

When issues are detected, SelfHealer takes action:

| Issue | Action |
|---|---|
| Disk >95% | Delete old logs, temp files, clear browser cache |
| Ollama unreachable | Send restart command via broker |
| PostgreSQL reconnect | Exponential backoff retry (via tenacity) |
| Redis reconnect | Exponential backoff retry |

# View self-healer log
docker compose logs agent-core 2>&1 | grep "self_healer"
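
The reconnect rows rely on exponential backoff: wait, retry, and double the wait after each failure. The source names tenacity for the PostgreSQL case; the sketch below deliberately uses only the standard library to show the idea and is not the SelfHealer implementation. The function name, the default delays, and the attempt count are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # 1s, 2s, 4s, ... capped at max_delay
            delay = min(base_delay * 2 ** attempt, max_delay)
            sleep(delay)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists only so the delays can be observed in tests without actually waiting.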

Cognitive Pressure Index (CPI)

The CPI is a composite metric (0-100) measuring overall agent cognitive load.

CPI Components

| Component | Weight | Source |
|---|---|---|
| Active goals count | 20% | GoalOrchestrator |
| Error rate (24h) | 25% | Audit log |
| Average skill latency | 20% | Audit log |
| Memory growth rate | 15% | MemoryManager |
| CPU usage | 20% | psutil |
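
Because the weights sum to 100%, the CPI can be read as a weighted average of its components. The sketch below assumes each component has already been normalized to a 0-100 score; how WASP normalizes raw values such as goal counts or latency is not described here. The dictionary keys and the function name are hypothetical.

```python
# Weights copied from the table above; they sum to 1.0.
WEIGHTS = {
    "active_goals": 0.20,
    "error_rate": 0.25,
    "skill_latency": 0.20,
    "memory_growth": 0.15,
    "cpu_usage": 0.20,
}


def cognitive_pressure_index(components: dict[str, float]) -> float:
    """Weighted sum of component scores, each assumed normalized to 0-100."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    # Clamp to guard against rounding or out-of-range inputs.
    return max(0.0, min(100.0, total))
```

With this weighting, the error rate has the largest single influence: a component score of 100 for errors alone contributes 25 points to the CPI.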

CPI Thresholds

  • CPI < 50: Normal operation
  • CPI 50-80: Elevated load, monitoring active
  • CPI > 80: High load — agent:cpi_high flag set in Redis (TTL 10min)

When agent:cpi_high is set:

  • AutonomousGoalGeneratorJob skips its run
  • DreamJob skips its run
  • BackgroundPerceptionJob skips its run

This prevents additional load when the system is already stressed.
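
The flag-and-skip behavior can be sketched as follows. `FlagStore` is a hypothetical in-memory stand-in for a Redis key with an expiry, used so the example runs without a Redis server; `update_flag` and `should_run` are likewise illustrative names rather than WASP functions.

```python
import time
from typing import Optional

CPI_HIGH_KEY = "agent:cpi_high"
CPI_HIGH_TTL_SECONDS = 600  # 10 minutes, per the threshold list above


class FlagStore:
    """In-memory stand-in for a Redis key with a TTL (hypothetical helper)."""

    def __init__(self) -> None:
        self._expiry: dict[str, float] = {}

    def set(self, key: str, ttl: float, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._expiry[key] = now + ttl

    def is_set(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        return key in self._expiry and now < self._expiry[key]


def update_flag(store: FlagStore, cpi: float, now: Optional[float] = None) -> None:
    """Set the high-load flag when CPI exceeds 80; it expires on its own."""
    if cpi > 80:
        store.set(CPI_HIGH_KEY, CPI_HIGH_TTL_SECONDS, now)


def should_run(store: FlagStore, now: Optional[float] = None) -> bool:
    """Background jobs call this first and skip their run when it is False."""
    return not store.is_set(CPI_HIGH_KEY, now)
```

Using a TTL rather than an explicit "clear" step means the jobs resume automatically if the CPI is not measured above 80 again within ten minutes.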

Monitor the CPI

# Current CPI value
docker exec agent-redis redis-cli GET agent:cpi_current

# Check if high CPI flag is set
docker exec agent-redis redis-cli GET agent:cpi_high

In the dashboard, a banner appears in the Cognitive tab when CPI > 80.

Self-Integrity Monitor (src/scheduler/integrity.py)

SelfIntegrityMonitorJob runs every 6 hours and cross-checks the self-model against actual performance data:

Checks Performed

  1. Strength accuracy — Are claimed strengths backed by actual skill success rates?
  2. Failure acknowledgment — Does the self-model reflect actual failure patterns?
  3. Epistemic drift — Has domain confidence diverged from actual performance?
  4. Audit error spikes — Are there unusual error patterns in the audit log?

Integrity Report

Results are stored in Redis at agent:integrity_report:

{
  "timestamp": "2026-03-09T12:00:00Z",
  "score": 87,
  "issues": [
    "browser skill claimed as strength but 42% success rate (threshold: 60%)",
    "epistemic.programming confidence 0.90 vs actual 0.85 (drift: 0.05)"
  ],
  "recommendations": [
    "Update self-model: browser skill should be in known_failures",
    "Reduce programming confidence to 0.85"
  ]
}

View the report:

docker exec agent-redis redis-cli GET agent:integrity_report | python3 -m json.tool
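
The two issues in the sample report correspond to simple comparisons: a claimed strength whose success rate is below a threshold, and a gap between stated confidence and measured performance. The sketch below reproduces them. The function names are hypothetical, and the 60% default is taken from the sample report rather than from any documented setting.

```python
from typing import Optional


def strength_issue(
    skill: str, success_rate: float, threshold: float = 0.60
) -> Optional[str]:
    """Return a message if a claimed strength falls below the threshold."""
    if success_rate < threshold:
        return (
            f"{skill} skill claimed as strength but "
            f"{success_rate:.0%} success rate (threshold: {threshold:.0%})"
        )
    return None


def confidence_drift(confidence: float, actual: float) -> float:
    """Absolute gap between stated confidence and measured performance."""
    # Rounded to two decimals to avoid floating-point noise in reports.
    return round(abs(confidence - actual), 2)
```

Calling `confidence_drift(0.90, 0.85)` returns 0.05, matching the drift shown for epistemic.programming in the sample report.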

Introspector (src/health/introspection.py)

The Introspector generates performance reports on demand:

/introspect

Returns:

  • Health score and component breakdown
  • Skill usage statistics (last 24h)
  • Top skills by call count
  • Average response time
  • Error rate
  • Memory usage

Metrics Collector (src/observability/metrics.py)

Tracks operational metrics in Redis:

  • Request counts per chat_id
  • Token usage per model
  • Skill call counts per skill
  • Error counts per error type

# View metrics
docker exec agent-redis redis-cli HGETALL metrics:requests
docker exec agent-redis redis-cli HGETALL metrics:tokens

Economics Tracker (src/observability/economics.py)

Tracks API cost estimates:

  • Cost per model per request (based on token counts and known pricing)
  • Daily and weekly cost summaries

# View economics data
docker exec agent-redis redis-cli HGETALL economics:daily
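
A cost estimate of this kind multiplies token counts by a per-token price and sums the results per day. The sketch below illustrates the arithmetic only. The model name and prices are hypothetical placeholders, and the function names and the request tuple layout are assumptions, not the structure used in src/observability/economics.py.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens; substitute real provider pricing.
PRICE_PER_1K = {
    "example-model": {"input": 0.50, "output": 1.50},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request from its token counts."""
    price = PRICE_PER_1K[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1000


def daily_totals(requests: list[tuple[str, str, int, int]]) -> dict[tuple[str, str], float]:
    """Sum request costs per (day, model).

    Each request is a (day, model, input_tokens, output_tokens) tuple.
    """
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for day, model, input_tokens, output_tokens in requests:
        totals[(day, model)] += request_cost(model, input_tokens, output_tokens)
    return dict(totals)
```

These figures are estimates: they depend on the price table being kept in step with the provider's actual pricing.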

Dashboard Monitoring

The /health page in the dashboard shows:

  • Real-time health score with component breakdown
  • Disk/RAM/CPU gauges
  • Last 10 health check results
  • Self-healer action log

The /cognitive page shows:

  • CPI value and component breakdown
  • Epistemic state per domain
  • Self-model strengths and failures
  • Integrity report findings

Setting Up Alerts

For external alerting, monitor the health check output:

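#!/usr/bin/env bash
# Script to alert if the health score drops below 70
SCORE=$(docker exec agent-redis redis-cli GET health:score)
# Only compare when the key holds an integer (it may be missing or empty)
if [[ "$SCORE" =~ ^[0-9]+$ ]] && [ "$SCORE" -lt 70 ]; then
  # Send alert to your monitoring system
  curl -X POST https://your-alerting-system/alert \
    -H "Content-Type: application/json" \
    -d "{\"message\": \"WASP health score: $SCORE\"}"
fi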