Add 5 pi extensions: pi-subagents, pi-crew, rpiv-pi, pi-interactive-shell, pi-intercom
extensions/pi-crew/skills/observability-reliability/SKILL.md
Normal file
@@ -0,0 +1,41 @@
---
name: observability-reliability
description: Metrics, diagnostics, correlation, retry, deadletter, and recovery evidence workflow. Use when adding reliability features or investigating failures.
---

# observability-reliability

Use this skill for reliability and observability work.

## Source patterns distilled

- `src/observability/*`: metric registry, retention, sinks, exporters, event-to-metric mapping
- `src/runtime/retry-executor.ts`, `deadletter.ts`, `diagnostic-export.ts`, `recovery-recipes.ts`, `overflow-recovery.ts`, `heartbeat-gradient.ts`
- `docs/research-phase9-observability-reliability-plan.md`

## Rules

- Metrics should be per-session/per-registry where possible; avoid hidden global singletons.
- Use low-cardinality labels. Avoid raw task titles, prompts, full file paths, or secrets in metric labels.
- Redact secrets before writing logs, events, diagnostics, agent output, or exported bundles.
- Correlate events with runId/taskId and timestamps; include enough context for postmortem without exposing secrets.
- Retry should record attempts and deadletter on exhaustion; default auto-retry should remain conservative.
- Diagnostics should be safe to share: include state summary, recent events, metrics snapshot when available, and paths to artifacts.
- Heartbeat classification should be threshold-based and should ignore terminal tasks/runs.
- Overflow recovery should track phase progression and terminal states without repeatedly alerting on completed work.
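
The retry rule above can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: `runWithRetry`, `AttemptRecord`, `DeadLetterEntry`, and the default values are assumptions, not the actual API of `retry-executor.ts` or `deadletter.ts`.

```typescript
// Hypothetical types for illustration; not the real pi-crew interfaces.
interface AttemptRecord {
  attempt: number; // 1-based index of the failed attempt
  error: string;   // message from the failure
  at: string;      // ISO timestamp, for correlation with other events
}

interface DeadLetterEntry {
  runId: string;
  taskId: string;
  attempts: AttemptRecord[];
}

interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Assumed conservative default: one retry after the first failure.
const DEFAULT_RETRY: RetryOptions = { maxAttempts: 2, baseDelayMs: 250 };

type RetryResult<T> =
  | { ok: true; value: T; attempts: AttemptRecord[] }
  | { ok: false; attempts: AttemptRecord[] };

async function runWithRetry<T>(
  ids: { runId: string; taskId: string },
  op: (attempt: number) => Promise<T>,
  deadLetters: DeadLetterEntry[],
  options: Partial<RetryOptions> = {},
): Promise<RetryResult<T>> {
  const { maxAttempts, baseDelayMs, sleep } = { ...DEFAULT_RETRY, ...options };
  const wait =
    sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const attempts: AttemptRecord[] = []; // failed attempts only

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const value = await op(attempt);
      return { ok: true, value, attempts };
    } catch (err) {
      attempts.push({
        attempt,
        error: err instanceof Error ? err.message : String(err),
        at: new Date().toISOString(),
      });
      if (attempt < maxAttempts) {
        await wait(baseDelayMs * 2 ** (attempt - 1)); // exponential backoff
      }
    }
  }

  // Retries exhausted: record a deadletter entry keyed by runId/taskId.
  deadLetters.push({ ...ids, attempts });
  return { ok: false, attempts };
}
```

Passing the deadletter store and `sleep` as arguments, rather than reaching for module-level state, keeps the sketch in line with the rule against hidden global singletons and makes it straightforward to unit test.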

## Anti-patterns

- High-cardinality Prometheus labels.
- Emitting duplicate noisy health notifications every render tick.
- Writing unredacted Authorization/API key/token values into events or artifacts.
- Treating secondary metrics as primary pass/fail criteria, unless the regression is catastrophic.
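
To make the redaction anti-pattern concrete, here is a minimal sketch of a redaction pass applied before an event or artifact is written. The helper name `redact`, the key pattern, and the inline credential pattern are all assumptions chosen for illustration; they are not taken from the pi-crew source and are unlikely to cover every secret format.

```typescript
// Hypothetical helper; patterns below are illustrative assumptions.
const REDACTED = "[REDACTED]";

// Keys whose values are replaced wholesale (case-insensitive substring match).
const SENSITIVE_KEY = /(authorization|api[-_]?key|token|secret|password)/i;

// Credentials embedded in free text, e.g. "Bearer abc123".
const INLINE_CREDENTIAL = /\b(Bearer|Basic)\s+[A-Za-z0-9._~+\/=-]+/g;

// Returns a redacted deep copy of plain JSON-like data; the input is not mutated.
// Assumes no cyclic references and no class instances (Date, Map, ...).
function redact(value: unknown): unknown {
  if (typeof value === "string") {
    return value.replace(INLINE_CREDENTIAL, `$1 ${REDACTED}`);
  }
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEY.test(key) ? REDACTED : redact(inner);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}
```

A caller would run `redact(event)` at the boundary where data leaves the process (log sink, event writer, diagnostic bundle), so that correlation fields such as `runId` and `taskId` survive while credential values do not.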

## Verification

```bash
cd pi-crew
npx tsc --noEmit
node --experimental-strip-types --test test/unit/metric-registry.test.ts test/unit/event-to-metric.test.ts test/unit/diagnostic-export.test.ts test/unit/retry-executor.test.ts test/unit/deadletter.test.ts
npm test
```