Files
pi-config/extensions/pi-crew/skills/verification-before-done/SKILL.md

2.2 KiB

name, description
name description
verification-before-done Use when about to claim work is complete, fixed, passing, reviewed, committed, or ready to hand off.

verification-before-done

Core principle: evidence before claims. A worker report, green-looking log, or previous run is not fresh verification.

Distilled from detailed reads of agent-skill patterns for verification-before-completion, TDD, review reception, and QA workflows.

Gate Function

Before any completion claim:

  1. Identify the command or inspection that proves the claim.
  2. Run the full command fresh, or explicitly state why a command cannot be run.
  3. Read the output, including exit code and failure counts.
  4. Compare the output to the claim.
  5. Report the claim only with the evidence.

Claim-to-Evidence Table

Claim Requires Not sufficient
Tests pass Fresh test output with zero failures Prior run, “should pass”
Typecheck passes Typecheck command exit 0 Lint or targeted tests only
Bug fixed Original symptom/regression test passes Code changed
Requirements met Checklist against request/plan Generic test success
Agent completed Worker output plus artifact/diff/state inspection Worker says DONE
Safe to commit Relevant checks pass and status reviewed Partial local confidence

Verification Ladder

Choose the smallest reliable gate, then escalate when risk requires it:

  1. Read-only inspection for plans/reviews.
  2. Targeted unit test for touched behavior.
  3. Typecheck for TypeScript/schema/API changes.
  4. Integration test for runtime, subprocess, state, filesystem, UI, config, or session behavior.
  5. Full suite before commit/release or broad changes.
  6. Real Pi smoke only when safe and needed.

Done Report

Include:

  • changed files or read-only status;
  • commands run and pass/fail result;
  • artifacts, run IDs, logs, or state paths inspected;
  • behavior actually verified;
  • skipped checks and why;
  • risks and rollback notes.

Red Flags

Stop before saying done if you are using words like “should”, “probably”, “looks”, “seems”, “I think”, or if you are trusting an agent report without checking evidence.