
Spec-Driven Development and Verification Workflow

v20260622
megabrain
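A comprehensive, adaptive framework for feature development and implementation that follows a strict multi-phase flow: Specify, Design, Tasks, Execute. It protects software quality through atomic tasks, requirement traceability, and mandatory independent verification. The framework adjusts its depth automatically to the complexity of the requirement (from Small to Complex) and is intended as a best-practice tool for managing technical debt and capturing project lessons.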

Tech Lead's Club - Spec-Driven Development

Plan and implement features with precision. Granular tasks. Clear dependencies. Right tools. Zero ceremony.

┌──────────┐   ┌──────────┐   ┌─────────┐   ┌─────────┐
│ SPECIFY  │ → │  DESIGN  │ → │  TASKS  │ → │ EXECUTE │
└──────────┘   └──────────┘   └─────────┘   └─────────┘
   required      optional*      optional*     required

* Agent auto-skips when scope doesn't need it

Critical Rules (read before acting)

Loading this skill's files. Reference files live under references/ in this skill's own directory (where this SKILL.md resides). Resolve them relative to the skill directory — never the workspace root — and load them through the active skill by name; never assume a fixed install path. When a step tells you to read a reference, read it completely (to EOF) before acting — never act on a partial/truncated read.

Execution contract — every task, non-negotiable (holds even if you do not open the reference files):

  1. Tests derive from the spec's acceptance criteria and assert spec-defined outcomes — they never mirror the implementation.
  2. The gate must pass (tests pass) before a task is done — the test runner decides, not self-assessment.
  3. One atomic commit per task. Never batch tasks; never weaken, skip, or delete tests to make them pass.
  4. After the LAST task, a fresh Verifier always runs automatically (author ≠ verifier) — spec-anchored outcome check + discrimination sensor. It is never optional and never prompted. See Sub-Agent Delegation.

Before Execute: read implement.md completely; if a formal tasks.md has more than 3 phases, present the sub-agent offer first (see Sub-Agent Delegation).
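
The gate and one-commit-per-task rules of the execution contract can be sketched as follows. This is a minimal illustration, not the skill's actual tooling: `gate_passes`, `complete_task`, and the `commit` callback are hypothetical names, and the test command is whatever runner the project already uses.

```python
import subprocess
import sys
from typing import Callable, Sequence


def gate_passes(test_command: Sequence[str]) -> bool:
    """Run the test command; exit code 0 is the only thing that counts as a pass."""
    return subprocess.run(list(test_command)).returncode == 0


def complete_task(task_id: str,
                  test_command: Sequence[str],
                  commit: Callable[[str], None]) -> bool:
    """Commit exactly one task, and only after the gate passes.

    `commit` is a hypothetical callback (for example a thin wrapper around
    `git commit`); it is injected so the sketch stays tool-agnostic.
    """
    if not gate_passes(test_command):
        return False      # task is not done: fix the code, never the tests
    commit(task_id)       # one atomic commit per task, no batching
    return True


if __name__ == "__main__":
    committed: list[str] = []
    # Stand-ins for a real test runner: one command exits 0, the other exits 1.
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print(complete_task("T1", passing, committed.append))
    print(complete_task("T2", failing, committed.append))
    print(committed)
```

The point of the design is that the decision comes from the runner's exit code rather than from the author's judgment, which is what rule 2 asks for.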

Auto-Sizing: The Core Principle

The complexity determines the depth, not a fixed pipeline. Before starting any feature, assess its scope and apply only what's needed:

| Scope | What | Specify | Design | Tasks | Execute |
| --- | --- | --- | --- | --- | --- |
| Small | ≤3 files, one sentence | One-liner spec (inline) | Skip | Skip | Implement + verify inline |
| Medium | Clear feature, <10 tasks | Spec (brief) | Skip (design inline) | Skip (tasks implicit) | Implement + verify |
| Large | Multi-component feature | Full spec + requirement IDs | Architecture + components | Full breakdown + dependencies | Implement + verify per task |
| Complex | Ambiguity, new domain | Full spec + discuss gray areas | Research + architecture | Breakdown + parallel plan | Implement + interactive UAT |

Rules:

  • Specify and Execute are always required — you always need to know WHAT and DO it
  • Design is skipped when the change is straightforward (no architectural decisions, no new patterns)
  • Tasks is skipped when there are ≤3 obvious steps (they become implicit in Execute)
  • Discuss is triggered within Specify when the agent detects ambiguous gray areas that need user input, or when the feature has any implicit-requirement dimension present (persistence/state, external calls, auth, payments, concurrency, state transitions)
  • Interactive UAT is triggered within Execute only for user-facing features with complex behavior

Safety valve: Even when Tasks is skipped, Execute ALWAYS starts by listing atomic steps inline (see implement.md). If that listing reveals >5 steps or complex dependencies, STOP and create a formal tasks.md — the Tasks phase was wrongly skipped.
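
One way to read the auto-sizing table, the Discuss rule, and the safety valve is as a small decision function. The sketch below is an assumption-laden simplification: `Scope`, `PhasePlan`, `plan_phases`, and `tasks_wrongly_skipped` are hypothetical names, and it keys Design and Tasks purely on scope, whereas the rules above also weigh how straightforward the change is.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"
    COMPLEX = "complex"


# Dimensions named in the Discuss rule; the identifiers are illustrative.
IMPLICIT_DIMENSIONS = frozenset({
    "persistence", "external_calls", "auth",
    "payments", "concurrency", "state_transitions",
})


@dataclass(frozen=True)
class PhasePlan:
    specify: bool   # always True
    discuss: bool
    design: bool
    tasks: bool
    execute: bool   # always True


def plan_phases(scope: Scope,
                gray_areas: bool = False,
                dimensions: frozenset = frozenset()) -> PhasePlan:
    """Map a scope to phases: Design and Tasks only for Large/Complex."""
    full_pipeline = scope in (Scope.LARGE, Scope.COMPLEX)
    discuss = gray_areas or bool(set(dimensions) & IMPLICIT_DIMENSIONS)
    return PhasePlan(specify=True, discuss=discuss,
                     design=full_pipeline, tasks=full_pipeline, execute=True)


def tasks_wrongly_skipped(inline_steps: list,
                          complex_dependencies: bool = False) -> bool:
    """Safety valve: more than 5 inline steps, or complex dependencies,
    means a formal tasks.md is needed after all."""
    return len(inline_steps) > 5 or complex_dependencies


if __name__ == "__main__":
    print(plan_phases(Scope.MEDIUM, dimensions=frozenset({"auth"})))
    print(tasks_wrongly_skipped(["step"] * 6))
```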

.specs Structure

.specs/
├── STATE.md            # Project memory: Decisions log (AD-NNN) + Handoff snapshot
├── LESSONS.md          # Self-improving lessons playbook (rendered by scripts/lessons.py — do not hand-edit)
├── lessons.json        # Canonical lessons state (machine-owned)
└── features/           # Feature specifications
    └── [feature]/
        ├── spec.md         # Requirements with traceable IDs
        ├── context.md      # User decisions for gray areas (only when discuss is triggered)
        ├── design.md       # Architecture & components (only for Large/Complex)
        ├── tasks.md        # Atomic tasks with verification (only for Large/Complex)
        └── validation.md   # Verifier report: PASS/FAIL, per-AC evidence, sensor result, diff range

Workflow

New feature:

  1. Specify → (Design) → (Tasks) → Execute (depth auto-sized)

Resume work:

Read .specs/STATE.md — Handoff section for in-flight state, Decisions section to re-confirm active constraints — then propose the next step.

Context Loading Strategy

On-demand load (only what the current task needs):

  • .specs/STATE.md — Decisions section (read at Design, re-read on resume); Handoff section (read on resume only)
  • confirmed lessons — load at Specify and Design via python3 scripts/lessons.py list --status confirmed (lessons.md); confirmed only, never candidates
  • spec.md (when working on a specific feature)
  • context.md (when designing or implementing from user decisions)
  • design.md (when implementing from design)
  • tasks.md (when executing tasks)

Never load simultaneously:

  • Multiple feature specs
  • Multiple architecture docs

  • Target: <40k tokens total context
  • Reserve: 160k+ tokens for work, reasoning, outputs
  • Monitoring: Display status when >40k (see context-limits.md)
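
The monitoring rule can be sketched as a simple budget check. The function name `context_status` and the per-file token estimates are assumptions for illustration; context-limits.md defines the actual status format, which is not reproduced here.

```python
from typing import Optional

TARGET_CONTEXT_TOKENS = 40_000   # the <40k target stated above


def context_status(loaded: dict[str, int]) -> Optional[str]:
    """Return a status line once loaded context exceeds the target, else None.

    `loaded` maps a file name to an estimated token count; how the estimate
    is produced is outside this sketch.
    """
    total = sum(loaded.values())
    if total <= TARGET_CONTEXT_TOKENS:
        return None
    largest = max(loaded, key=loaded.get)
    return (f"context: {total} tokens "
            f"(target {TARGET_CONTEXT_TOKENS}); largest: {largest}")


if __name__ == "__main__":
    print(context_status({"spec.md": 12_000, "tasks.md": 9_000}))
    print(context_status({"spec.md": 30_000, "design.md": 15_000}))
```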

Sub-Agent Delegation

Trigger: >3 phases → offer one worker per phase (sequential); ≤3 phases → execute inline.

Offer-then-confirm — never auto-spawn. The user must accept before any sub-agent is dispatched.

One worker per phase: Each phase worker executes all its tasks in order (implement → gate → atomic commit), then reports a compact summary (tasks done, commit hashes, test counts, deviations). Workers never spawn further sub-agents.
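
The trigger and the offer-then-confirm rule reduce to a small decision, sketched below together with a record for the compact summary. `PhaseSummary` and `execution_mode` are hypothetical names, the field layout is a guess at what a summary might hold (the authoritative format is in sub-agents.md), and falling back to inline execution when the user declines is an assumption.

```python
from dataclasses import dataclass, field

PHASE_THRESHOLD = 3   # more than 3 phases triggers the offer


@dataclass
class PhaseSummary:
    """Compact report a phase worker hands back (illustrative fields)."""
    phase: str
    tasks_done: int
    commit_hashes: list[str] = field(default_factory=list)
    tests_passed: int = 0
    tests_failed: int = 0
    deviations: list[str] = field(default_factory=list)


def should_offer_workers(phase_count: int) -> bool:
    """The offer is made only when tasks.md has more than 3 phases."""
    return phase_count > PHASE_THRESHOLD


def execution_mode(phase_count: int, user_accepted: bool) -> str:
    """Never auto-spawn: workers run only after an explicit acceptance."""
    if should_offer_workers(phase_count) and user_accepted:
        return "one-worker-per-phase"
    return "inline"


if __name__ == "__main__":
    print(execution_mode(phase_count=5, user_accepted=False))
    print(execution_mode(phase_count=5, user_accepted=True))
    print(PhaseSummary(phase="Phase 1", tasks_done=4, tests_passed=17))
```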

Verifier (always-on, never prompted): After the final task is committed, the orchestrator dispatches a fresh Verifier sub-agent automatically, regardless of phase count. Validation never requires a user prompt; it is the closing step of Execute. Author ≠ verifier: the Verifier re-derives coverage independently using evidence-or-zero; it does not inherit the author's mental model. The Verifier:

  1. Performs a spec-anchored outcome check: confirms each test's asserted value matches the spec-defined expected outcome and flags spec-precision gaps.
  2. Runs a discrimination sensor: injects behavior-level faults in scratch state, confirms tests kill them, and discards the mutations; surviving mutants become fix tasks.
  3. Writes .specs/features/[feature]/validation.md (PASS/FAIL, per-AC evidence, sensor result, diff range).
  4. Returns a compact verdict + ranked gap list to the orchestrator in chat.
  5. Distills lessons: turns each grounded failure (surviving mutant, spec-precision gap, failed AC, SPEC_DEVIATION) into a reusable project-local lesson via scripts/lessons.py; a clean PASS records nothing (see lessons.md).

Gaps become fix tasks; the fix→re-verify loop is bounded to 3 iterations before escalating.
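
The discrimination sensor described above is a form of mutation testing, and a toy version shows why it matters. Everything in this sketch is hypothetical: the function under test, the injected fault, and the two checks are invented for illustration and do not come from the skill's own validate.md.

```python
from typing import Callable

Impl = Callable[[int, int], int]
Check = Callable[[Impl], bool]


def discounted_price(price_cents: int, percent: int) -> int:
    """Hypothetical code under test. Spec: 10% off 20000 cents is 18000."""
    return price_cents * (100 - percent) // 100


def mutant_sign_flip(price_cents: int, percent: int) -> int:
    """Injected behavior-level fault: the discount is added, not subtracted."""
    return price_cents * (100 + percent) // 100


def check_spec_outcome(impl: Impl) -> bool:
    """Spec-anchored: asserts the value the acceptance criterion states."""
    return impl(20000, 10) == 18000


def check_zero_discount(impl: Impl) -> bool:
    """Weak on its own: a 0% discount cannot tell the mutant apart."""
    return impl(20000, 0) == 20000


def surviving_mutants(checks: list[Check], mutants: dict[str, Impl]) -> list[str]:
    """A mutant is killed when at least one check fails against it."""
    survivors = []
    for name, mutant in mutants.items():
        killed = any(not check(mutant) for check in checks)
        if not killed:
            survivors.append(name)   # each survivor becomes a fix task
    return survivors


if __name__ == "__main__":
    mutants = {"sign_flip": mutant_sign_flip}
    print(surviving_mutants([check_zero_discount], mutants))
    print(surviving_mutants([check_zero_discount, check_spec_outcome], mutants))
```

With only the zero-discount check, the suite passes against both the real code and the mutant, so the fault survives; adding the spec-anchored check kills it. That is the gap the sensor is meant to expose.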

Standalone fallback: Without sub-agents, run validate.md as an independent fresh-eyes pass after the final commit — including the spec-anchored check and discrimination sensor.

Full mechanics (worker payload, compact summary format, failure handling, context sizing, Verifier report format): sub-agents.md.

Commands

Feature-level (auto-sized):

| Trigger Pattern | Reference |
| --- | --- |
| Specify feature, define requirements | specify.md |
| Discuss feature, capture context, how should this work | discuss.md |
| Design feature, architecture | design.md |
| Break into tasks, create tasks | tasks.md |
| Implement task, build, execute | implement.md |
| Validate, verify, test, UAT, walk me through it | validate.md |

Memory:

| Trigger Pattern | Reference |
| --- | --- |
| Record decision, this is a project-level decision | memory.md |
| Pause work, end session, I need to stop | memory.md |
| Resume work, continue, pick up where we left off | memory.md |
| Load lessons, what have we learned, apply past lessons | lessons.md |
| Record lesson, distill lessons (auto-runs after validation) | lessons.md |

Knowledge Verification Chain

When researching, designing, or making any technical decision, follow this chain in strict order. Never skip steps.

Step 1: Codebase → check existing code, conventions, and patterns already in use
Step 2: Project docs → README, docs/, inline comments, `.specs/STATE.md` (Decisions)
Step 3: Context7 MCP → resolve library ID, then query for current API/patterns
Step 4: Web search → official docs, reputable sources, community patterns
Step 5: Flag as uncertain → "I'm not certain about X — here's my reasoning, but verify"

Rules:

  • Never skip to Step 5 if Steps 1-4 are available
  • Step 5 is ALWAYS flagged as uncertain — never presented as fact
  • NEVER assume or fabricate. If you cannot find an answer, say "I don't know" or "I couldn't find documentation for this". Inventing APIs, patterns, or behaviors causes cascading failures across design → tasks → implementation. Uncertainty is always preferable to fabrication.
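
The chain and its rules amount to an ordered fallback in which the last resort never fabricates an answer. A minimal sketch under stated assumptions: `Finding`, `resolve`, and the lookup callables are hypothetical stand-ins for real codebase searches, doc reads, Context7 queries, and web searches.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Lookup = Callable[[str], Optional[str]]


@dataclass(frozen=True)
class Finding:
    answer: Optional[str]
    source: str
    flagged_uncertain: bool


def resolve(question: str, steps: list[tuple[str, Lookup]]) -> Finding:
    """Try each source in strict order; stop at the first that answers."""
    for source, lookup in steps:
        answer = lookup(question)
        if answer is not None:
            return Finding(answer, source, flagged_uncertain=False)
    # Step 5: nothing found. Report that plainly instead of inventing an answer.
    return Finding(None, "none", flagged_uncertain=True)


if __name__ == "__main__":
    # Stub lookups standing in for Steps 1-4.
    steps = [
        ("codebase", lambda q: None),
        ("project docs", lambda q: "documented in README" if "retry" in q else None),
        ("context7", lambda q: None),
        ("web search", lambda q: None),
    ]
    print(resolve("retry policy", steps))
    print(resolve("undocumented flag", steps))
```

Because the loop returns at the first source that answers, later steps are never consulted when an earlier one suffices, and the fallback is always marked uncertain.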

Output Behavior

Model guidance: After completing lightweight tasks (validation, feature-level checks), naturally mention once per session that such tasks work well with faster/cheaper models. For heavy tasks (complex design, large features), briefly note the reasoning requirements before starting.

Be conversational, not robotic. Don't interrupt workflow—add as a natural closing note. Skip if user seems experienced or has already acknowledged the tip.

Code Analysis

Use available tools with graceful degradation. See code-analysis.md.

Info
Category: Programming & Development
Name: megabrain
Version: v20260622
Size: 52.77 KB
Updated: 2026-06-23