Spec-Driven Development and Verification Workflow

v20260622

megabrain

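This is a comprehensive, adaptive framework for feature development and implementation. It follows a strict multi-stage flow: Specify, Design, Tasks, Execute. It protects software quality through atomic tasks, requirement traceability, and mandatory independent verification. The framework adjusts its depth automatically to the complexity of the request (from Small to Complex), and it is intended for managing technical debt and capturing project lessons.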

Tags: development, features, planning, verification, agile, software engineering, requirements

Overview

Tech Lead's Club - Spec-Driven Development

Plan and implement features with precision. Granular tasks. Clear dependencies. Right tools. Zero ceremony.

┌──────────┐   ┌──────────┐   ┌─────────┐   ┌─────────┐
│ SPECIFY  │ → │  DESIGN  │ → │  TASKS  │ → │ EXECUTE │
└──────────┘   └──────────┘   └─────────┘   └─────────┘
   required      optional*      optional*     required

* Agent auto-skips when scope doesn't need it

Critical Rules (read before acting)

Loading this skill's files. Reference files live under references/ in this skill's own directory (where this SKILL.md resides). Resolve them relative to the skill directory — never the workspace root — and load them through the active skill by name; never assume a fixed install path. When a step tells you to read a reference, read it completely (to EOF) before acting — never act on a partial/truncated read.

Execution contract — every task, non-negotiable (holds even if you do not open the reference files):

Tests derive from the spec's acceptance criteria and assert spec-defined outcomes — they never mirror the implementation.
The gate must pass (tests pass) before a task is done — the test runner decides, not self-assessment.
One atomic commit per task. Never batch tasks; never weaken, skip, or delete tests to make them pass.
After the LAST task, a fresh Verifier always runs automatically (author ≠ verifier) — spec-anchored outcome check + discrimination sensor. It is never optional and never prompted. See Sub-Agent Delegation.
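
To make the first rule concrete, here is a minimal sketch built on a hypothetical acceptance criterion, AC-1: "Orders with a subtotal of 100.00 or more receive a 10% discount." The criterion, the function name `order_total`, and the constants are invented for illustration and are not part of this skill. The first test pins literal values read off the criterion; the second recomputes its expectation with the implementation's own formula, so it would keep passing even if that formula were wrong.

```python
from decimal import Decimal

DISCOUNT_THRESHOLD = Decimal("100.00")  # hypothetical value taken from AC-1
DISCOUNT_RATE = Decimal("0.10")         # hypothetical value taken from AC-1


def order_total(subtotal: Decimal) -> Decimal:
    """Hypothetical implementation under test."""
    if subtotal >= DISCOUNT_THRESHOLD:
        return subtotal * (1 - DISCOUNT_RATE)
    return subtotal


def test_ac1_spec_anchored() -> None:
    # Expected values are literals read off the acceptance criterion.
    assert order_total(Decimal("200.00")) == Decimal("180.00")
    assert order_total(Decimal("100.00")) == Decimal("90.00")  # boundary case
    assert order_total(Decimal("99.99")) == Decimal("99.99")   # below threshold


def test_ac1_mirrors_implementation() -> None:
    # Anti-pattern: the expectation reuses the code's own formula and constant,
    # so a wrong rate or threshold would go unnoticed.
    subtotal = Decimal("200.00")
    assert order_total(subtotal) == subtotal * (1 - DISCOUNT_RATE)
```

Both tests pass against the code as written, which is the point: only the first would fail if `DISCOUNT_RATE` drifted away from what the spec says.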

Before Execute: read implement.md completely; if a formal tasks.md has more than 3 phases, present the sub-agent offer first (see Sub-Agent Delegation).

Auto-Sizing: The Core Principle

The complexity determines the depth, not a fixed pipeline. Before starting any feature, assess its scope and apply only what's needed:

Scope	What	Specify	Design	Tasks	Execute
Small	≤3 files, one sentence	One-liner spec (inline)	Skip	Skip	Implement + verify inline
Medium	Clear feature, <10 tasks	Spec (brief)	Skip — design inline	Skip — tasks implicit	Implement + verify
Large	Multi-component feature	Full spec + requirement IDs	Architecture + components	Full breakdown + dependencies	Implement + verify per task
Complex	Ambiguity, new domain	Full spec + discuss gray areas	Research + architecture	Breakdown + parallel plan	Implement + interactive UAT

Rules:

Specify and Execute are always required: you always need to know WHAT to build, and then DO it
Design is skipped when the change is straightforward (no architectural decisions, no new patterns)
Tasks is skipped when there are ≤3 obvious steps (they become implicit in Execute)
Discuss is triggered within Specify when the agent detects ambiguous gray areas that need user input, or when the feature has any implicit-requirement dimension present (persistence/state, external calls, auth, payments, concurrency, state transitions)
Interactive UAT is triggered within Execute only for user-facing features with complex behavior

Safety valve: Even when Tasks is skipped, Execute ALWAYS starts by listing atomic steps inline (see implement.md). If that listing reveals >5 steps or complex dependencies, STOP and create a formal tasks.md — the Tasks phase was wrongly skipped.
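
The table and the safety valve can be read as a small decision function. The sketch below is one possible encoding, not the skill's own implementation: `Scope`, `Plan`, `plan_for`, and `needs_formal_tasks` are hypothetical names, and in practice assigning a scope is a judgment call made by the agent rather than a lookup.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"
    COMPLEX = "complex"


@dataclass(frozen=True)
class Plan:
    """Which phases run as formal, separate steps (hypothetical structure)."""
    specify: bool
    design: bool
    tasks: bool
    execute: bool


def plan_for(scope: Scope) -> Plan:
    # Specify and Execute are always required.
    # Design and Tasks are formal phases only for Large and Complex scopes.
    formal = scope in (Scope.LARGE, Scope.COMPLEX)
    return Plan(specify=True, design=formal, tasks=formal, execute=True)


def needs_formal_tasks(inline_step_count: int, has_complex_dependencies: bool) -> bool:
    """Safety valve: more than 5 inline steps or complex dependencies
    means the Tasks phase was wrongly skipped and tasks.md is needed."""
    return inline_step_count > 5 or has_complex_dependencies
```

For example, `plan_for(Scope.MEDIUM)` skips Design and Tasks, but if the inline step listing at the start of Execute yields six steps, `needs_formal_tasks(6, False)` returns `True` and a formal tasks.md should be written.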

.specs Structure

.specs/
├── STATE.md            # Project memory: Decisions log (AD-NNN) + Handoff snapshot
├── LESSONS.md          # Self-improving lessons playbook (rendered by scripts/lessons.py — do not hand-edit)
├── lessons.json        # Canonical lessons state (machine-owned)
└── features/           # Feature specifications
    └── [feature]/
        ├── spec.md         # Requirements with traceable IDs
        ├── context.md      # User decisions for gray areas (only when discuss is triggered)
        ├── design.md       # Architecture & components (only for Large/Complex)
        ├── tasks.md        # Atomic tasks with verification (only for Large/Complex)
        └── validation.md   # Verifier report: PASS/FAIL, per-AC evidence, sensor result, diff range
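
As a rough aid for checking this layout, the following sketch lists which per-feature files would be expected before Execute starts. The helper names are hypothetical, and two points are assumptions rather than statements from the skill: that a Small feature keeps its one-liner spec inline (no files), and that a Medium feature writes its brief spec to spec.md. validation.md is left out because the Verifier writes it after Execute.

```python
from pathlib import Path
from typing import List


def expected_files(root: Path, feature: str, scope: str, discussed: bool = False) -> List[Path]:
    """Files expected under .specs/features/<feature>/ before Execute starts."""
    base = root / ".specs" / "features" / feature
    if scope == "small":
        return []  # assumption: the one-liner spec stays inline, no files
    files = [base / "spec.md"]
    if discussed:
        files.append(base / "context.md")  # only when discuss is triggered
    if scope in ("large", "complex"):
        files += [base / "design.md", base / "tasks.md"]  # only for Large/Complex
    return files


def missing_files(root: Path, feature: str, scope: str, discussed: bool = False) -> List[Path]:
    """Expected files that are not on disk yet."""
    return [p for p in expected_files(root, feature, scope, discussed) if not p.exists()]
```

Calling `missing_files(Path("."), "checkout", "large")` from a repository root would list whichever of spec.md, design.md, and tasks.md have not been created for a hypothetical `checkout` feature.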

Workflow

New feature:

Specify → (Design) → (Tasks) → Execute (depth auto-sized)

Resume work:

Read .specs/STATE.md — Handoff section for in-flight state, Decisions section to re-confirm active constraints — then propose the next step.

Context Loading Strategy

On-demand load (only what the current task needs):

.specs/STATE.md — Decisions section (read at Design, re-read on resume); Handoff section (read on resume only)
confirmed lessons — load at Specify and Design via python3 scripts/lessons.py list --status confirmed (lessons.md); confirmed only, never candidates
spec.md (when working on a specific feature)
context.md (when designing or implementing from user decisions)
design.md (when implementing from design)
tasks.md (when executing tasks)

Never load simultaneously:

Multiple feature specs
Multiple architecture docs

Target: <40k tokens total context
Reserve: 160k+ tokens for work, reasoning, outputs
Monitoring: Display status when >40k (see context-limits.md)
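
The monitoring rule amounts to a simple threshold check. The sketch below assumes a token count is already available from the host environment; the function name and the wording of the status line are hypothetical, since the actual status format is defined in context-limits.md.

```python
from typing import Optional

TARGET_TOKENS = 40_000  # target ceiling for loaded context, from the text above


def context_status(loaded_tokens: int) -> Optional[str]:
    """Return a status line once loaded context exceeds the target, else None."""
    if loaded_tokens <= TARGET_TOKENS:
        return None
    over = loaded_tokens - TARGET_TOKENS
    return (
        f"context: {loaded_tokens:,} tokens loaded "
        f"({over:,} over the {TARGET_TOKENS:,} target)"
    )
```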

Sub-Agent Delegation

Trigger: >3 phases → offer one worker per phase (sequential); ≤3 phases → execute inline.

Offer-then-confirm — never auto-spawn. The user must accept before any sub-agent is dispatched.

One worker per phase: Each phase worker executes all its tasks in order (implement → gate → atomic commit), then reports a compact summary (tasks done, commit hashes, test counts, deviations). Workers never spawn further sub-agents.
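
The trigger, the offer-then-confirm rule, and the worker report can be sketched as follows. All names here are hypothetical, and the `PhaseSummary` fields only mirror the items listed above (tasks done, commit hashes, test counts, deviations); the real summary format is described in sub-agents.md.

```python
from dataclasses import dataclass, field
from typing import List

PHASE_THRESHOLD = 3  # from the trigger rule: more than 3 phases means offer workers


def should_offer_workers(phase_count: int) -> bool:
    return phase_count > PHASE_THRESHOLD


def may_dispatch(phase_count: int, user_accepted: bool) -> bool:
    """Offer-then-confirm: never dispatch without explicit user acceptance."""
    return should_offer_workers(phase_count) and user_accepted


@dataclass
class PhaseSummary:
    """Hypothetical shape of the compact summary a phase worker reports."""
    phase: str
    tasks_done: List[str] = field(default_factory=list)
    commit_hashes: List[str] = field(default_factory=list)
    tests_passed: int = 0
    tests_failed: int = 0
    deviations: List[str] = field(default_factory=list)

    def one_commit_per_task(self) -> bool:
        # The execution contract requires one atomic commit per task.
        return len(self.commit_hashes) == len(self.tasks_done)
```

An orchestrator could use `one_commit_per_task()` as a cheap sanity check on a worker's report before moving to the next phase.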

Verifier (always-on, never prompted): After the final task is committed, the orchestrator dispatches a fresh Verifier sub-agent automatically, regardless of phase count. Validation never requires a user prompt; it is the closing step of Execute. Author ≠ verifier: the Verifier re-derives coverage independently using evidence-or-zero; it does not inherit the author's mental model.

The Verifier:

(1) performs a spec-anchored outcome check: confirms each test's asserted value matches the spec-defined expected outcome and flags spec-precision gaps;
(2) runs a discrimination sensor: injects behavior-level faults in scratch state, confirms tests kill them, and discards the mutations; surviving mutants become fix tasks;
(3) writes .specs/features/[feature]/validation.md (PASS/FAIL, per-AC evidence, sensor result, diff range);
(4) returns a compact verdict + ranked gap list to the orchestrator in chat;
(5) distills lessons: turns each grounded failure (surviving mutant, spec-precision gap, failed AC, SPEC_DEVIATION) into a reusable project-local lesson via scripts/lessons.py; a clean PASS records nothing (see lessons.md).

Gaps become fix tasks; the fix→re-verify loop is bounded to 3 iterations before escalating.
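
The discrimination sensor and the bounded fix→re-verify loop can be illustrated with a toy example. Everything below is a sketch under stated assumptions: `shipping_fee`, the two-assertion `suite`, and the mutants are invented, the faults live only in memory as a stand-in for scratch state, and reading "bounded to 3 iterations" as three fix rounds is an interpretation rather than something validate.md is known to specify.

```python
from typing import Callable, Dict, List

Impl = Callable[[int], int]


def shipping_fee(weight_kg: int) -> int:
    """Hypothetical code under test: fee 5 up to 10 kg, otherwise 9."""
    return 5 if weight_kg <= 10 else 9


def suite(impl: Impl) -> bool:
    """Hypothetical test suite; True means every assertion passed."""
    return impl(1) == 5 and impl(11) == 9


# Behavior-level faults, held only in memory and discarded afterwards.
MUTANTS: Dict[str, Impl] = {
    "boundary moved to < 10": lambda w: 5 if w < 10 else 9,
    "heavy fee changed to 8": lambda w: 5 if w <= 10 else 8,
}


def surviving_mutants(run_suite: Callable[[Impl], bool], mutants: Dict[str, Impl]) -> List[str]:
    """A mutant survives when the suite still passes against the faulty behavior."""
    return [name for name, impl in mutants.items() if run_suite(impl)]


def verify_loop(
    verify: Callable[[], List[str]],
    fix: Callable[[List[str]], None],
    max_rounds: int = 3,
) -> str:
    """Verify; while gaps remain, fix and re-verify, at most max_rounds times."""
    gaps = verify()
    rounds = 0
    while gaps and rounds < max_rounds:
        fix(gaps)
        rounds += 1
        gaps = verify()
    return "PASS" if not gaps else "ESCALATE"
```

Here `surviving_mutants(suite, MUTANTS)` returns `["boundary moved to < 10"]`: the suite never exercises exactly 10 kg, so the moved boundary goes undetected. That survivor would become a fix task (add an assertion at the boundary), after which the Verifier runs again.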

Standalone fallback: Without sub-agents, run validate.md as an independent fresh-eyes pass after the final commit — including the spec-anchored check and discrimination sensor.

Full mechanics (worker payload, compact summary format, failure handling, context sizing, Verifier report format): sub-agents.md.

Commands

Feature-level (auto-sized):

Trigger Pattern	Reference
Specify feature, define requirements	specify.md
Discuss feature, capture context, how should this work	discuss.md
Design feature, architecture	design.md
Break into tasks, create tasks	tasks.md
Implement task, build, execute	implement.md
Validate, verify, test, UAT, walk me through it	validate.md

Memory:

Trigger Pattern	Reference
Record decision, this is a project-level decision	memory.md
Pause work, end session, I need to stop	memory.md
Resume work, continue, pick up where we left off	memory.md
Load lessons, what have we learned, apply past lessons	lessons.md
Record lesson, distill lessons (auto-runs after validation)	lessons.md

Knowledge Verification Chain

When researching, designing, or making any technical decision, follow this chain in strict order. Never skip steps.

Step 1: Codebase → check existing code, conventions, and patterns already in use
Step 2: Project docs → README, docs/, inline comments, `.specs/STATE.md` (Decisions)
Step 3: Context7 MCP → resolve library ID, then query for current API/patterns
Step 4: Web search → official docs, reputable sources, community patterns
Step 5: Flag as uncertain → "I'm not certain about X — here's my reasoning, but verify"

Rules:

Never skip to Step 5 if Steps 1-4 are available
Step 5 is ALWAYS flagged as uncertain — never presented as fact
NEVER assume or fabricate. If you cannot find an answer, say "I don't know" or "I couldn't find documentation for this". Inventing APIs, patterns, or behaviors causes cascading failures across design → tasks → implementation. Uncertainty is always preferable to fabrication.
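
The chain behaves like an ordered fallback: each source is consulted in turn, the first one with an answer wins, and reaching the end produces a result that is explicitly flagged. The sketch below models that control flow only. `Finding`, `resolve`, and the lookup callables are hypothetical; real lookups (codebase search, Context7 MCP, web search) would be tool calls supplied by the host environment.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Lookup = Callable[[str], Optional[str]]


@dataclass(frozen=True)
class Finding:
    answer: Optional[str]
    source: Optional[str]
    uncertain: bool


def resolve(
    question: str,
    chain: Sequence[Tuple[str, Lookup]],
    reasoning: Optional[str] = None,
) -> Finding:
    """Walk the sources in order; the first one that returns an answer wins."""
    for name, lookup in chain:
        answer = lookup(question)
        if answer is not None:
            return Finding(answer=answer, source=name, uncertain=False)
    # Step 5: nothing found. Any reasoning offered is flagged as uncertain,
    # never presented as fact; with no reasoning, the answer stays None.
    return Finding(answer=reasoning, source=None, uncertain=True)
```

A caller would pass the four sources in the order listed above, for example `[("codebase", search_code), ("project docs", search_docs), ...]` with hypothetical lookup functions, and treat any `Finding` with `uncertain=True` as something to state openly rather than act on.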

Output Behavior

Model guidance: After completing lightweight tasks (validation, feature-level checks), naturally mention once per session that such tasks work well with faster/cheaper models. For heavy tasks (complex design, large features), briefly note the reasoning requirements before starting.

Be conversational, not robotic. Don't interrupt workflow—add as a natural closing note. Skip if user seems experienced or has already acknowledged the tip.

Code Analysis

Use available tools with graceful degradation. See code-analysis.md.

Info

Category Programming & Development

Name megabrain

Version v20260622

Size 52.77KB

Source tech-leads-club/agent-skills

Updated 2026-06-23