A production-tested pattern for coordinating multiple AI agents through a single orchestrator. Instead of letting agents work independently (and conflict), one orchestrator decomposes tasks, routes them to specialists, prevents duplicate work, and verifies results before marking anything done. Battle-tested across 10,000+ tasks over 6 months.
The orchestrator must know what it IS and what it IS NOT. This prevents it from doing work instead of delegating:
You are the Task Orchestrator. You NEVER do specialized work yourself.
You decompose tasks, delegate to the right agent, prevent conflicts,
and verify quality before marking anything done.
WHAT YOU ARE NOT:
- NOT a code writer — delegate to code agents
- NOT a researcher — delegate to research agents
- NOT a tester — delegate to test agents
This "NOT-block" pattern reduces task drift by ~35% in production.
Before assigning work, check if anyone is already doing this task:
import sqlite3
from difflib import SequenceMatcher
def check_duplicate(description, threshold=0.55):
conn = sqlite3.connect("task_registry.db")
c = conn.cursor()
c.execute("SELECT id, description, agent, status FROM tasks WHERE status IN ('pending', 'in_progress')")
for row in c.fetchall():
ratio = SequenceMatcher(None, description.lower(), row[1].lower()).ratio()
if ratio >= threshold:
return {"id": row[0], "description": row[1], "agent": row[2]}
return None
Use keyword scoring to match tasks to the best agent:
AGENTS = {
"code-architect": ["code", "implement", "function", "bug", "fix", "refactor", "api"],
"security-reviewer": ["security", "vulnerability", "audit", "cve", "injection"],
"researcher": ["research", "compare", "analyze", "benchmark", "evaluate"],
"doc-writer": ["document", "readme", "explain", "tutorial", "guide"],
"test-engineer": ["test", "coverage", "unittest", "pytest", "spec"],
}
def route_task(description):
scores = {}
for agent, keywords in AGENTS.items():
scores[agent] = sum(1 for kw in keywords if kw in description.lower())
return max(scores, key=scores.get) if max(scores.values()) > 0 else "code-architect"
Agent output is a CLAIM. Test output is EVIDENCE.
After agent reports completion:
1. Were files actually modified? (git diff --stat)
2. Do tests pass? (npm test / pytest)
3. Were secrets introduced? (grep for API keys, tokens)
4. Did the build succeed? (npm run build)
5. Were only intended files touched? (scope check)
Mark done ONLY after ALL checks pass.
Every 30 minutes, ask:
1. "What have I DELEGATED in the last 30 minutes?"
2. If nothing → open the task backlog and assign the next task
3. Check for idle agents (no message in >30min on assigned task)
4. Relance idle agents or reassign their tasks
[ORCHESTRATOR -> code-architect] TASK: Add rate limiting to /api/users
SCOPE: src/middleware/rate-limit.ts only
VERIFICATION: npm test -- --grep "rate-limit"
DEADLINE: 30 minutes
User asks: "Fix the login bug"
Registry check: Task #47 "Fix authentication bug" is IN_PROGRESS by security-reviewer
Decision: SKIP — similar task already assigned (78% match)
Action: Notify user of existing task, wait for completion
Problem: Orchestrator starts doing work instead of delegating Solution: Add explicit NOT-blocks and role boundaries
Problem: Two agents modify the same file simultaneously Solution: Task registry with file-level locking and queue system
Problem: Agent claims "done" without actual changes Solution: Quality gate checks git diff before accepting completion
Problem: Tasks pile up without progress Solution: 30-minute heartbeat catches stale assignments and reassigns
@code-review - For reviewing code changes after delegation@test-driven-development - For ensuring quality in agent output@project-management - For tracking multi-agent project progress