Hybrid Research Router and Specialist Expert System

v20260517

research

This skill serves as a sophisticated research orchestrator, functioning as a hybrid router. It deterministically classifies complex user queries and routes them either to specialized domain modules (e.g., trend analysis, academic literature review, patent search, or entity dossiers) or executes a robust fallback general search workflow. The system always maintains transparency regarding the routing decision, yielding comprehensive, cited reports in markdown or .docx format.

Research Orchestration Classification Specialist Literature Patent Dossier Analysis

Get Skill

440 downloads

Overview

Research — Hybrid Router + Fallback

The runtime orchestrator for the research domain. Architecture C: deterministic classification → specialist delegation OR own plan-decompose-search-synthesize-cite workflow.

Portability

Requires WebSearch + WebFetch for the fallback workflow; specialist skills (pulse, grants, litreview, syllabus, patent, dossier) must be present for delegation to work. Node.js with docx package required if Q2 = document mode. Works in Claude Code CLI natively. In Claude.ai with web tools + Code Execution, the workflow is supported.

Distinct From `engineering/autoresearch-agent`

These two skills share the word "research" but serve completely different use cases:

research/research/ (this skill) — research-query router + fallback workflow ("Research X")
engineering/autoresearch-agent/ — Karpathy's autonomous file-optimization experiment loop ("Make this code faster")

No overlap. They coexist.

Hybrid Architecture (C)

Every invocation produces one of three outcomes:

Delegation — Classified as specialist-domain. Routes there. User sees the specialist's output.
Fallback execution — Classified as general research. Runs own plan → search → synthesize workflow.
Clarification request — Classification ambiguous. Asks one forcing question to disambiguate, then routes.

The skill never silently runs its fallback when a specialist would have done better. Routing transparency is what makes the hybrid architecture trustworthy.

Specialist Registry

Specialist	Routing signals	Domain
`pulse`	reddit / hn / x / buzz / sentiment / trending / "what's people saying" / "pulse on" / "take the pulse" / "current conversation"	Multi-source recency research
`grants`	NIH / grant / R01 / K-award / RePORTER / NOSI / "grants for" / FDA / "study section" / "principal investigator"	NIH grant-funding intelligence
`litreview`	literature review / PICO / SPIDER / systematic review / "review papers on" / meta-analysis	Academic literature orientation
`syllabus`	syllabus / course outline / curriculum / "reading list" / "for my class" / "for my students"	Course supplementary reading
`patent`	prior art / FTO / freedom to operate / patent / "patent landscape" / invention / novelty search / "ip landscape"	Patent prior-art + landscape
`dossier`	"dossier on" / "due diligence" / "background check" / "prep me for" / "competitor research" / "investor diligence" / "interview prep" / "background on"	Decision-grade entity research

Agent Integrity Rules

This skill obeys the research-pack convention:

Execution discipline (fallback only): Sequential searches. 1 q/sec rate limit. Confirm response received before next call.
Source discipline: Cite only sources returned by this session's tool calls. Training knowledge labeled [Background — not from search] and excluded from counts.
Three-count tracking (fallback only): Queries sent / sources received / sources cited.
Retry policy: On failure → wait 3s → retry once → log. After 3 consecutive failures: stop, alert user.
Plan-tier detection: If delegated to Consensus-using specialist, that specialist handles detection. In fallback mode, surface any rate-limit signals.
Routing discipline: Never delegate silently. Always state the decision + accept override.

Phase 1: Grill-Me Intake (2–4 Questions)

Intake is intentionally minimal — the goal is to route fast, not to interrogate. One question per turn.

Q1 (always) — Research question

What's the research question? State it in 1–2 sentences. Specific is better than broad — "AI for healthcare" gets you a vague survey; "How are health systems integrating LLM-based clinical decision support in 2026?" gets you a useful answer.

Why I'm asking: Specificity dictates classification accuracy and search precision. A vague question routes to fallback; a specific question often matches a specialist cleanly.

Refuse mush. If user says "research AI", push back once: "What about AI specifically — adoption, safety, capability, funding, regulation, comparison? Pick an angle."

Q2 (always) — Output preference

What output do you want? Pick one:

Quick chat briefing (5-min read, markdown in chat)

Standalone document (.docx with citations, shareable)

Why I'm asking: Document mode triggers deeper search budgets and full audit logs. Chat mode optimizes for fast delivery.

Forcing choice.

Q3 (asked only if classification ambiguous — ≤1 signal) — Domain disambiguation

Quick clarification — pick the closest match:

Academic literature (papers, peer-reviewed)

Industry / trends (what's the buzz, news, sentiment)

Specific entity (a company, person, organization)

Technology / patents (prior art, IP landscape)

Grant funding (NIH, foundations)

Course material (syllabus or curriculum)

None of the above — run general research

Why I'm asking: I couldn't classify confidently from your question alone. This routes you to the right specialist or confirms general-research fallback.

Skip if Q1 + Q2 produced clear specialist match (≥2 signals).

Q4 (asked only if Q3 was needed AND user picked "none of the above") — General-research scope

For general research, what's your time horizon — quick scan (5 searches) or thorough (15 searches)?

Why I'm asking: General research has no specialist budget; you pick it. Quick is good for "what's the lay of the land". Thorough is for "I'll make a decision based on this".

Skip if a specialist took over.

Stop condition: After Q4 (or earlier if dependency skips applied), commit and start Phase 2. Most invocations exit intake after Q1 + Q2.

Phase 2: Deterministic Classification

This is deterministic, not LLM-reasoned — for speed, debuggability, and consistency.

SIGNALS = {
  pulse:    ["reddit", "hn", "hacker news", "x.com", "twitter", "buzz",
             "sentiment", "trending", "what are people saying",
             "what's happening", "the conversation around",
             "pulse on", "take the pulse", "current conversation"],
  grants:   ["nih", "grant", "grants for", "r01", "r21", "k-award", "reporter",
             "nosi", "funding", "fda", "study section", "principal investigator"],
  litreview:["literature review", "lit review", "litreview", "pico", "spider",
             "systematic review", "review papers on", "research papers on",
             "papers about", "meta-analysis"],
  syllabus: ["syllabus", "course outline", "curriculum", "reading list",
             "for my class", "for my students", "course material"],
  patent:   ["prior art", "fto", "freedom to operate", "patent",
             "patent landscape", "invention", "novelty search",
             "patent search", "ip landscape"],
  dossier:  ["dossier on", "due diligence", "background check",
             "prep me for", "competitor research", "investor diligence",
             "interview prep", "research my competitor", "background on"]
}

# Signals are case-insensitive literal phrases (multi-word substring match).
# Bracketed placeholders (e.g., "research [company]") are intentionally NOT
# signals — they over-trigger on generic "research X" queries that should
# fall back to general research, not auto-route to dossier. Specific phrases
# pair the verb with the noun ("dossier on", "background on") and route reliably.

For each specialist S:
  score[S] = count of SIGNALS[S] phrases matched in question (case-insensitive substring)

if max(score) >= 2:
  route_to = argmax(score)                                  # high confidence
elif max(score) == 1 and only one specialist has score 1:
  route_to = that specialist                                # weak match, single specialist
else:
  route_to = "fallback"                                     # ambiguous or no match — ask Q3

Implementation: scripts/classifier.py --question "..." returns the routing decision + matched signals + per-specialist scores. Use it; don't re-implement.

Phase 3a: Specialist Delegation (≥2 signals OR single weak match)

When delegating:

Pass the user's question verbatim plus the output preference (Q2)
Let the specialist run its own grill-me intake — do NOT pre-answer specialist questions
Return specialist output as the user-visible result
Tag the result with [Delegated to: research → {specialist}] in the chat output so the user knows what skill produced it
Tag the audit log via scripts/routing_transparency_logger.py --action record_delegation

Phase 3b: Own Fallback Workflow

If routing produced no specialist match, run the 8-step fallback.

Step 1: Decompose

Break the research question into 3–5 sub-questions. Use the framework: what / why / how / who / what's next. Show the decomposition to the user before searching. Use scripts/fallback_decomposer.py --question "..." for a deterministic starting point.

Step 2: Source Selection

For each sub-question, choose source(s) deterministically:

Recency-sensitive → WebSearch + WebFetch + (optionally Reddit/HN if signal)
Technical specs / docs → WebSearch + WebFetch
Academic → Consensus MCP if connected; otherwise WebSearch with scholar.google.com site filter
Data / numbers → WebSearch for sources; then WebFetch for primary documents
Person / company entity-level → consider routing to dossier (offer override)

Step 3: Search

Sequential per sub-question. 1 q/sec etiquette. Per source: 2–4 queries, broad-to-narrow.

Step 4: Read + Extract

For each result that looks high-signal: WebFetch and extract the relevant section. Note the source URL.

Step 5: Synthesize

Per sub-question: 2–4 paragraphs answering it with inline citations. Surface disagreement when sources disagree.

Step 6: Cross-Cutting Patterns

After per-sub-question synthesis: 1–2 paragraphs of patterns across sub-questions — consensus, controversy, gaps.

Step 7: Output

Markdown brief by default (Q2 choice). DOCX if user picked document mode.

Step 8: Audit Log

Three-count summary (sent / received / cited) + per-source list with reliability tier (primary / secondary / tertiary).

Routing Transparency Protocol (Mandatory)

After classification, the skill always:

States the decision in one sentence: "Routing to litreview because you mentioned PICO and meta-analysis (2 signals)."
Offers override: "If you want general research instead OR a different specialist, say so now. Otherwise proceeding in 5 seconds."
Waits 1 turn for confirmation (or auto-proceeds after 5s in interactive contexts).
If user overrides → accept, re-route, log the override via routing_transparency_logger.py --action record_override.

Never delegates silently. This is the trust-building property that makes the hybrid pattern work.

Output Format

Markdown brief (Q2 = quick chat briefing)

# [Research Question] — Briefing
*Generated: [DATE] | Routed: [delegated specialist | fallback]*

## TL;DR
[2-3 sentences]

## Findings
### [Sub-question 1]
[2-4 paragraphs with inline citations]

### [Sub-question 2]
...

## Cross-Cutting Patterns
[1-2 paragraphs]

## Sources
[Numbered list with hyperlinks, reliability tier per source]

## Audit
[Three counts + per-source tier + failures]

DOCX (Q2 = standalone document)

Use the standard research-pack DOCX patterns: Arial 12pt, navy headings, blue table headers, hyperlinked sources, mandatory audit log section. Reference the docx skill for setup.

Audit Log Requirement (Fallback Mode)

Queries sent:        N
Sources received:    M
Sources cited:       K
Failures:            F (3-consecutive-failures triggered: yes/no)
Per-source tier:     [URL — primary | secondary | tertiary]
Routing decision:    fallback (no specialist matched)
Sub-questions:       [list]

All routing decisions + overrides also logged to ~/.research_sessions/<session>.json via routing_transparency_logger.py.

Failure Modes

Failure	Behavior
Classification ambiguous (≤1 signal)	Ask Q3 (domain disambiguation).
Specialist delegation fails	Note in chat. Offer to retry or fall back to general research.
User overrides routing	Accept. Re-route to chosen specialist or fallback. Log the override.
Fallback search returns thin results	Surface explicitly. Suggest the question may be too niche or too new. Do not fabricate.
3 consecutive tool failures in fallback	Stop, alert user, share what was collected.
Question is non-research (e.g., "write me code")	Decline politely. Suggest the user invoke an appropriate skill.
Sub-question can't be answered	Note in synthesis as "limited public signal on this"; don't omit silently.
Output format mismatch	Honor Q2 preference; if format unavailable, fall back to markdown with note.
Specialist skill missing from environment	Skip it in classification scoring; route to fallback or next-best specialist.

Anti-Patterns Rejected

LLM-reasoned classification (must be deterministic keyword + intent matching)
Silent delegation (always surface routing decision)
Refusing to route to a specialist when ≥2 signals match
Routing to a specialist when classification is genuinely ambiguous (≤1 signal across all)
Pre-answering the specialist's grill-me intake (let it run its own)
Running fallback when a specialist would clearly do better
Fabricating sources in fallback when search is thin
Skipping audit log in fallback mode
Treating "dossier on [company]" as fallback when dossier is the right specialist (the verb-noun-paired phrase, not the generic "research X" form, is what routes)
Treating "what are people saying about X" as fallback when pulse is the right specialist
Auto-routing generic "research [topic]" queries to a specialist when the user hasn't paired the verb with a specialist-specific noun (e.g., "research Microsoft" alone is ambiguous — could be dossier or general; ask Q3 instead of guessing)

Tooling

Python (stdlib only)

scripts/classifier.py — Deterministic SIGNALS matching → routing decision + per-specialist score + matched phrases. --question "..." --output json.
scripts/routing_transparency_logger.py — JSON-backed audit log at ~/.research_sessions/<session>.json. Records every routing decision, override, and delegation handoff.
scripts/fallback_decomposer.py — Heuristic question → 3–5 sub-questions using what / why / how / who / what's next framework.

Reference Docs (each cites 7+ authoritative sources)

references/hybrid_router_architecture.md — router-vs-run trade-offs + routing transparency principle
references/deterministic_classification_canon.md — why keyword > LLM-reasoned for routing
references/fallback_workflow_canon.md — plan-decompose-search-synthesize methodology

Dependencies

WebSearch + WebFetch — Required for fallback workflow
Specialist skills — Required for delegation: pulse, grants, litreview, syllabus, patent, dossier. If a specialist is missing, the router skips it in classification and routes to fallback instead.
Node.js docx library — Required if user picks document output (Q2 = standalone)
Consensus MCP — Optional; used in fallback if academic sub-questions surface

Trigger Phrases

"research [topic]"
"look into [topic]"
"what do we know about [topic]"
"investigate [topic]"
"find me information on [topic]"
"do some research on [topic]"
"I need to understand [topic]"
Any research request that doesn't obviously match a more-specific specialist

Version: 1.0.0 Source spec: megaprompts/13-research-megaprompt.md Build pattern: Path B (direct conversion)

Info

Category Data Science

Name research

Version v20260517

Size 26.79KB

Source alirezarezvani/claude-skills

Updated At 2026-05-18