
Multi-Model Memo Consensus Evaluation

v20260513
cross-eval
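This tool provides multi-model consensus evaluation for high-stakes strategic memos and business briefs. It submits the content to several advanced AI models and automatically extracts the concerns and supports the models agree on, along with their key disagreements. It is particularly suited to analyzing irreversible decisions such as M&A, major fundraises, or strategic pivots, so that the decision is examined from multiple angles and single-model bias is avoided.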

/cs:cross-eval — Multi-Model Consensus

Command: /cs:cross-eval <memo-or-brief>

Runs the same memo through multiple model providers and reconciles divergences. Use for high-stakes, irreversible decisions where single-model bias is too costly: M&A, major fundraises, layoffs, strategic pivots, regulatory commitments.

Adapted from gstack's /codex cross-review pattern, generalized to business memos instead of code PRs.

When to Run

  • Before signing a term sheet
  • Before announcing a layoff
  • Before committing to a regulated market
  • Before any decision where reversing costs > 6 months of company time
  • When the boardroom vote was split or had a CRITICAL dissent

Models Used (graceful degradation)

The command tries to invoke each available model in order:

  1. Claude (primary, always available) — the boardroom's native voice
  2. Codex / OpenAI (if OPENAI_API_KEY or codex CLI available)
  3. Gemini (if GEMINI_API_KEY or gemini CLI available)

If only Claude is available, the command runs in Claude-only adversarial mode (same model, different system prompts; see Graceful Degradation) and clearly labels the output as single-model.
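The availability check described above could be sketched as follows. This is a minimal illustration, not the command's actual implementation: the function names `available_models` and `run_mode` are hypothetical, and the only details taken from the text are the environment variable names (`OPENAI_API_KEY`, `GEMINI_API_KEY`), the CLI names (`codex`, `gemini`), and the invocation order.

```python
import os
import shutil


def available_models(env=None, which=shutil.which):
    """Return model names in invocation order, Claude first.

    `env` and `which` are injectable so the probe can be tested without
    touching the real environment (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    models = ["claude"]  # the text treats Claude as always available
    if env.get("OPENAI_API_KEY") or which("codex"):
        models.append("codex")
    if env.get("GEMINI_API_KEY") or which("gemini"):
        models.append("gemini")
    return models


def run_mode(models):
    """Label the run so single-model output is never mistaken for consensus."""
    return "adversarial-single-model" if models == ["claude"] else "multi-model"
```

Calling `run_mode(available_models())` on a machine with no extra keys or CLIs would yield the single-model label, which matches the fallback behaviour described above.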

Workflow

  1. Read the memo / brief
  2. Probe environment for available model CLIs / API keys
  3. For each available model:
    • Send the memo with this prompt prefix:

      "You are an independent C-suite reviewer. The following is a board memo from another company's boardroom. Identify the top 3 concerns, the top 3 supports, and your vote (APPROVE / REJECT / DEFER). Do not deferentially agree — assume the memo's reasoning is flawed until proven otherwise."

  4. Collect the independent reviews (one per available model, or three Claude passes in adversarial mode)
  5. Reconcile: where do they agree? Where do they diverge?
  6. Surface the divergences as questions for the founder
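Steps 3 and 4 of the workflow above can be sketched as a simple fan-out. The prompt prefix is copied from step 3; everything else is an assumption made for illustration: `build_prompt`, `collect_reviews`, the separator between prefix and memo, and the idea that each provider is wrapped in a callable taking a prompt and returning review text. No real provider SDK is called here.

```python
from typing import Callable, Dict, Tuple

PROMPT_PREFIX = (
    "You are an independent C-suite reviewer. The following is a board memo "
    "from another company's boardroom. Identify the top 3 concerns, the top 3 "
    "supports, and your vote (APPROVE / REJECT / DEFER). Do not deferentially "
    "agree — assume the memo's reasoning is flawed until proven otherwise."
)


def build_prompt(memo: str) -> str:
    """Prepend the reviewer instructions to the memo text."""
    return f"{PROMPT_PREFIX}\n\n---\n\n{memo.strip()}"


def collect_reviews(
    memo: str, reviewers: Dict[str, Callable[[str], str]]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Send the same prompt to every reviewer; return (reviews, failures)."""
    prompt = build_prompt(memo)
    reviews: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for name, ask in reviewers.items():
        try:
            reviews[name] = ask(prompt)
        except Exception as exc:  # one failing provider should not sink the run
            failures[name] = str(exc)
    return reviews, failures
```

Keeping failures separate from reviews makes it straightforward to note fallbacks in the output header rather than silently dropping a model.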

Output Format

Saved to ~/.claude/cross-eval/YYYY-MM-DD-<slug>.md:

# Cross-Eval: <memo title>
**Date:** YYYY-MM-DD
**Memo reviewed:** <link>
**Models invoked:** Claude / Codex / Gemini (or noted fallbacks)

## Vote Tally
| Model | Vote | Confidence |
|---|---|---|
| Claude | APPROVE | High |
| Codex | DEFER | Med |
| Gemini | APPROVE | Low |

## Consensus Concerns (≥2 models flagged)
1. <concern> — flagged by Claude + Codex
2. <concern> — flagged by all 3

## Divergent Concerns (1 model flagged)
- <Codex only:> <concern> — worth a second look
- <Gemini only:> <concern> — likely noise, but check

## Consensus Supports (≥2 models endorsed)
1. <support>
2. <support>

## Recommendation
- 🟢 GO if 2+ models APPROVE and no CRITICAL concerns from any model
- 🟡 PAUSE if any model is DEFER or any concern is CRITICAL
- 🔴 STOP if 2+ models REJECT

## Open Questions for Founder
1. <question raised by divergence>
2. <question raised by divergence>
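The reconciliation and recommendation rules in the template above could be expressed roughly as below. Two assumptions are worth stating plainly. First, concerns are assumed to have already been normalized to shared labels; matching free-text concerns across models takes judgment that this sketch does not attempt. Second, the three recommendation rules can overlap (two APPROVE votes plus one DEFER satisfies both GO and PAUSE), so this sketch checks the most cautious outcome first (STOP, then PAUSE, then GO); that precedence is an assumption, not something the template specifies. The `Review` type and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Review:
    model: str
    vote: str  # "APPROVE", "REJECT", or "DEFER"
    concerns: list = field(default_factory=list)  # normalized concern labels
    critical: bool = False  # True if the model marked any concern CRITICAL


def split_concerns(reviews):
    """Group concerns into consensus (>=2 models) and divergent (1 model)."""
    flagged_by = defaultdict(list)
    for r in reviews:
        for concern in set(r.concerns):  # ignore repeats within one review
            flagged_by[concern].append(r.model)
    consensus = {c: m for c, m in flagged_by.items() if len(m) >= 2}
    divergent = {c: m for c, m in flagged_by.items() if len(m) == 1}
    return consensus, divergent


def recommend(reviews):
    """Apply the GO / PAUSE / STOP rules, most cautious rule first (assumed)."""
    votes = [r.vote for r in reviews]
    if votes.count("REJECT") >= 2:
        return "STOP"
    if "DEFER" in votes or any(r.critical for r in reviews):
        return "PAUSE"
    if votes.count("APPROVE") >= 2:
        return "GO"
    return "PAUSE"  # assumption: cases the rules do not cover fall back to PAUSE
```

Under this precedence, the sample tally shown in the template (APPROVE, DEFER, APPROVE) resolves to PAUSE rather than GO.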

Why This Matters

Single-model recommendations have systematic biases. Claude tends toward helpfulness and may under-weight risk. Codex (OpenAI) tends to be more cautious on emerging-market and regulatory topics. Gemini tends to be more cautious on technical scale claims. Disagreement is signal, not noise.

This is the safety net before irreversibility — not a replacement for outside counsel or a real board.

Graceful Degradation

If only Claude is available:

**Models available:** Claude only
**Mode:** ADVERSARIAL — running 3 independent Claude passes with different system prompts:
  1. Standard reviewer
  2. Devil's advocate (must find 3 critical concerns)
  3. Steelman (must find 3 strongest reasons to approve)

This is weaker than true multi-model. Treat the result as suggestive, not conclusive.
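A minimal sketch of the three-pass fallback follows. The pass names and the "must find 3" constraints come from the list above; the exact system prompt wording, the `run_adversarial` function, and the `ask(system_prompt, memo)` callable standing in for a Claude call are all hypothetical.

```python
from typing import Callable, List, Tuple

# Illustrative system prompts; the real wording is not given in the text.
PASSES: List[Tuple[str, str]] = [
    ("standard", "Review the memo as an independent C-suite reviewer."),
    ("devils_advocate", "Act as devil's advocate. You must find 3 critical concerns."),
    ("steelman", "Steelman the memo. You must find the 3 strongest reasons to approve."),
]


def run_adversarial(memo: str, ask: Callable[[str, str], str]) -> dict:
    """Run three independent passes of one model and label the result clearly."""
    results = [{"pass": name, "review": ask(system, memo)} for name, system in PASSES]
    return {
        "models_available": "Claude only",
        "mode": "ADVERSARIAL",  # label carried into the output header
        "passes": results,
    }
```

Carrying the mode label in the returned structure keeps the single-model caveat attached to the result wherever it is rendered.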

Routing

  • /cs:decide — if consensus is GO
  • /cs:freeze — if consensus is PAUSE
  • /cs:boardroom (re-run) — if consensus is STOP
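The routing above amounts to a small lookup from recommendation to follow-up command. In this sketch the command names come from the list; the `next_command` helper and its case-insensitive matching are assumptions.

```python
ROUTES = {
    "GO": "/cs:decide",
    "PAUSE": "/cs:freeze",
    "STOP": "/cs:boardroom",  # re-run the boardroom
}


def next_command(recommendation: str) -> str:
    """Map a GO / PAUSE / STOP recommendation to the follow-up command."""
    try:
        return ROUTES[recommendation.upper()]
    except KeyError:
        raise ValueError(f"unknown recommendation: {recommendation!r}") from None
```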


Version: 1.0.0

Info
Category Product & Business
Name cross-eval
Version v20260513
Size 4.08KB
Updated 2026-05-14