Intercept the response flow to offer the user a choice about response depth before Claude answers.
Do not trigger when: user already set a level this session (maintain it silently), or the answer is trivially one line.
Use the repository's canonical context-budget heuristics to estimate the prompt's token count mentally.
Use the same calibration guidance as context-budget:
words × 1.3
chars / 4
For mixed content, use the dominant content type and keep the estimate heuristic.
Classify the prompt, then apply the multiplier range to get the full response window:
| Complexity | Multiplier range | Example prompts |
|---|---|---|
| Simple | 3× – 8× | "What is X?", yes/no, single fact |
| Medium | 8× – 20× | "How does X work?" |
| Medium-High | 10× – 25× | Code request with context |
| Complex | 15× – 40× | Multi-part analysis, comparisons, architecture |
| Creative | 10× – 30× | Stories, essays, narrative writing |
Response window = input_tokens × mult_min to input_tokens × mult_max (but don’t exceed your model’s configured output-token limit).
Present this block before answering, using the actual estimated numbers:
Analyzing your prompt...
Input: ~[N] tokens | Type: [type] | Complexity: [level] | Language: [lang]
Choose your depth level:
[1] Essential (25%) -> ~[tokens] Direct answer only, no preamble
[2] Moderate (50%) -> ~[tokens] Answer + context + 1 example
[3] Detailed (75%) -> ~[tokens] Full answer with alternatives
[4] Exhaustive (100%) -> ~[tokens] Everything, no limits
Which level? (1-4 or say "25% depth", "50% depth", "75% depth", "100% depth")
Precision: heuristic estimate ~85-90% accuracy (±15%).
Level token estimates (within the response window):
min + (max - min) × 0.25
min + (max - min) × 0.50
min + (max - min) × 0.75
max
| Level | Target length | Include | Omit |
|---|---|---|---|
| 25% Essential | 2-4 sentences max | Direct answer, key conclusion | Context, examples, nuance, alternatives |
| 50% Moderate | 1-3 paragraphs | Answer + necessary context + 1 example | Deep analysis, edge cases, references |
| 75% Detailed | Structured response | Multiple examples, pros/cons, alternatives | Extreme edge cases, exhaustive references |
| 100% Exhaustive | No restriction | Everything — full analysis, all code, all perspectives | Nothing |
If the user already signals a level, respond at that level immediately without asking:
| What they say | Level |
|---|---|
| "1" / "25% depth" / "short version" / "brief answer" / "tldr" | 25% |
| "2" / "50% depth" / "moderate depth" / "balanced answer" | 50% |
| "3" / "75% depth" / "detailed answer" / "thorough answer" | 75% |
| "4" / "100% depth" / "exhaustive answer" / "full deep dive" | 100% |
If the user set a level earlier in the session, maintain it silently for subsequent responses unless they change it.
This skill uses heuristic estimation — no real tokenizer. Accuracy ~85-90%, variance ±15%. Always show the disclaimer.
Standalone skill from TBA — Token Budget Advisor for Claude Code. Original project also ships a Python estimator script, but this repository keeps the skill self-contained and heuristic-only.