Anthropic API Cost Optimization Guide

v20260423

clade-cost-tuning

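This guide covers practical strategies for reducing costs when using the Anthropic Claude API: choosing a model that matches task complexity (Haiku, Sonnet, Opus), enabling prompt caching, using message batches, and trimming token counts, so that developers can build more cost-effective AI applications.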

Anthropic, Claude, cost optimization, API, large language models, prompt engineering, pricing

Anthropic Cost Tuning

Overview

Anthropic charges per token. Input tokens, output tokens, and cached tokens each have different prices. Here's how to minimize cost without losing quality.

Pricing (per million tokens)

Model	Input	Output	Cache Read Input	Batch Input	Batch Output
Claude Opus 4	$15.00	$75.00	$1.50	$7.50	$37.50
Claude Sonnet 4	$3.00	$15.00	$0.30	$1.50	$7.50
Claude Haiku 4.5	$1.00	$5.00	$0.10	$0.50	$2.50
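
The discounted columns appear to follow fixed multipliers on the base rates: cache reads at 0.1x the input price, and batch requests at 0.5x of both input and output. The following is a minimal sketch under that assumption; `RateCard`, `toCents`, and `deriveRates` are illustrative names, not part of any SDK. It rebuilds the Sonnet 4 row from its two base prices.

```typescript
// Illustrative helper (not an SDK API): derive the discounted rates from base prices.
// Assumes the multipliers implied by the table: cache read 0.1x, batch 0.5x.
interface RateCard {
  input: number;       // USD per million input tokens
  output: number;      // USD per million output tokens
  cachedInput: number; // USD per million cache-read tokens
  batchInput: number;  // USD per million input tokens via Message Batches
  batchOutput: number; // USD per million output tokens via Message Batches
}

// Round to whole cents to avoid floating-point noise such as 0.30000000000000004.
const toCents = (usd: number): number => Math.round(usd * 100) / 100;

function deriveRates(input: number, output: number): RateCard {
  return {
    input,
    output,
    cachedInput: toCents(input * 0.1),
    batchInput: toCents(input * 0.5),
    batchOutput: toCents(output * 0.5),
  };
}

const sonnetRates = deriveRates(3, 15);
console.log(sonnetRates);
```

Keeping only the base prices in configuration and deriving the rest means a price change needs one update instead of five.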

Cost Reduction Strategies

Instructions

Step 1: Right-Size Your Model

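// Don't default to Opus for every request.
// Match the model to task complexity. classify, generate, and analyze are
// placeholder helpers (not defined here) that wrap client.messages.create: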

// Simple classification/extraction → Haiku (cheapest)
const category = await classify(text, 'claude-haiku-4-5-20251001');

// General coding/writing → Sonnet (balanced)
const code = await generate(spec, 'claude-sonnet-4-20250514');

// Complex multi-step reasoning → Opus (best quality)
const analysis = await analyze(data, 'claude-opus-4-20250514');
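
One way to keep this routing in a single place is a lookup table from task type to model ID. The sketch below is only an illustration: the `TaskKind` categories and the `pickModel` helper are hypothetical, and the model IDs are the ones used in the snippet above.

```typescript
// Hypothetical task categories; adjust them to your own workload.
type TaskKind =
  | 'classification'
  | 'extraction'
  | 'coding'
  | 'writing'
  | 'multi-step-reasoning';

// Model IDs copied from the examples above.
const MODEL_BY_TASK: Record<TaskKind, string> = {
  classification: 'claude-haiku-4-5-20251001',
  extraction: 'claude-haiku-4-5-20251001',
  coding: 'claude-sonnet-4-20250514',
  writing: 'claude-sonnet-4-20250514',
  'multi-step-reasoning': 'claude-opus-4-20250514',
};

// Returns the cheapest model assumed to be adequate for the task.
function pickModel(task: TaskKind): string {
  return MODEL_BY_TASK[task];
}

console.log(pickModel('classification'));
console.log(pickModel('multi-step-reasoning'));
```

Because `Record<TaskKind, string>` must list every category, adding a new task type without assigning it a model becomes a compile-time error.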

Step 2: Prompt Caching (90% off cached input tokens)

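import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // Reads ANTHROPIC_API_KEY from the environment

// Cache your system prompt: the 1.25x write premium is recovered on the first cache hit
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: [{
    type: 'text',
    text: longSystemPrompt, // Must meet the model's minimum cacheable length (1024 tokens for Sonnet)
    cache_control: { type: 'ephemeral' }, // Cached for 5 minutes, refreshed on each hit
  }],
  messages,
}, {
  // Optional: prompt caching is generally available, so this beta header is no longer required
  headers: { 'anthropic-beta': 'prompt-caching-2024-07-31' },
});

// First call: cache_creation_input_tokens charged at 1.25x the input rate
// Subsequent calls: cache_read_input_tokens charged at 0.1x the input rate (90% savings)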
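
The break-even point follows from the two multipliers in the comments above. Here is a small sketch of the arithmetic, assuming every call after the first arrives while the cache entry is still live; `prefixCost` is an illustrative helper, and costs are in units of "one uncached pass over the prefix".

```typescript
// Relative cost of sending the same prefix `calls` times.
// Multipliers from the comments above: cache write 1.25x, cache read 0.1x.
// Assumes every call after the first is a cache hit.
function prefixCost(calls: number, cached: boolean): number {
  if (calls <= 0) return 0;
  return cached ? 1.25 + 0.1 * (calls - 1) : calls;
}

for (const n of [1, 2, 5, 10]) {
  const uncached = prefixCost(n, false);
  const withCache = prefixCost(n, true);
  console.log(
    `${n} calls: uncached ${uncached.toFixed(2)}, cached ${withCache.toFixed(2)}, saved ${(uncached - withCache).toFixed(2)}`,
  );
}
```

A single call costs more with caching (1.25 versus 1.00), but two calls already cost less (about 1.35 versus 2.00). Caching is therefore worthwhile whenever at least one repeat request is expected within the cache lifetime.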

Step 3: Message Batches (50% off input and output tokens)

// For non-urgent work: 50% cheaper, processed asynchronously with results within 24 hours
const batch = await client.messages.batches.create({
  requests: prompts.map((p, i) => ({
    custom_id: `job-${i}`,
    params: {
      model: 'claude-sonnet-4-20250514',
      max_tokens: 1024,
      messages: [{ role: 'user', content: p }],
    },
  })),
});
// Sonnet: $1.50/$7.50 per MTok instead of $3/$15
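
Batch results are not guaranteed to come back in submission order, so the `custom_id` is what ties each result to its prompt. The sketch below shows one way to realign them. It assumes the `job-${i}` scheme from the example above and uses a simplified, hypothetical result shape (`BatchResultLike`) rather than the SDK's actual result type.

```typescript
// Simplified, hypothetical result shape: only the fields this sketch needs.
interface BatchResultLike {
  custom_id: string;
  text: string;
}

// Same ID scheme as the batch example above: `job-${index}`.
const toCustomId = (index: number): string => `job-${index}`;

function fromCustomId(customId: string): number {
  const match = /^job-(\d+)$/.exec(customId);
  if (!match) throw new Error(`Unexpected custom_id: ${customId}`);
  return Number(match[1]);
}

// Re-order results so that output[i] corresponds to prompts[i].
// Prompts with no result (still processing, errored, expired) stay undefined.
function alignResults(
  promptCount: number,
  results: BatchResultLike[],
): (string | undefined)[] {
  const aligned = new Array<string | undefined>(promptCount).fill(undefined);
  for (const result of results) {
    const index = fromCustomId(result.custom_id);
    if (index < promptCount) aligned[index] = result.text;
  }
  return aligned;
}

const alignedTexts = alignResults(3, [
  { custom_id: toCustomId(2), text: 'third' },
  { custom_id: toCustomId(0), text: 'first' },
]);
console.log(alignedTexts);
```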

Step 4: Reduce Token Count

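import type { MessageParam } from '@anthropic-ai/sdk/resources/messages';

// Trim conversation history to the last N turns (the system prompt is a separate parameter)
function trimMessages(messages: MessageParam[], maxTurns = 10) {
  if (messages.length <= maxTurns * 2) return messages;
  const recent = messages.slice(-(maxTurns * 2));
  // The Messages API expects the first message to come from the user
  const firstUser = recent.findIndex((m) => m.role === 'user');
  return firstUser === -1 ? recent : recent.slice(firstUser);
}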

// Set tight max_tokens — don't pay for output you won't use
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 256, // Not 4096 if you only need a short answer
  messages,
});

// Use concise system prompts
system: 'Reply in 1-2 sentences.' // Not a 500-word personality description
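
Trimming by turn count ignores how long each turn is. An alternative is to trim against a token budget. The sketch below is a rough illustration: `SimpleMessage` is a simplified message shape, and `estimateTokens` uses an assumed heuristic of about 4 characters per token, not the real tokenizer, so treat its output as an estimate only.

```typescript
// Minimal message shape for this sketch; the SDK's MessageParam also allows content blocks.
interface SimpleMessage {
  role: 'user' | 'assistant';
  content: string;
}

// Rough heuristic (an assumption, not the real tokenizer): about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Keep the most recent messages whose estimated size fits within the budget.
function trimToBudget(messages: SimpleMessage[], maxTokens: number): SimpleMessage[] {
  const kept: SimpleMessage[] = [];
  let used = 0;
  // Walk backwards from the newest message and stop at the first one that does not fit.
  for (const message of [...messages].reverse()) {
    const size = estimateTokens(message.content);
    if (used + size > maxTokens) break;
    kept.unshift(message);
    used += size;
  }
  // Drop leading assistant messages so the history starts with a user turn.
  while (kept.length > 0 && kept[0]?.role !== 'user') kept.shift();
  return kept;
}

const conversation: SimpleMessage[] = [
  { role: 'user', content: 'a'.repeat(400) },      // ~100 tokens
  { role: 'assistant', content: 'b'.repeat(400) }, // ~100 tokens
  { role: 'user', content: 'c'.repeat(40) },       // ~10 tokens
  { role: 'assistant', content: 'd'.repeat(40) },  // ~10 tokens
  { role: 'user', content: 'e'.repeat(40) },       // ~10 tokens
];
console.log(trimToBudget(conversation, 50).length);
```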

Step 5: Monitor Usage

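// Log every call's estimated cost (Sonnet 4 rates, USD per million tokens)
function logUsage(message: Anthropic.Message) {
  const { input_tokens, output_tokens } = message.usage;
  const cacheWrite = message.usage.cache_creation_input_tokens ?? 0; // Billed at 1.25x input
  const cacheRead = message.usage.cache_read_input_tokens ?? 0; // Billed at 0.1x input
  const cost = (input_tokens * 3 + cacheWrite * 3.75 + cacheRead * 0.3 + output_tokens * 15) / 1_000_000;
  console.log(`Tokens: ${input_tokens}in/${output_tokens}out | Cost: $${cost.toFixed(4)}`);
}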
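
Per-call logging can be extended into a running total with a budget check. The sketch below is one possible shape, not a prescribed implementation: `SpendTracker` and `UsageLike` are hypothetical names, and the default rates are the Sonnet input and output prices from the pricing table.

```typescript
// Usage fields read here; mirrors the fields destructured in logUsage above.
interface UsageLike {
  input_tokens: number;
  output_tokens: number;
}

// Hypothetical helper: accumulates estimated spend and flags when a budget is exceeded.
class SpendTracker {
  private totalUsd = 0;
  private readonly budgetUsd: number;
  private readonly inputRate: number;  // USD per million input tokens
  private readonly outputRate: number; // USD per million output tokens

  constructor(budgetUsd: number, inputRate = 3, outputRate = 15) {
    this.budgetUsd = budgetUsd;
    this.inputRate = inputRate;
    this.outputRate = outputRate;
  }

  // Records one call and returns true once cumulative spend exceeds the budget.
  record(usage: UsageLike): boolean {
    this.totalUsd +=
      (usage.input_tokens * this.inputRate + usage.output_tokens * this.outputRate) / 1_000_000;
    return this.totalUsd > this.budgetUsd;
  }

  get total(): number {
    return this.totalUsd;
  }
}

const tracker = new SpendTracker(0.01); // Budget of one cent, for demonstration
const overBudget = tracker.record({ input_tokens: 1_000, output_tokens: 500 });
console.log(overBudget, tracker.total.toFixed(4));
```

An in-process tracker like this complements the spending alerts in the Anthropic console. It can react immediately, for example by switching to a cheaper model, but it only sees the calls made by the current process.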

Cost Comparison Example

Processing 10,000 documents (avg 500 input tokens each, 200-token response), which is 5M input tokens and 2M output tokens in total. Haiku rows use Claude Haiku 4.5 at $1.00/$5.00 per MTok ($0.50/$2.50 batched):

Strategy	Input Cost	Output Cost	Total
Opus, no optimization	$75.00	$150.00	$225.00
Sonnet, no optimization	$15.00	$30.00	$45.00
Sonnet + Batches	$7.50	$15.00	$22.50
Haiku + Batches	$2.50	$5.00	$7.50
Haiku + Batches + Caching	~$1.25	$5.00	~$6.25
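
The Opus and Sonnet rows can be reproduced directly from the workload size and the per-million-token rates. A short sketch of that arithmetic follows (the constant and function names are illustrative):

```typescript
const DOCS = 10_000;
const INPUT_TOKENS_PER_DOC = 500;
const OUTPUT_TOKENS_PER_DOC = 200;

// Total workload in millions of tokens: 5 MTok in, 2 MTok out.
const inputMTok = (DOCS * INPUT_TOKENS_PER_DOC) / 1_000_000;
const outputMTok = (DOCS * OUTPUT_TOKENS_PER_DOC) / 1_000_000;

// Rates are in USD per million tokens, as listed in the pricing table.
function workloadCost(inputRate: number, outputRate: number): number {
  return inputMTok * inputRate + outputMTok * outputRate;
}

const opusTotal = workloadCost(15, 75);           // Opus, no optimization
const sonnetTotal = workloadCost(3, 15);          // Sonnet, no optimization
const sonnetBatchTotal = workloadCost(1.5, 7.5);  // Sonnet + Batches
console.log({ opusTotal, sonnetTotal, sonnetBatchTotal });
```

Substituting your own document count and average token sizes gives a quick estimate before committing to a strategy.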

Output

Model selection optimized per task complexity (Haiku for simple, Sonnet for balanced, Opus for complex)
Prompt caching enabled for repeated system prompts
Batch processing configured for non-urgent workloads
Token usage logged with cost estimates per request
Spending alerts configured in Anthropic console

Error Handling

Error	Cause	Solution
API Error	Request rejected or failed; the error type and HTTP status code identify the cause	See `clade-common-errors`

Examples

See the Pricing table, the five numbered strategy sections with code, and the Cost Comparison Example table, which shows how the total for 10K documents falls as the optimizations are stacked.

Next Steps

See clade-performance-tuning for latency optimization.

Prerequisites

Completed clade-install-auth
Active API usage to optimize
Access to Anthropic console for usage monitoring

Info

Category Artificial Intelligence

Name clade-cost-tuning

Version v20260423

Size 3.58KB

Source jeremylongshore/claude-code-plugins-plus-skills

Updated 2026-04-26