Claude API Pricing 2026

Complete Claude API pricing guide for 2026: all model prices, prompt caching savings, batch API discount, context window costs, and how to estimate your bill. With Python examples.

Claude API pricing is straightforward once you understand the three levers: model tier, prompt caching, and the Batch API. Here's everything you need to estimate and minimize your bill.

2026 Claude API pricing table

| Model | Input ($/M tokens) | Output ($/M tokens) | Context | Best for |
| --- | --- | --- | --- | --- |
| claude-haiku-4-5 | $0.80 | $4 | 200K | High-volume classification, short responses |
| claude-sonnet-4-6 | $3 | $15 | 200K | General production workloads |
| claude-opus-4-7 | $15 | $75 | 200K | Complex reasoning, highest quality |

Prompt caching prices

| Model | Cache write ($/M) | Cache read ($/M) | Savings vs normal input |
| --- | --- | --- | --- |
| claude-haiku-4-5 | $1.00 | $0.08 | 90% on reads |
| claude-sonnet-4-6 | $3.75 | $0.30 | 90% on reads |
| claude-opus-4-7 | $18.75 | $1.50 | 90% on reads |
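
The prices above imply that a cache write costs 1.25× the normal input price and a cache read 0.1×. The following is a minimal sketch of the break-even arithmetic, assuming the Sonnet prices listed above and assuming that the first call writes the cache and every later call hits it. The function name `prefix_cost` is illustrative and not part of any SDK.

```python
# Sonnet prices from the tables above, in dollars per million tokens.
INPUT_PRICE = 3.00
CACHE_WRITE_PRICE = 3.75
CACHE_READ_PRICE = 0.30


def prefix_cost(calls: int, prefix_tokens: int, cached: bool) -> float:
    """Cost in dollars of sending the same prompt prefix `calls` times."""
    if calls <= 0:
        return 0.0
    if not cached:
        return calls * prefix_tokens * INPUT_PRICE / 1_000_000
    # Assumption: the first call writes the cache, every later call reads it.
    write = prefix_tokens * CACHE_WRITE_PRICE / 1_000_000
    reads = (calls - 1) * prefix_tokens * CACHE_READ_PRICE / 1_000_000
    return write + reads


for calls in (1, 2, 10):
    plain = prefix_cost(calls, 2_000, cached=False)
    cached = prefix_cost(calls, 2_000, cached=True)
    print(f"{calls:>2} calls: uncached ${plain:.4f}, cached ${cached:.4f}")
```

Under these assumptions a single call costs slightly more with caching, because it pays the write premium, but the second call already puts the cached variant ahead.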

Batch API: 50% discount for async workloads

| Model | Batch input ($/M) | Batch output ($/M) |
| --- | --- | --- |
| claude-haiku-4-5 | $0.40 | $2 |
| claude-sonnet-4-6 | $1.50 | $7.50 |
| claude-opus-4-7 | $7.50 | $37.50 |
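
To see what the discount means for a concrete workload, here is a small sketch that prices the same token volume at standard and batch rates. It uses the prices from the tables above. The helper `workload_cost` and the monthly volumes are hypothetical, chosen only for illustration.

```python
# Standard prices from the pricing table above, in dollars per million tokens.
STANDARD_PRICES = {
    "claude-haiku-4-5": (0.80, 4.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-opus-4-7": (15.00, 75.00),
}
BATCH_DISCOUNT = 0.50  # the 50% discount stated above


def workload_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Dollar cost of a workload, optionally at Batch API rates."""
    input_price, output_price = STANDARD_PRICES[model]
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return cost * (1 - BATCH_DISCOUNT) if batch else cost


# Hypothetical nightly job: 50M input tokens and 5M output tokens per month.
standard = workload_cost("claude-sonnet-4-6", 50_000_000, 5_000_000)
batched = workload_cost("claude-sonnet-4-6", 50_000_000, 5_000_000, batch=True)
print(f"Standard: ${standard:.2f}  Batch: ${batched:.2f}")
```

The discount only helps for jobs that can tolerate asynchronous results, so this comparison applies to offline work such as nightly classification or bulk summarization, not to interactive requests.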

Cost estimation in Python

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this 500-word article: ..."}]
)

usage = response.usage
# Sonnet list prices: $3 per million input tokens, $15 per million output tokens
input_cost  = usage.input_tokens  * 3.00 / 1_000_000
output_cost = usage.output_tokens * 15.00 / 1_000_000
print(f"Input:  {usage.input_tokens} tokens = ${input_cost:.6f}")
print(f"Output: {usage.output_tokens} tokens = ${output_cost:.6f}")
print(f"Total:  ${input_cost + output_cost:.6f}")

Prompt caching example — production chatbot

import anthropic

client = anthropic.Anthropic()

# 2,000-token system prompt cached across all user requests
SYSTEM = "You are a senior software engineer with 15 years of experience..." + " expert knowledge " * 400

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": SYSTEM,
        "cache_control": {"type": "ephemeral"}
    }],
    messages=[{"role": "user", "content": "How do I handle database migrations in production?"}]
)

usage = response.usage
print(f"Cache read tokens: {usage.cache_read_input_tokens}")
print(f"Cache write tokens: {usage.cache_creation_input_tokens}")

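# On a cache hit, the 2,000 cached input tokens cost $0.30/M instead of $3/M:
# Savings per cache-hit call: 2,000 × (3.00 - 0.30) / 1,000,000 = $0.0054
# At 100K cache-hit calls/month: $540/month saved
# The call that writes the cache pays $3.75/M instead, about $0.0015 extra.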
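
The usage fields printed above can be turned into a dollar figure per call. The sketch below prices each token category separately at the Sonnet rates from the tables. It assumes that `input_tokens` does not include the cached tokens, which is worth confirming against the current API documentation. The `Usage` dataclass is a stand-in for `response.usage` so the example runs without an API key, and `call_cost` is a hypothetical helper.

```python
from dataclasses import dataclass

# Sonnet prices from the tables above, in dollars per million tokens.
PRICES = {"input": 3.00, "cache_write": 3.75, "cache_read": 0.30, "output": 15.00}


@dataclass
class Usage:
    """Stand-in for response.usage so the sketch runs without an API call."""
    input_tokens: int
    output_tokens: int
    cache_creation_input_tokens: int = 0
    cache_read_input_tokens: int = 0


def call_cost(usage) -> float:
    """Dollar cost of one call, pricing each token category separately.

    Assumption: input_tokens excludes cached tokens, so the three input
    categories are summed.
    """
    # The cache fields may be missing or None; treat both cases as zero.
    cache_write = getattr(usage, "cache_creation_input_tokens", 0) or 0
    cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
    total = (
        usage.input_tokens * PRICES["input"]
        + cache_write * PRICES["cache_write"]
        + cache_read * PRICES["cache_read"]
        + usage.output_tokens * PRICES["output"]
    )
    return total / 1_000_000


# Illustrative token counts: a 2,000-token cached prefix plus a short question.
first_call = Usage(input_tokens=20, output_tokens=300, cache_creation_input_tokens=2_000)
later_call = Usage(input_tokens=20, output_tokens=300, cache_read_input_tokens=2_000)
print(f"First call (cache write): ${call_cost(first_call):.6f}")
print(f"Later call (cache read):  ${call_cost(later_call):.6f}")
```

Passing a real `response.usage` object to `call_cost` should work the same way, since the function only reads the four attributes used in the example above.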

Cost comparison: models by task type

| Task | Typical tokens (in / out) | Haiku cost | Sonnet cost | Opus cost |
| --- | --- | --- | --- | --- |
| Classify a tweet | 100 / 10 | $0.00012 | $0.00045 | $0.00225 |
| Summarize article | 800 / 200 | $0.00144 | $0.0054 | $0.027 |
| Code review 500 lines | 3K / 1K | $0.0064 | $0.024 | $0.12 |
| Analyze 50-page PDF | 40K / 2K | $0.04 | $0.15 | $0.75 |
| Analyze 150K-token codebase | 150K / 5K | $0.14 | $0.525 | $2.625 |
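
Per-call figures like these follow directly from the listed prices and token counts. A minimal sketch that recomputes them, assuming standard (non-batch, uncached) pricing and using a hypothetical `task_cost` helper:

```python
# Prices from the pricing table above: (input, output) in dollars per million tokens.
PRICES = {
    "Haiku": (0.80, 4.00),
    "Sonnet": (3.00, 15.00),
    "Opus": (15.00, 75.00),
}

# (task, input tokens, output tokens) taken from the comparison table.
TASKS = [
    ("Classify a tweet", 100, 10),
    ("Summarize article", 800, 200),
    ("Code review 500 lines", 3_000, 1_000),
    ("Analyze 50-page PDF", 40_000, 2_000),
    ("Analyze 150K-token codebase", 150_000, 5_000),
]


def task_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Per-call cost in dollars at standard (non-batch, uncached) prices."""
    input_price, output_price = PRICES[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for name, tokens_in, tokens_out in TASKS:
    costs = "  ".join(
        f"{tier} ${task_cost(tier, tokens_in, tokens_out):.5f}" for tier in PRICES
    )
    print(f"{name:<28} {costs}")
```

Swapping in your own token counts in `TASKS` gives a quick way to compare tiers before committing to one.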

Cost optimization checklist

Use the Claude API Cost Calculator for interactive estimation with your specific token volumes. For Sonnet vs GPT-4o pricing comparison, see Claude Sonnet 4.6 vs GPT-4o. For cost optimization patterns, see Claude API Cost Optimization.

Frequently asked questions

How much does the Claude API cost in 2026?
As of 2026: claude-haiku-4-5 is $0.80/$4 per million input/output tokens. claude-sonnet-4-6 is $3/$15. claude-opus-4-7 is $15/$75. With prompt caching, cached input tokens cost 90% less. The Batch API cuts all prices 50% for async workloads. Always verify current rates on the Anthropic pricing page as prices change.
Does Anthropic have a free tier for the Claude API?
Anthropic does not advertise a permanent free tier for the production API. New accounts may get a small credit to test the API. For budget-conscious development, use claude-haiku-4-5 (cheapest model) and the Batch API (50% discount). Claude.ai (the web interface) has a free plan but it does not give API access.
What is prompt caching and how much does it save?
Prompt caching stores frequently used context blocks (system prompts, documents, examples) server-side. On Sonnet, requests that hit the cache pay $0.30 per million cached input tokens instead of $3, a 90% reduction. Writing a block to the cache costs $3.75 per million tokens, a 25% premium over normal input, and output tokens stay at $15 per million. Essential for production chatbots, RAG pipelines, and any app that reuses a long system prompt.
How do I estimate my Claude API bill?
Multiply your expected input tokens by the input price and your output tokens by the output price, then add the two. 1 token ≈ 0.75 words. A 1,000-word user message + 500-word system prompt ≈ 2,000 input tokens; a 500-word response ≈ 667 output tokens. With claude-sonnet-4-6: 2,000 input × $0.000003 = $0.006, plus 667 output × $0.000015 ≈ $0.010, for a total of about $0.016 per call. Use the Claude Cost Calculator for detailed estimates.
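
The estimate in the answer above can be sketched in a few lines. This assumes the 0.75 words-per-token rule of thumb, which is only an approximation for English text, and the Sonnet prices listed earlier. The helper names are illustrative.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb from the answer above; real counts vary

# Sonnet prices in dollars per million tokens.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00


def words_to_tokens(words: int) -> int:
    """Approximate token count for an English text of `words` words."""
    return round(words / WORDS_PER_TOKEN)


def estimate_call(input_words: int, output_words: int) -> float:
    """Rough dollar cost of one call from word counts alone."""
    input_tokens = words_to_tokens(input_words)
    output_tokens = words_to_tokens(output_words)
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


# 1,000-word message + 500-word system prompt in, 500-word response out.
per_call = estimate_call(input_words=1_500, output_words=500)
print(f"Per call: ${per_call:.4f}")
print(f"Per 10,000 calls: ${per_call * 10_000:.2f}")
```

For a real bill, the `usage` fields returned by the API are more reliable than word counts, since tokenization varies with language and content.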
Is there a Claude API trial or sandbox?
New Anthropic accounts may receive a small API credit for testing. The API itself has no separate sandbox: you call the production endpoint with a regular API key. For interactive testing, use the Workbench in the Anthropic Console or the Claude.ai web interface, then move to the API for integration.
