Claude API Rate Limits Guide

Understand Anthropic API rate limits: RPM, TPM, and tier thresholds. How to read rate limit headers, request tier upgrades, and implement retry logic.

Understanding rate limits helps you architect your application to avoid 429 errors and optimize throughput.

Rate limit tiers (2026)

| Tier | Requirement | Sonnet 4.6 RPM | Sonnet 4.6 TPM |
|---|---|---|---|
| Free | Create account | 5 | 25,000 |
| Tier 1 | $5 credit purchase | 50 | 50,000 |
| Tier 2 | $40 total spend | 1,000 | 80,000 |
| Tier 3 | $200 total spend | 2,000 | 160,000 |
| Tier 4 | $400 total spend | 4,000 | 400,000 |

Limits vary by model. Verify current limits at docs.anthropic.com/en/api/rate-limits.
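
As a rough illustration of how the two limits interact, the sketch below computes the sustainable request rate for a given average request size: whichever of RPM or TPM runs out first sets the ceiling. The figures are copied from the table above. `TIER_LIMITS` and `max_requests_per_minute` are hypothetical names, and the calculation assumes every token in a request counts against a single TPM budget, which may not match how a given model is actually metered.

```python
# Limits copied from the Sonnet 4.6 table in this guide; verify before relying on them.
TIER_LIMITS = {
    "Free": {"rpm": 5, "tpm": 25_000},
    "Tier 1": {"rpm": 50, "tpm": 50_000},
    "Tier 2": {"rpm": 1_000, "tpm": 80_000},
    "Tier 3": {"rpm": 2_000, "tpm": 160_000},
    "Tier 4": {"rpm": 4_000, "tpm": 400_000},
}


def max_requests_per_minute(tier: str, avg_tokens_per_request: int) -> int:
    """Return the request rate at which the first of RPM or TPM is exhausted."""
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    limits = TIER_LIMITS[tier]
    # Assumption: input and output tokens share one per-minute budget.
    token_bound = limits["tpm"] // avg_tokens_per_request
    return min(limits["rpm"], token_bound)


for tier in TIER_LIMITS:
    print(tier, max_requests_per_minute(tier, avg_tokens_per_request=2_000))
```

With 2,000-token requests, the Free tier is bound by its request limit, while Tiers 1 through 4 run out of tokens well before they run out of requests.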

Read rate limit headers in Python

import anthropic

client = anthropic.Anthropic()

try:
    # with_raw_response returns the HTTP response so the headers are accessible
    response = client.messages.with_raw_response.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        messages=[{"role": "user", "content": "Hello"}]
    )
    headers = response.headers
    print("Requests remaining:", headers.get("anthropic-ratelimit-requests-remaining"))
    print("Tokens remaining:", headers.get("anthropic-ratelimit-tokens-remaining"))
    print("Request limit resets at:", headers.get("anthropic-ratelimit-requests-reset"))
    # parse() converts the raw response into the usual Message object
    message = response.parse()
    print(message.content[0].text)

except anthropic.RateLimitError as e:
    # On a 429, retry-after gives the seconds to wait; fall back to 60 if it is absent
    retry_after = e.response.headers.get("retry-after", "60")
    print(f"Rate limited. Wait {retry_after}s.")
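
The reset header is a timestamp rather than a duration, so code that wants to sleep until the window resets has to convert it first. A minimal sketch of that conversion follows, assuming the header value is an ISO 8601 timestamp; `seconds_until_reset` is a hypothetical helper, not part of the SDK.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_reset(reset_header: str, now: Optional[datetime] = None) -> float:
    """Convert a reset timestamp header into a non-negative wait in seconds."""
    # Older Python versions of fromisoformat() reject a trailing "Z", so normalize it.
    reset_at = datetime.fromisoformat(reset_header.replace("Z", "+00:00"))
    if reset_at.tzinfo is None:
        # Assumption: a timestamp without an offset is UTC.
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A reset time in the past means no wait is needed.
    return max(0.0, (reset_at - now).total_seconds())


fixed_now = datetime(2026, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(seconds_until_reset("2026-01-01T12:00:30Z", now=fixed_now))  # 30.0
```

Passing `now` explicitly keeps the function easy to test; in application code it can be left out so the current UTC time is used.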

Token-per-minute aware throttling

import time
from collections import deque

class TokenBudgetThrottler:
    def __init__(self, tpm_limit: int, window_seconds: int = 60):
        self.tpm_limit = tpm_limit
        self.window_seconds = window_seconds
        self.usage_log: deque[tuple[float, int]] = deque()

    def _purge(self, now: float) -> None:
        # Drop entries that have aged out of the sliding window
        while self.usage_log and now - self.usage_log[0][0] > self.window_seconds:
            self.usage_log.popleft()

    def record_usage(self, tokens: int) -> None:
        now = time.time()
        self.usage_log.append((now, tokens))
        self._purge(now)

    def tokens_in_window(self) -> int:
        # Purge first so stale entries never count against the budget
        self._purge(time.time())
        return sum(t for _, t in self.usage_log)

    def wait_if_needed(self, estimated_tokens: int) -> None:
        if estimated_tokens > self.tpm_limit:
            # No amount of waiting can make this request fit
            raise ValueError("estimated_tokens exceeds tpm_limit")
        while self.tokens_in_window() + estimated_tokens > self.tpm_limit:
            # The log is non-empty here: an empty log would pass the budget check
            oldest = self.usage_log[0][0]
            sleep_for = self.window_seconds - (time.time() - oldest) + 0.1
            print(f"TPM budget exceeded. Sleeping {sleep_for:.1f}s...")
            time.sleep(max(0, sleep_for))
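
`wait_if_needed` needs a token estimate before the request is sent. One rough way to produce it, sketched below, is to count prompt characters and reserve the full `max_tokens` for the response. The four-characters-per-token ratio is a common rule of thumb for English text, not Anthropic's tokenizer, and `estimate_request_tokens` is a hypothetical helper; an exact count would have to come from the API itself.

```python
import math


def estimate_request_tokens(messages: list[dict], max_tokens: int,
                            chars_per_token: float = 4.0) -> int:
    """Rough estimate: prompt characters / ratio, plus the response ceiling."""
    # Only plain-string message content is counted in this sketch.
    prompt_chars = sum(len(m["content"]) for m in messages
                       if isinstance(m.get("content"), str))
    prompt_tokens = math.ceil(prompt_chars / chars_per_token)
    # Reserve the full max_tokens, since the response length is unknown in advance.
    return prompt_tokens + max_tokens


messages = [{"role": "user", "content": "Hello"}]
print(estimate_request_tokens(messages, max_tokens=512))  # 514
```

The estimate would be passed to `wait_if_needed` before the call. After the call, the usage reported on the response (for example `message.usage.input_tokens + message.usage.output_tokens`) is the better number to pass to `record_usage`.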

See the error handling example for full retry logic. For pricing and cost estimation across tiers, see the rate limits vs tier explained page.

Frequently asked questions

What rate limit tiers does Anthropic use?
Anthropic uses 5 usage tiers (Free, Tier 1–4). Each tier has per-model RPM (requests per minute), TPM (tokens per minute), and daily token limits. Free tier is rate-limited but immediately available. Tier 1 requires a $5 credit purchase. Higher tiers require spending history.

Which headers tell me my remaining rate limit?
Anthropic returns `anthropic-ratelimit-requests-limit`, `anthropic-ratelimit-requests-remaining`, `anthropic-ratelimit-tokens-limit`, `anthropic-ratelimit-tokens-remaining`, and `anthropic-ratelimit-requests-reset` (ISO 8601 timestamp). The SDK exposes these via `response.headers` when a call is made through `with_raw_response`, and via `e.response.headers` on rate limit errors.

My app hit a 429 error. What should I do?
Read the `retry-after` header value (seconds to wait) and wait at least that long before retrying. If the header is missing, fall back to exponential backoff: wait 1s, then 2s, then 4s on successive failures. The Anthropic SDK retries 2× automatically on 429 and 529 errors by default.
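
The backoff schedule described above can be sketched as follows. This is a minimal illustration, not the SDK's own retry logic: `RetryableError`, `backoff_delay`, and `call_with_backoff` are hypothetical names, and `RetryableError` stands in for a 429 error so the example runs without network access.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical stand-in for a 429 error carrying an optional retry-after value."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before the next try: retry-after wins, otherwise 1s, 2s, 4s, ... up to cap."""
    if retry_after is not None:
        return max(0.0, float(retry_after))
    return min(cap, base * (2 ** attempt))


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RetryableError until max_attempts is reached."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError as err:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller
            sleep(backoff_delay(attempt, err.retry_after))
    raise ValueError("max_attempts must be at least 1")


# Demo: fail twice, then succeed. Delays are recorded instead of slept.
delays: list[float] = []
outcomes = iter([RetryableError(), RetryableError(retry_after=5), "ok"])


def flaky() -> str:
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


print(call_with_backoff(flaky, sleep=delays.append), delays)
```

With the real SDK, the `except` clause would catch `anthropic.RateLimitError` and read `retry-after` from `e.response.headers`, as in the header example earlier in this guide. Because the SDK already retries on its own, a wrapper like this only makes sense on top of, or instead of, that built-in behavior.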
