Claude API in AWS Lambda (Python)

Deploy Claude API calls in AWS Lambda with Python. Working handler pattern, streaming response, timeout configuration, environment variable setup, and API Gateway wiring.


Deploying Claude API calls in AWS Lambda is straightforward but has three gotchas: timeout (default 3s is too short), cold start (import time), and packaging the anthropic library. This guide covers all three.

Lambda handler (minimal)

import anthropic
import json
import os

# Module-level client: reused across warm invocations (avoids re-init on each call)
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def handler(event, context):
    # API Gateway sends "body": null for empty requests, so fall back to "{}"
    body = json.loads(event.get("body") or "{}")
    user_message = body.get("message", "Hello!")

    response = client.messages.create(
        model="claude-haiku-4-5-20251001",  # fast model reduces Lambda duration cost
        max_tokens=1024,
        messages=[{"role": "user", "content": user_message}]
    )

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"reply": response.content[0].text})
    }
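The minimal handler assumes the request body is a plain JSON string. In practice, API Gateway proxy and Function URL events can carry a null body, a base64-encoded body (flagged by `isBase64Encoded`), or invalid JSON. The sketch below shows one way to handle those cases; `parse_message` and `bad_request` are hypothetical helper names, not part of any SDK.

```python
import base64
import json


def parse_message(event: dict, default: str = "Hello!") -> str:
    """Extract the "message" field from an API Gateway proxy or Function URL event.

    Raises ValueError when the body cannot be decoded or is not a JSON object.
    """
    raw = event.get("body")
    if not raw:
        return default  # body is None or "" when the request has no payload
    if event.get("isBase64Encoded"):
        # binascii.Error and UnicodeDecodeError are both ValueError subclasses
        raw = base64.b64decode(raw).decode("utf-8")
    try:
        body = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"body is not valid JSON: {exc}") from exc
    if not isinstance(body, dict):
        raise ValueError("body must be a JSON object")
    return str(body.get("message", default))


def bad_request(reason: str) -> dict:
    """Build a 400 response in the same shape the handler returns on success."""
    return {
        "statusCode": 400,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"error": reason}),
    }
```

Inside the handler, wrapping `parse_message(event)` in `try/except ValueError` and returning `bad_request(str(exc))` avoids spending a Claude call on a malformed request.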

Package as Lambda Layer

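# On your local machine: request Linux wheels that match the Lambda runtime.
# anthropic pulls in compiled dependencies (e.g. pydantic-core), so wheels built
# for macOS or Windows will fail to import inside Lambda.
mkdir -p python
pip install anthropic -t python/ \
  --platform manylinux2014_x86_64 \
  --implementation cp \
  --python-version 3.12 \
  --only-binary=:all:
zip -r anthropic-layer.zip python/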

# Upload via AWS CLI
aws lambda publish-layer-version \
  --layer-name anthropic-sdk \
  --zip-file fileb://anthropic-layer.zip \
  --compatible-runtimes python3.12

# Attach to your function
aws lambda update-function-configuration \
  --function-name my-claude-function \
  --layers arn:aws:lambda:us-east-1:123456789012:layer:anthropic-sdk:1
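Before uploading, it can help to sanity-check the archive. The sketch below uses only the standard library to report the unzipped size (Lambda caps a function plus all its layers at 250 MB unzipped) and to flag entries outside the `python/` directory, which is where a Python layer's packages need to live. `inspect_layer` is a hypothetical helper name.

```python
import zipfile

# Lambda's cap on the unzipped size of a function plus all attached layers.
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def inspect_layer(zip_path: str) -> dict:
    """Report the unzipped size of a layer archive and any entries outside python/."""
    with zipfile.ZipFile(zip_path) as archive:
        infos = archive.infolist()
        total = sum(info.file_size for info in infos)  # uncompressed sizes
        misplaced = [i.filename for i in infos if not i.filename.startswith("python/")]
    return {
        "unzipped_bytes": total,
        "within_limit": total <= MAX_UNZIPPED_BYTES,
        "misplaced": misplaced,
    }
```

For example, `inspect_layer("anthropic-layer.zip")` should return an empty `misplaced` list for an archive built with the commands above. The size check covers only this one layer, so leave headroom for your function code and any other layers.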

Environment variable setup

# Set via CLI (never hardcode in source)
aws lambda update-function-configuration \
  --function-name my-claude-function \
  --timeout 30 \
  --memory-size 256 \
  --environment Variables="{ANTHROPIC_API_KEY=sk-ant-...}"
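The `--timeout 30` setting caps the whole invocation, not just the Claude request. One option is to derive a per-request timeout from the time Lambda reports as remaining, so the handler can still return a clean error instead of being killed mid-call. This is a minimal sketch: `request_timeout_seconds` and its default margins are illustrative assumptions, while `context.get_remaining_time_in_millis()` is the standard Lambda context method and `timeout=` is the anthropic SDK's per-request option.

```python
def request_timeout_seconds(remaining_ms: int, safety_margin_s: float = 2.0,
                            floor_s: float = 1.0) -> float:
    """Seconds to allow the Claude request, leaving a margin to build the response."""
    budget = remaining_ms / 1000.0 - safety_margin_s
    return max(budget, floor_s)


# Intended use inside the handler (not executed here):
#
#   timeout = request_timeout_seconds(context.get_remaining_time_in_millis())
#   response = client.messages.create(
#       model="claude-haiku-4-5-20251001",
#       max_tokens=1024,
#       messages=[{"role": "user", "content": user_message}],
#       timeout=timeout,  # per-request timeout in seconds
#   )
```

With a 30 s function timeout and a warm start, this gives the request roughly 28 s. The SDK also retries some failures by default, so a retried request can exceed a single timeout window unless retries are reduced.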

# Or via Secrets Manager (production recommended)
import boto3, json, os

_secret_cache = None

def get_api_key() -> str:
    global _secret_cache
    if _secret_cache is None:
        sm = boto3.client("secretsmanager", region_name="us-east-1")
        secret = sm.get_secret_value(SecretId=os.environ["SECRET_ARN"])
        _secret_cache = json.loads(secret["SecretString"])["anthropic_api_key"]
    return _secret_cache
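The function above assumes the secret is a JSON document with an `anthropic_api_key` field. Secrets Manager also allows plain-text secrets, in which case `SecretString` is the key itself. A small parser that accepts both shapes might look like the sketch below; `extract_api_key` is a hypothetical helper, and the field name is taken from the example above.

```python
import json


def extract_api_key(secret_string: str, field: str = "anthropic_api_key") -> str:
    """Return the API key from a secret stored either as JSON or as a bare string."""
    text = secret_string.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return text  # plain-text secret: the whole value is the key
    if isinstance(parsed, dict):
        return parsed[field]  # a KeyError here means the field name is wrong
    if isinstance(parsed, str):
        return parsed  # secret was stored as a quoted JSON string
    return text
```

In `get_api_key()`, the assignment would then become `_secret_cache = extract_api_key(secret["SecretString"])`.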

Streaming via Lambda Function URL

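# Requires: Lambda Function URL with RESPONSE_STREAM invoke mode
# Caveat: native response streaming is documented for the Node.js managed runtimes.
# The Python context object has no documented streaming method, so from Python
# use a custom runtime or the AWS Lambda Web Adapter to forward chunks as they arrive.

import anthropic
import os
import json

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def handler(event, context):
    # Function URLs can stream; a standard API Gateway REST integration buffers the response
    body = json.loads(event.get("body") or "{}")
    user_message = body.get("message", "Hello!")

    def generate():
        # Yields text deltas as Claude produces them
        with client.messages.stream(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            messages=[{"role": "user", "content": user_message}]
        ) as stream:
            for text in stream.text_stream:
                yield text

    # Buffered fallback for the managed Python runtime: join the deltas into one body.
    # Behind the Web Adapter, pass generate() to your web framework's streaming response instead.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/plain"},
        "body": "".join(generate())
    }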

Dockerfile for container deployment (10 GB image limit instead of 250 MB unzipped)

FROM public.ecr.aws/lambda/python:3.12

COPY requirements.txt ./
RUN pip install -r requirements.txt

COPY handler.py ./

CMD ["handler.handler"]

# requirements.txt:
# anthropic>=0.40
# boto3

Lambda configuration quick reference

| Setting | Recommended value | Why |
| --- | --- | --- |
| Timeout | 30 s (non-streaming) / 120 s (streaming) | Claude responses take 5-25 s; the 3 s default almost always times out |
| Memory | 256 MB | Fits the anthropic SDK and boto3; Lambda allocates CPU in proportion to memory |
| Runtime | python3.12 | Fastest cold start among Lambda Python runtimes in 2026 |
| Concurrency | Reserved ≈ Anthropic rate limit (requests/min) × avg duration (s) / 60 | Prevents Lambda auto-scaling from exceeding API rate limits |
| API key storage | Secrets Manager (prod), env var (dev) | Env vars are visible in the console; Secrets Manager access is audited |
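For the concurrency row, a rough estimate follows from Little's law: the number of requests in flight equals the arrival rate times the average duration. The sketch below turns a requests-per-minute limit into a reserved-concurrency value. `reserved_concurrency` is a hypothetical helper, and the 50 requests/min figure in the usage note is an illustrative number, not a real tier limit.

```python
import math


def reserved_concurrency(requests_per_minute: int, avg_duration_s: float) -> int:
    """Largest concurrency that keeps steady-state throughput at or under the RPM limit.

    N concurrent executions that each take D seconds complete 60 * N / D requests
    per minute, so N must satisfy N <= RPM * D / 60.
    """
    in_flight = requests_per_minute * avg_duration_s / 60.0
    return max(1, math.floor(in_flight))  # a reserved value of 0 would block the function
```

With a limit of 50 requests/min and a 12 s average duration, `reserved_concurrency(50, 12)` gives 10. This covers request-rate limits only; token-per-minute limits may bind first for long prompts or outputs.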

Estimate how Claude API costs scale with Lambda invocations using the Claude API Cost Calculator. For the FastAPI alternative (long-running server instead of serverless), see the FastAPI guide. For error handling patterns (retries, 429s), see the error handling guide.

Frequently asked questions

What Lambda timeout should I set for Claude API calls?
Set at least 30 seconds: medium-length responses typically take 10-20 s. For streaming or extended thinking, set 60-120 s. The default Lambda timeout is 3 seconds, which is almost always too short. Update it in the Lambda console under Configuration → General configuration, or with `--timeout` on `aws lambda update-function-configuration`.
How do I store the Anthropic API key in Lambda?
Add it as a Lambda environment variable (`ANTHROPIC_API_KEY`) and restrict access via IAM. For production, store it in AWS Secrets Manager and retrieve it at cold start: `boto3.client('secretsmanager').get_secret_value(SecretId='prod/anthropic-key')['SecretString']`. That expression returns the raw secret string; if the secret is stored as JSON, as in the Secrets Manager example in this guide, parse it and read the key field. Cache the value in a module-level variable so warm invocations skip the Secrets Manager call.
Can I stream Claude's response through API Gateway?
Not through a standard API Gateway REST API, which buffers the full response. Use a Lambda Function URL with `RESPONSE_STREAM` invoke mode instead. Native response streaming is documented for the Node.js managed runtimes; from Python, the usual routes are a custom runtime or the AWS Lambda Web Adapter in front of a web framework.
How do I install the anthropic package in Lambda?
Create a Lambda layer: `pip install anthropic -t python/`, zip it as `anthropic-layer.zip`, upload via console or CLI, and attach the layer to your function. Alternatively, use a Docker container image and install dependencies in the Dockerfile: container images can be up to 10 GB, whereas a zip-packaged function plus its layers must stay under 250 MB unzipped.
What IAM permissions does the Lambda function need?
The Lambda execution role needs no special AWS permissions to call the Anthropic API (it's an external HTTPS endpoint). It only needs the default `AWSLambdaBasicExecutionRole` for CloudWatch Logs. If you store the key in Secrets Manager, add `secretsmanager:GetSecretValue` for the specific secret ARN.
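As an illustration of the Secrets Manager permission described above, the sketch below builds a least-privilege policy document for a single secret. `secret_read_policy` is a hypothetical helper, and the ARN in the usage note is a placeholder.

```python
import json


def secret_read_policy(secret_arn: str) -> dict:
    """IAM policy document allowing reads of one specific secret and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": secret_arn,
            }
        ],
    }


def policy_json(secret_arn: str) -> str:
    """Serialize the policy for pasting into the console or passing to the CLI."""
    return json.dumps(secret_read_policy(secret_arn), indent=2)
```

Calling `policy_json("arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/anthropic-key")` produces a document that can be attached to the execution role as an inline policy. Scoping `Resource` to one ARN, rather than `*`, keeps the function from reading unrelated secrets.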
