Deploy Claude API App to Google Cloud Run

Deploy a Python Flask app that calls the Claude API to Google Cloud Run. Store the Anthropic API key in Secret Manager, containerize with Docker, and deploy serverless.


Google Cloud Run is an ideal host for Claude API microservices: you pay only for request processing time, scale automatically to zero, and keep the API key secure in Secret Manager. This guide covers the complete deployment flow.

Project structure

claude-cloud-run/
├── app.py
├── requirements.txt
└── Dockerfile

Flask app (app.py)

import os
import anthropic
from flask import Flask, request, jsonify, Response, stream_with_context

app = Flask(__name__)
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

@app.route("/ask", methods=["POST"])
def ask():
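    # Parse the body as JSON regardless of Content-Type; malformed JSON yields None.
    data = request.get_json(force=True, silent=True)
    # Guard against non-object bodies (e.g. a JSON list), which have no .get().
    prompt = data.get("prompt", "") if isinstance(data, dict) else ""
    if not prompt:
        return jsonify({"error": "prompt required"}), 400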

    msg = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}]
    )
    return jsonify({"response": msg.content[0].text})

@app.route("/stream", methods=["POST"])
def stream():
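    data = request.get_json(force=True, silent=True)
    prompt = data.get("prompt", "") if isinstance(data, dict) else ""
    if not prompt:
        return jsonify({"error": "prompt required"}), 400

    def generate():
        with client.messages.stream(
            model="claude-haiku-4-5-20251001",
            max_tokens=512,
            messages=[{"role": "user", "content": prompt}]
        ) as response_stream:
            for text in response_stream.text_stream:
                # A chunk may contain newlines; SSE needs one "data:" field per line.
                yield "".join(f"data: {line}\n" for line in text.split("\n")) + "\n"
        yield "data: [DONE]\n\n"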

    return Response(
        stream_with_context(generate()),
        content_type="text/event-stream",
        headers={"X-Accel-Buffering": "no"}
    )

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
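
Text chunks from the model can contain newlines, and in the Server-Sent Events format a blank line ends an event, so each line of a chunk should travel in its own `data:` field. A minimal standalone sketch of that framing (the helper name `sse_event` is hypothetical and not part of Flask or the Anthropic SDK):

```python
def sse_event(text: str) -> str:
    """Frame one text chunk as a single SSE event.

    Each line of the chunk gets its own "data:" field; the trailing
    blank line terminates the event.
    """
    lines = text.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"


print(repr(sse_event("hello")))  # 'data: hello\n\n'
print(repr(sse_event("1\n2")))   # 'data: 1\ndata: 2\n\n'
```

A compliant SSE client joins the `data:` fields of one event with newlines, so the original chunk is recovered on the receiving side.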

requirements.txt

anthropic==0.40.0
flask==3.1.0
gunicorn==23.0.0

Dockerfile

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENV PORT=8080
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "--workers", "1", "--threads", "8", "--timeout", "120", "app:app"]

Deploy to Cloud Run

# 1. Set your project
PROJECT_ID=your-project-id
REGION=us-central1

# 2. Store the Anthropic API key in Secret Manager
echo -n "sk-ant-..." | gcloud secrets create ANTHROPIC_API_KEY \
  --data-file=- \
  --project=$PROJECT_ID

# Note: the service's runtime service account needs the Secret Manager
# Secret Accessor role (roles/secretmanager.secretAccessor) on this secret.

# 3. Build the container with Cloud Build and push it to the gcr.io registry
gcloud builds submit --tag gcr.io/$PROJECT_ID/claude-app \
  --project=$PROJECT_ID

# 4. Deploy to Cloud Run with Secret Manager binding
gcloud run deploy claude-app \
  --image gcr.io/$PROJECT_ID/claude-app \
  --platform managed \
  --region $REGION \
  --memory 256Mi \
  --cpu 1 \
  --max-instances 10 \
  --concurrency 80 \
  --timeout 120 \
  --set-secrets=ANTHROPIC_API_KEY=ANTHROPIC_API_KEY:latest \
  --allow-unauthenticated \
  --project=$PROJECT_ID
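
Because `--set-secrets` exposes the secret to the container as an ordinary environment variable, a missing or misnamed binding only shows up when the app starts. One way to make that failure easier to diagnose is a small fail-fast check; this is a sketch, and `require_env` is a hypothetical helper, not something the guide's `app.py` defines:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; check the --set-secrets binding on the service"
        )
    return value


# Possible use in app.py (assumption, shown as a comment so this block runs anywhere):
# client = anthropic.Anthropic(api_key=require_env("ANTHROPIC_API_KEY"))
```

The error message then appears in the Cloud Run logs in place of a bare `KeyError`.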

Test the deployment

# Get the service URL
SERVICE_URL=$(gcloud run services describe claude-app \
  --region=$REGION \
  --format='value(status.url)')

# Test the /ask endpoint
curl -X POST "$SERVICE_URL/ask" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain Cloud Run in one sentence."}'

# Test streaming
curl -X POST "$SERVICE_URL/stream" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Count from 1 to 5, one number per line."}' \
  --no-buffer
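
To consume the `/stream` endpoint from Python rather than curl, the client has to split the response into SSE events and stop at the `[DONE]` marker. The following is a minimal sketch of that parsing step only; `iter_sse_data` is a hypothetical helper, and the sample lines stand in for a real HTTP response:

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event, stopping at "[DONE]"."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line ends the event; multiple data fields join with "\n".
            payload = "\n".join(buffer)
            buffer = []
            if payload == "[DONE]":
                return
            yield payload


sample = ["data: Hel\n", "\n", "data: lo\n", "\n", "data: [DONE]\n", "\n"]
print("".join(iter_sse_data(sample)))  # Hello
```

In practice the lines could come from any HTTP client that reads the response body line by line and decodes it as UTF-8.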

Add authentication (production)

# Redeploy without --allow-unauthenticated (or pass --no-allow-unauthenticated) so that
# every request must carry an identity token from a principal with the Cloud Run Invoker role.
# In your CI/CD pipeline, call the endpoint with:
TOKEN=$(gcloud auth print-identity-token)
curl -X POST "$SERVICE_URL/ask" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello"}'
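
The same authenticated call can be made from Python with only the standard library. This sketch builds the request and leaves sending it to the caller; `build_ask_request` is a hypothetical helper, and the token is assumed to have been obtained elsewhere (for example from `gcloud auth print-identity-token`):

```python
import json
import urllib.request


def build_ask_request(service_url: str, token: str, prompt: str) -> urllib.request.Request:
    """Build an authenticated POST request for the /ask endpoint."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{service_url.rstrip('/')}/ask",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder URL and token for illustration; substitute real values before sending.
req = build_ask_request("https://claude-app-example.a.run.app", "TOKEN", "Hello")
print(req.get_method(), req.full_url)
```

Sending it would then be a matter of `urllib.request.urlopen(req, timeout=120)` and reading the JSON body of the response.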

Deployment comparison

| Platform | Cold start | Cost (1K req/day) | Secret management | Best for |
|---|---|---|---|---|
| Cloud Run | 1–3s | ~$0.05/day | Secret Manager native | Microservices, GCP ecosystem |
| AWS Lambda | 0.5–2s | ~$0.02/day | Parameter Store / Secrets Manager | AWS ecosystem, event-driven |
| Vercel Functions | <200ms | Free tier sufficient | Env vars in dashboard | Frontend apps, Next.js |
| Cloud Run + min-instances=1 | 0ms | ~$15/month | Secret Manager native | Latency-sensitive APIs |

For the AWS Lambda equivalent, see the AWS Lambda guide. For Vercel/Next.js deployment, see the Next.js example. Use the Claude API Cost Calculator to model your API costs before launch.

Frequently asked questions

How do I securely store the Anthropic API key in Google Cloud Run?
Use Google Secret Manager. Create the secret with `gcloud secrets create ANTHROPIC_API_KEY --data-file=-` and mount it as an environment variable in Cloud Run with `--set-secrets=ANTHROPIC_API_KEY=ANTHROPIC_API_KEY:latest`. Never pass the key as a plain `--set-env-vars` flag.
What is the minimum Cloud Run configuration for a Claude API app?
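Set `--memory=256Mi`, `--cpu=1`, `--max-instances=10`, `--concurrency=80`. Claude API calls are I/O-bound (waiting on Anthropic's servers), so a single-CPU instance can hold many requests open at once. Note that the Dockerfile above starts gunicorn with one worker and 8 threads, so each instance actively processes 8 requests at a time and the rest wait in the queue; raise `--threads` or lower `--concurrency` so the two settings agree. Increase memory only if you're loading large models or documents in-process.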
How does Cloud Run pricing compare to other Claude API deployment options?
Cloud Run charges only for request processing time (billed per 100ms, ~$0.000024/vCPU-second). An app serving 1000 Claude API requests/day (avg 2s each) costs roughly $0.05/day on Cloud Run, far cheaper than always-on compute. For comparison, AWS Lambda charges $0.0000002/request plus a duration charge, and Vercel's free tier covers light serverless use.
How do I handle Claude streaming responses on Cloud Run?
Set the Flask response content type to `text/event-stream` and send the `X-Accel-Buffering: no` header, which tells nginx-style proxies not to buffer the response. Cloud Run supports HTTP/1.1 streaming. For long-running streams, set `--timeout=300` on the service and raise gunicorn's `--timeout` to match, since both are set to 120 seconds in the configuration above.
Does Cloud Run work with the Anthropic Python SDK?
Yes. Add `anthropic` to your requirements.txt. The SDK uses standard HTTPS, which Cloud Run supports natively. No special VPC or connector configuration is needed: outbound HTTPS to api.anthropic.com works by default.
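
The daily cost figure in the pricing answer above can be reproduced with simple arithmetic. This sketch uses only the rate and billing granularity quoted there; it ignores memory charges, per-request fees, and any free tier, and it treats every request as if it ran alone, so overlapping requests on one instance would make the real CPU bill lower. The function name is hypothetical:

```python
import math

VCPU_SECOND_USD = 0.000024  # rate quoted above; check current pricing
BILLING_STEP_MS = 100       # billed per 100 ms, as stated above


def daily_cpu_cost(requests_per_day: int, avg_ms: int, vcpus: float = 1.0) -> float:
    """Estimate the daily vCPU cost in USD for sequential requests."""
    # Round each request up to the next billing step.
    billed_ms = math.ceil(avg_ms / BILLING_STEP_MS) * BILLING_STEP_MS
    return requests_per_day * (billed_ms / 1000) * vcpus * VCPU_SECOND_USD


print(f"${daily_cpu_cost(1000, 2000):.3f}/day")  # $0.048/day
```

That is 1000 × 2 s × $0.000024 = $0.048, which rounds to the ~$0.05/day quoted above.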
