Claude API FastAPI Example

Build a streaming Claude chatbot backend with FastAPI and the Anthropic SDK. Includes SSE streaming, async endpoints, CORS, and a minimal frontend.


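FastAPI's async-first design pairs naturally with Claude's streaming API. This guide builds a streaming chatbot backend in under 50 lines of Python. Treat it as a starting point: settings such as the CORS origin list need tightening before production use.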

Installation

pip install fastapi uvicorn anthropic

Minimal streaming endpoint

# main.py
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
import anthropic

app = FastAPI()
client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from env

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # restrict in production
    allow_methods=["POST"],
    allow_headers=["*"],
)

class ChatRequest(BaseModel):
    message: str
    system: str = "You are a helpful assistant."

async def stream_claude(message: str, system: str):
    async with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=system,
        messages=[{"role": "user", "content": message}],
    ) as stream:
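        async for text in stream.text_stream:
            # A chunk may contain newlines; SSE requires every line of the
            # payload to carry its own "data: " prefix.
            payload = "".join(f"data: {line}\n" for line in text.split("\n"))
            yield payload + "\n"
    # Sentinel event so the client knows the stream has finished
    yield "data: [DONE]\n\n"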

@app.post("/chat/stream")
async def chat_stream(req: ChatRequest):
    return StreamingResponse(
        stream_claude(req.message, req.system),
        media_type="text/event-stream"
    )
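
Text chunks from the model can contain newline characters, which have special meaning in SSE framing. One way to keep the payload unambiguous is to JSON-encode each chunk before sending it. The sketch below illustrates the idea; `sse_event` and `read_sse_event` are hypothetical helper names, not part of FastAPI or the Anthropic SDK, and the `{"text": ...}` payload shape is an assumption.

```python
import json


def sse_event(text: str) -> str:
    """Frame one text chunk as a single SSE event with a JSON payload."""
    # json.dumps escapes newlines, so the payload always fits on one line
    return f"data: {json.dumps({'text': text})}\n\n"


def read_sse_event(event: str) -> str:
    """Inverse of sse_event, as a client would apply it."""
    payload = event.removeprefix("data: ").strip()
    return json.loads(payload)["text"]


chunk = "line one\nline two"
event = sse_event(chunk)
assert event.count("\n") == 2          # only the two framing newlines remain
assert read_sse_event(event) == chunk  # the chunk round-trips exactly
```

With this approach the generator would yield `sse_event(text)` for each chunk, and the client would parse each `data:` line as JSON instead of appending it directly.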

Run locally

uvicorn main:app --reload
# POST to http://localhost:8000/chat/stream

Test with curl

curl -N -X POST http://localhost:8000/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain FastAPI in 3 bullet points."}'
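
The same stream can be consumed from Python. The sketch below covers only the parsing step, using the standard library; `iter_sse_data` is a hypothetical helper name, and the sample lines stand in for what an HTTP client would read from the response.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of lines."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event
            if data_lines:
                yield "\n".join(data_lines)
                data_lines = []
        elif line.startswith("data:"):
            value = line[5:]
            # The SSE format drops one optional space after the colon
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and comments are ignored here


sample = ["data: Hello", "", "data:  world", "", "data: [DONE]", ""]
text = ""
for payload in iter_sse_data(sample):
    if payload == "[DONE]":
        break
    text += payload
print(text)  # Hello world
```

In a real client the lines would come from a streaming HTTP response, for example the line iterator of a library such as httpx (an assumption; the original guide only uses curl and the browser).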

With conversation history

from typing import List, Literal

class Message(BaseModel):
    # Any other role is rejected with a 422 before the Claude API is called
    role: Literal["user", "assistant"]
    content: str

class ChatHistoryRequest(BaseModel):
    messages: List[Message]
    system: str = "You are a helpful assistant."

async def stream_with_history(messages: list, system: str):
    async with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=system,
        messages=[{"role": m.role, "content": m.content} for m in messages],
    ) as stream:
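        async for text in stream.text_stream:
            # A chunk may contain newlines; SSE requires every line of the
            # payload to carry its own "data: " prefix.
            payload = "".join(f"data: {line}\n" for line in text.split("\n"))
            yield payload + "\n"
    # Sentinel event so the client knows the stream has finished
    yield "data: [DONE]\n\n"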

@app.post("/chat/history/stream")
async def chat_history_stream(req: ChatHistoryRequest):
    return StreamingResponse(
        stream_with_history(req.messages, req.system),
        media_type="text/event-stream"
    )
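
Because the client sends the whole history on every request, it can help to cap how much of it is forwarded. The sketch below keeps the most recent messages and makes sure the list still opens with a user turn, which the Messages API expects. `trim_history` and the default of 20 messages are assumptions for illustration, not part of the original code.

```python
from typing import Dict, List


def trim_history(
    messages: List[Dict[str, str]], max_messages: int = 20
) -> List[Dict[str, str]]:
    """Keep the most recent messages, starting from a user turn."""
    recent = messages[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant turns so the conversation opens with the user
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "What is FastAPI?"},
    {"role": "assistant", "content": "A Python web framework."},
    {"role": "user", "content": "Does it support async?"},
]
trimmed = trim_history(history, max_messages=4)
print(len(trimmed), trimmed[0]["content"])  # 3 What is FastAPI?
```

Counting messages is a rough proxy; a token-based budget would be more precise but needs a token count for each message.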

Minimal HTML frontend (fetch SSE)

<!-- index.html -->
<textarea id="msg" rows="3" style="width:100%"></textarea>
<button onclick="send()">Send</button>
<pre id="out"></pre>
<script>
async function send() {
  const out = document.getElementById('out');
  out.textContent = '';
  const res = await fetch('http://localhost:8000/chat/stream', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({message: document.getElementById('msg').value})
  });
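  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const {done, value} = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    buffer += decoder.decode(value, {stream: true});
    // SSE events end with a blank line; keep any incomplete tail in the buffer
    const events = buffer.split('\n\n');
    buffer = events.pop();
    for (const event of events) {
      if (event === 'data: [DONE]') return;
      // Join the data lines of one event with newlines, as the SSE format defines
      const text = event.split('\n')
        .filter(line => line.startsWith('data: '))
        .map(line => line.slice(6))
        .join('\n');
      out.textContent += text;
    }
  }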
}
</script>

FastAPI vs Flask for Claude streaming

| Criterion | FastAPI | Flask |
| --- | --- | --- |
| Async/await support | Native (ASGI) | Requires flask[async] / gevent |
| Streaming response | StreamingResponse built-in | Generator + stream_with_context |
| Request validation | Pydantic (automatic) | Manual or marshmallow |
| API docs | Auto-generated /docs | Requires Flask-Restx/Flasgger |
| Concurrent requests | Excellent (uvicorn + asyncio) | Limited (threaded WSGI by default) |

Given native async support, built-in streaming responses, and automatic request validation, FastAPI is this guide's recommended choice for Claude chatbot backends in 2026. Estimate token costs for your expected traffic at the Claude API Cost Calculator. For more API patterns, see the streaming Python guide and async Python patterns.

Frequently asked questions

How do I stream Claude responses in FastAPI?
Return a `StreamingResponse` with `media_type='text/event-stream'` and an async generator that yields SSE-formatted lines from `client.messages.stream()`. FastAPI and uvicorn handle the chunked transfer encoding automatically.
Which async client should I use with FastAPI?
Use `anthropic.AsyncAnthropic()` for fully non-blocking IO. This lets FastAPI handle concurrent requests without blocking the event loop during Claude API calls — essential for production throughput.
How do I handle CORS for a FastAPI + Claude backend?
Add `CORSMiddleware` with your frontend origin in `allow_origins`. For local dev use `['http://localhost:3000']`. Avoid `allow_origins=['*']` in production: every request to the endpoint spends your server-side API key's quota, and a wildcard lets any website call it from a visitor's browser.
Can I keep conversation history in FastAPI?
For simple demos, store history in a Python dict keyed by session ID. For production, use Redis or a database. Pass the full messages list on each request. Claude's 200K context window means you rarely need to truncate, though every request is billed for the full history as input tokens.
What is the recommended FastAPI project structure for a Claude chatbot?
Keep the Anthropic client as a module-level singleton (`client = AsyncAnthropic()`). Use a separate `services/claude.py` for the streaming generator, `routers/chat.py` for the endpoints, and `main.py` for app setup and middleware.
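
The dict-based history store mentioned in the answers above can be sketched as follows. `SessionStore` and its method names are hypothetical, and an in-memory store like this loses all history on restart and is not shared between worker processes, so it only suits demos.

```python
from collections import defaultdict
from typing import Dict, List


class SessionStore:
    """In-memory conversation history keyed by session ID (demo use only)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, List[Dict[str, str]]] = defaultdict(list)

    def append(self, session_id: str, role: str, content: str) -> None:
        """Record one turn for the given session."""
        self._sessions[session_id].append({"role": role, "content": content})

    def get(self, session_id: str) -> List[Dict[str, str]]:
        """Return a copy of the history, ready to pass as `messages`."""
        return list(self._sessions.get(session_id, []))


store = SessionStore()
store.append("abc123", "user", "Hi")
store.append("abc123", "assistant", "Hello!")
print(store.get("abc123"))
# [{'role': 'user', 'content': 'Hi'}, {'role': 'assistant', 'content': 'Hello!'}]
```

An endpoint would append the user message, stream the reply while collecting its text, then append the assistant message once the stream finishes.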
