Claude API Flask Example

Build a Claude chatbot backend with Flask in Python. Includes streaming with SSE using stream_with_context, conversation history endpoint, and CORS setup.

Flask is one of the most widely deployed Python web frameworks. This guide shows how to build a Claude chatbot backend with Flask, from a minimal synchronous call to full server-sent event (SSE) streaming.

Installation

Minimal synchronous endpoint

Streaming with SSE (server-sent events)

Consume the SSE stream in JavaScript

Conversation history endpoint

CORS setup for a separate frontend

Flask vs FastAPI for Claude streaming

| Criterion | Flask | FastAPI |
| --- | --- | --- |
| Streaming | `stream_with_context` + generator | `StreamingResponse` built-in |
| Async support | Requires `flask[async]` / gevent | Native async/await (ASGI) |
| Request validation | Manual or marshmallow | Pydantic (automatic) |
| Ecosystem fit | Flask-Login, Flask-SQLAlchemy, Flask-Admin | Pydantic models, SQLModel |
| Deployment | Gunicorn (WSGI) | Uvicorn (ASGI) |

Frequently asked questions

Can Flask handle Claude streaming responses?

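Yes. Wrap a generator in Flask's `stream_with_context` and return it in a `Response` with the `text/event-stream` MIME type. The generator opens `client.messages.stream()` and yields each text delta as an SSE-formatted string (`data: ...` followed by a blank line). Flask's development server runs with `threaded=True` by default, so concurrent streaming requests don't block each other; under Gunicorn, use threaded or gevent workers to get the same behavior.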

Flask vs FastAPI for Claude chatbot backends — which should I choose?

FastAPI is often the better fit for new projects: it offers native async/await, automatic request validation via Pydantic, and built-in streaming via `StreamingResponse`. Choose Flask if your existing stack is WSGI-based, you need Flask-Login or Flask-SQLAlchemy integrations, or your team knows Flask deeply. Both work well for Claude streaming.

How do I add CORS to a Flask Claude chatbot?

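Install flask-cors (`pip install flask-cors`) and call `CORS(app)` after creating your Flask app. With no arguments, every origin is allowed, which is convenient for local development. For production, restrict origins with `CORS(app, origins=['https://yourfrontend.com'])` so that browsers only let your own frontend read responses from the API.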

How do I keep conversation history in a Flask Claude backend?

Maintain a list of `{role, content}` dicts per session, alternating `user` and `assistant` turns. The simplest approach is Flask-Session with a server-side store (Redis or filesystem). Pass the full history list as the `messages` parameter on each call. Claude's 200K-token context window holds roughly 150,000 words of history before you need to summarize or drop older turns.

What's the minimal Flask setup to call the Claude API?

Install `flask` and `anthropic`. Create a POST route, call `client.messages.create(...)` with `model`, `max_tokens`, and `messages`, and return `response.content[0].text`. No async or streaming is required, and the whole endpoint fits in a handful of lines of Python. Add streaming later with `stream_with_context` when response latency matters.