Claude API WebSocket Streaming Python: Real-Time Chat (2026)

Build real-time Claude API streaming over WebSocket in Python (2026). Uses FastAPI WebSockets + AsyncAnthropic to push tokens to the browser as they arrive.

The Anthropic Python SDK's streaming API returns tokens as they are generated, making it ideal for real-time chat UIs. Pairing it with FastAPI WebSockets lets you push each token to the browser the instant it arrives — no polling needed.

Installation

Backend: FastAPI WebSocket + AsyncAnthropic

Start the server

Browser client (vanilla JS)

Multi-turn conversation history

Cancellable streaming

SSE vs WebSocket comparison

Feature	Server-Sent Events (SSE)	WebSocket
Direction	Server → client only	Bidirectional
Cancel mid-stream	Client closes connection	Send cancel message
Browser support	All modern browsers	All modern browsers
Proxy / CDN	Works with most CDNs	Requires WS-aware proxy
Best for	Static display, summaries	Interactive chat, multi-turn

For per-request cost calculations using the token counts returned in the done event, use the Claude API Cost Calculator. For the pure HTTP streaming pattern without WebSockets, see the streaming guide.

Frequently asked questions

Can I stream Claude API responses over WebSocket?

Yes. Use `AsyncAnthropic` and iterate over `client.messages.stream()` — each `text_delta` event is one or a few tokens. Push each delta to the WebSocket client with `await websocket.send_text(delta)`. The browser receives tokens as fast as Claude generates them.

What is the difference between SSE and WebSocket for Claude streaming?

Server-Sent Events (SSE) is one-directional (server → browser) and simpler to set up. WebSocket is bidirectional, so the client can interrupt the stream mid-generation. For a chat UI where users can cancel, WebSocket wins. For a static display, SSE is simpler.

How do I cancel a Claude stream mid-generation over WebSocket?

Listen for a 'cancel' message from the client. In the stream loop, check a cancellation flag and `break` out of the iteration — the Anthropic SDK will close the HTTP stream cleanly. Then send a final `{type:'done',cancelled:true}` message.

How do I handle WebSocket disconnects while Claude is streaming?

Wrap the stream loop in a try/except for `websockets.exceptions.ConnectionClosed` (or FastAPI's `WebSocketDisconnect`). On disconnect, break the loop — the SDK will automatically clean up the open HTTP connection to Anthropic.

Does AsyncAnthropic work with FastAPI WebSocket endpoints?

Yes. FastAPI's `@app.websocket` handler is an async function, and `AsyncAnthropic` is fully async-compatible. Use `async with client.messages.stream(...) as stream:` and `async for text in stream.text_stream:` to push tokens.

Claude API WebSocket Streaming in Python