Claude API with Express.js

How to call the Claude API from an Express.js backend in 2026. Covers a minimal handler, SSE streaming to the browser, session-based conversation history, and rate-limit middleware.

Express.js is the most widely used Node.js web framework. This guide shows how to wire Claude into an Express backend — from a minimal single-endpoint handler through streaming SSE, conversation history, and production-ready rate limiting.

Installation

npm install express @anthropic-ai/sdk express-session express-rate-limit

Minimal Express handler

import express from "express";
import Anthropic from "@anthropic-ai/sdk";

const app = express();
app.use(express.json());

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from env

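app.post("/api/chat", async (req, res) => {
  const { message } = req.body;
  if (typeof message !== "string" || message.trim() === "") {
    return res.status(400).json({ error: "message must be a non-empty string" });
  }

  try {
    const response = await client.messages.create({
      model: "claude-haiku-4-5-20251001",
      max_tokens: 1024,
      messages: [{ role: "user", content: message }],
    });

    res.json({ reply: response.content[0].text });
  } catch (err) {
    // without this, a failed API call leaves the request hanging in Express 4
    console.error(err);
    res.status(502).json({ error: "Upstream request failed" });
  }
});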

app.listen(3000, () => console.log("Listening on :3000"));

Streaming SSE to browser

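app.post("/api/chat/stream", async (req, res) => {
  const { message } = req.body;

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("X-Accel-Buffering", "no"); // disable Nginx buffering

  const stream = client.messages.stream({
    model: "claude-haiku-4-5-20251001",
    max_tokens: 1024,
    messages: [{ role: "user", content: message }],
  });

  // stop generating tokens if the browser disconnects mid-stream
  let closed = false;
  res.on("close", () => {
    closed = true;
    stream.abort();
  });

  try {
    for await (const event of stream) {
      if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
        // each SSE frame is "data: <payload>" followed by a blank line
        res.write(`data: ${JSON.stringify({ text: event.delta.text })}\n\n`);
      }
    }
    if (!closed) res.write("data: [DONE]\n\n");
  } catch (err) {
    // headers are already sent, so report the failure as an SSE frame
    if (!closed) res.write(`data: ${JSON.stringify({ error: "Stream failed" })}\n\n`);
  } finally {
    res.end();
  }
});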
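
The browser's built-in EventSource only issues GET requests, so a POST endpoint like the one above is usually consumed with fetch and a stream reader instead. The following is a minimal sketch of that client side, assuming the frame format written by the handler (`data: <json>` followed by a blank line, then `data: [DONE]`). The names `createSseParser` and `streamChat` are hypothetical helpers, not part of any library.

```javascript
// Incremental parser for the "data: <json>\n\n" frames written by the handler.
// createSseParser is a hypothetical helper, not a library API.
function createSseParser(onText, onDone) {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    let boundary;
    // A frame is complete once its blank-line separator has arrived.
    while ((boundary = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      if (!frame.startsWith("data: ")) continue; // skip anything that is not a data frame
      const payload = frame.slice("data: ".length);
      if (payload === "[DONE]") {
        onDone();
        continue;
      }
      const data = JSON.parse(payload);
      if (typeof data.text === "string") onText(data.text);
    }
  };
}

// Browser-side usage sketch: POST the message, then feed decoded chunks to the parser.
async function streamChat(message, onText) {
  const res = await fetch("/api/chat/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message }),
  });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  const feed = createSseParser(onText, () => {});
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    feed(decoder.decode(value, { stream: true }));
  }
}
```

Buffering until the blank-line separator matters because network chunks do not line up with frames: one frame may arrive split across two reads, or several frames may arrive in a single read.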

Conversation history with express-session

import session from "express-session";

app.use(session({
  secret: process.env.SESSION_SECRET,
  resave: false,
  saveUninitialized: false, // only persist sessions that actually store history
  cookie: { maxAge: 3600000 }, // 1h session
}));

app.post("/api/chat/history", async (req, res) => {
  const { message } = req.body;
  if (!req.session.history) req.session.history = [];

  req.session.history.push({ role: "user", content: message });

  // send at most the last 19 messages to stay within context; an odd count
  // keeps the window starting on a user message, as the Messages API expects
  const trimmed = req.session.history.slice(-19);

  try {
    const response = await client.messages.create({
      model: "claude-haiku-4-5-20251001",
      max_tokens: 1024,
      messages: trimmed,
    });

    const reply = response.content[0].text;
    req.session.history.push({ role: "assistant", content: reply });

    res.json({ reply, turns: req.session.history.length / 2 });
  } catch (err) {
    // drop the unanswered user message so roles keep alternating
    req.session.history.pop();
    res.status(502).json({ error: "Upstream request failed" });
  }
});

// DELETE /api/chat/history resets context
app.delete("/api/chat/history", (req, res) => {
  req.session.history = [];
  res.json({ ok: true });
});
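
A more general way to trim is to cap the message count and then drop leading messages until the first one has the `user` role, which the Messages API expects of the first message. The sketch below assumes history entries shaped like the ones stored in the session above; `trimHistory` is a hypothetical helper name.

```javascript
// Keep at most maxMessages recent messages, then drop leading messages until
// the window starts with a user turn. trimHistory is a hypothetical helper.
function trimHistory(history, maxMessages = 20) {
  const recent = history.slice(-maxMessages);
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

// Example: 25 alternating messages, starting and ending with a user message.
const example = Array.from({ length: 25 }, (_, i) => ({
  role: i % 2 === 0 ? "user" : "assistant",
  content: `message ${i}`,
}));
const windowed = trimHistory(example, 20);
console.log(windowed.length, windowed[0].role);
```

Because the helper looks for the first user message rather than relying on an odd or even count, it also copes with a history whose roles have drifted out of strict alternation.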

Rate-limit middleware (per-IP fixed window)

import rateLimit from "express-rate-limit";

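const chatLimiter = rateLimit({
  windowMs: 60_000,  // 1 minute
  max: 20,           // 20 requests per IP per minute
  standardHeaders: true,
  legacyHeaders: false,
  handler: (req, res) => {
    // resetTime is a Date, so convert it to seconds remaining in the window
    const resetTime = req.rateLimit?.resetTime;
    const retryAfter = resetTime
      ? Math.max(0, Math.ceil((resetTime.getTime() - Date.now()) / 1000))
      : 60; // fall back to the window length if the store gives no reset time
    res.status(429).json({
      error: "Too many requests. Please wait.",
      retryAfter,
    });
  },
});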

app.post("/api/chat", chatLimiter, async (req, res) => { /* ... */ });
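
express-rate-limit counts requests per IP within a fixed window. If per-user token-bucket behavior is wanted instead (bursts allowed up to a capacity, then a steady refill rate), it can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not a production limiter: `TokenBucketLimiter`, `tokenBucketMiddleware`, and `req.session.userId` are hypothetical names, and the clock is injectable only to make the logic easy to check.

```javascript
// In-memory token bucket keyed by user id. Each user holds up to `capacity`
// tokens, refilled continuously at `refillPerSec`. All names are hypothetical.
class TokenBucketLimiter {
  constructor({ capacity, refillPerSec, now = () => Date.now() }) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.buckets = new Map(); // userId -> { tokens, updatedAt }
  }

  // Returns true and spends one token if the user may proceed.
  tryConsume(userId) {
    const t = this.now();
    const bucket = this.buckets.get(userId) ?? { tokens: this.capacity, updatedAt: t };
    const elapsedSec = (t - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSec * this.refillPerSec);
    bucket.updatedAt = t;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(userId, bucket);
    return allowed;
  }
}

// Express middleware sketch. Assumes an auth layer has set req.session.userId;
// falls back to the client IP otherwise.
function tokenBucketMiddleware(limiter) {
  return (req, res, next) => {
    const userId = req.session?.userId ?? req.ip;
    if (limiter.tryConsume(userId)) return next();
    res.status(429).json({ error: "Too many requests. Please wait." });
  };
}
```

The map lives in one process and never evicts idle users, so a multi-instance deployment would need a shared store and some form of expiry.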

Express vs Next.js API routes vs plain Node.js http

| Approach | Best for | Streaming | Session support | Overhead |
| --- | --- | --- | --- | --- |
| Express.js | Standalone REST API, microservice | SSE or WebSocket | express-session + Redis | Minimal |
| Next.js API routes | Full-stack React app | ReadableStream (App Router) | iron-session or Auth.js | Bundled with frontend |
| Node.js http | Ultra-minimal, no dependencies | Manual | Manual | Zero |
| Fastify | High-throughput APIs | SSE plugin | @fastify/session | Lower than Express |

Estimate API costs before going to production with the Claude API Cost Calculator. For the Next.js integration pattern (App Router + Server Components), see the Next.js example.

Frequently asked questions

Why use Express.js as a proxy for the Claude API?
Browser clients cannot safely embed an Anthropic API key — it would be exposed in network requests. An Express backend keeps the key on the server, applies rate limiting per user, and adds authentication before forwarding to Claude.
How do I stream Claude responses to the browser from Express?
Set `res.setHeader('Content-Type', 'text/event-stream')` and use the Anthropic SDK's streaming API (`client.messages.stream(...)`). Pipe each `text` delta to the response with `res.write('data: ...')`. Call `res.end()` when the stream closes.
How do I maintain conversation history in an Express session?
Store the `messages` array in `req.session.history` (using express-session, with a persistent store such as connect-redis in production). Append each user message and assistant reply before the next call. Cap the history sent to the API at about 20 messages (10 turns) to avoid exceeding context limits.
What model should I use for a Claude chatbot backend?
Use `claude-haiku-4-5-20251001` for high-volume, low-latency chat endpoints: it is priced well below Sonnet and fast enough for interactive use. Use `claude-sonnet-4-6` when accuracy matters more than cost.
How do I handle Anthropic rate limit errors (429) in Express?
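Catch `Anthropic.RateLimitError` (or any `Anthropic.APIError` with `status === 429`), read the `retry-after` header from the error's `headers`, and return a 429 to the client with a `Retry-After` header. The SDK already retries 429 responses a limited number of times by default (configurable with `maxRetries`), so add your own exponential backoff only for retries beyond that, and use per-user rate limiting to avoid triggering limits.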
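
The backoff mentioned in the last answer can be sketched as follows. This is an illustration under assumptions rather than the SDK's own retry logic: `backoffDelayMs`, `withRetry`, and their option names are hypothetical, and the wrapper only assumes that a failed call throws an error with a numeric `status` and an optional `headers` value holding `retry-after` in seconds.

```javascript
// Delay before retry number `attempt` (0-based). Honors a Retry-After value in
// seconds when one is given, otherwise uses capped exponential backoff with jitter.
function backoffDelayMs(attempt, retryAfterSec, { baseMs = 500, maxMs = 30_000, random = Math.random } = {}) {
  const hinted = Number(retryAfterSec);
  if (retryAfterSec != null && Number.isFinite(hinted) && hinted >= 0) {
    return Math.min(hinted * 1000, maxMs);
  }
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling); // full jitter: uniform in [0, ceiling)
}

// Generic retry wrapper: retries only errors whose `status` is 429.
async function withRetry(fn, { retries = 3, sleep = (ms) => new Promise((r) => setTimeout(r, ms)) } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (err?.status !== 429 || attempt >= retries) throw err;
      // headers may be a Headers object or a plain record, depending on the client
      const retryAfter = err.headers?.get?.("retry-after") ?? err.headers?.["retry-after"];
      await sleep(backoffDelayMs(attempt, retryAfter));
    }
  }
}
```

Jitter spreads retries from many clients over the window so they do not all hit the API again at the same instant.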
