How do I build a Discord bot with Claude?

Install discord.py and anthropic, create a commands.Bot with the message_content intent enabled, and call the Anthropic API from a command handler. For slash commands, call interaction.response.defer() first so the interaction does not expire at Discord's 3-second deadline.

Does discord.py support slash commands with Claude?

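Yes. A commands.Bot already owns a discord.app_commands.CommandTree, available as bot.tree (a bare discord.Client needs one created explicitly). Define @tree.command functions and call await tree.sync() once the bot is ready. Call interaction.response.defer() first, then interaction.followup.send(), to cover Claude's response time.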

How do I add conversation memory to a Claude Discord bot?

Maintain a per-channel history dict mapping channel_id to a list of {role, content} message dicts. Append each user message and assistant reply to the history, trim to the last 20 turns, and pass the full history to Claude on each call.

How do I stream Claude responses in Discord?

Discord does not support true streaming. Instead, post a placeholder message, stream Claude's output internally, then call message.edit() with the complete reply. For responses over 2000 characters, split into chunks and send each separately.

What model should I use for a Discord bot?

Use claude-haiku-4-5-20251001 for fast, low-cost responses in casual servers; it is the fastest and cheapest of the three tiers, and current per-token prices are listed on Anthropic's pricing page. Use claude-sonnet-4-6 for complex reasoning tasks. Reserve Opus for premium tiers where latency and cost are secondary.

Step-by-step guide to building a Claude-powered Discord bot in Python using discord.py. Covers slash commands, thread-aware conversation history, streaming, and rate limiting.

Claude Discord Bot Python

Build a Discord bot powered by Claude using discord.py and the Anthropic SDK. This guide covers slash commands, per-channel conversation memory, and streaming replies.

Minimal Claude Discord bot

pip install discord.py anthropic python-dotenv

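import asyncio
import os

import anthropic
import discord
from discord.ext import commands
from dotenv import load_dotenv

load_dotenv()
client_ai = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

# message_content is a privileged intent: enable it in the Discord Developer Portal too
intents = discord.Intents.default()
intents.message_content = True
bot = commands.Bot(command_prefix="!", intents=intents)

@bot.event
async def on_ready():
    print(f"Logged in as {bot.user}")

@bot.command(name="ask")
async def ask_claude(ctx, *, question: str):
    """!ask <question>: ask Claude anything"""
    async with ctx.typing():
        # The SDK call blocks, so run it in a worker thread to keep the event loop responsive
        response = await asyncio.to_thread(
            client_ai.messages.create,
            model="claude-haiku-4-5-20251001",
            max_tokens=1024,
            messages=[{"role": "user", "content": question}],
        )
    # Discord rejects messages longer than 2000 characters, so truncate here
    await ctx.send(response.content[0].text[:2000])

# Keep this as the last line of the file: bot.run() blocks until the bot shuts down
bot.run(os.getenv("DISCORD_BOT_TOKEN"))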
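
Why the blocking call matters: discord.py runs every handler on a single asyncio event loop, so a synchronous network call inside an async handler pauses everything else until it returns. The sketch below illustrates the effect with the standard library only. slow_api_call and heartbeat are hypothetical stand-ins (not part of discord.py or the Anthropic SDK) for a blocking SDK call and for the background work the loop must keep doing.

```python
import asyncio
import time


def slow_api_call(prompt: str) -> str:
    """Hypothetical stand-in for a blocking SDK call."""
    time.sleep(0.3)
    return f"echo: {prompt}"


async def heartbeat(ticks: list[int]) -> None:
    """Stand-in for background work the event loop must keep servicing."""
    while True:
        ticks.append(1)
        await asyncio.sleep(0.05)


async def handler(prompt: str) -> tuple[str, int]:
    ticks: list[int] = []
    beat = asyncio.create_task(heartbeat(ticks))
    # to_thread runs the blocking function in a worker thread and awaits its result,
    # so the heartbeat task keeps running in the meantime
    reply = await asyncio.to_thread(slow_api_call, prompt)
    beat.cancel()
    return reply, len(ticks)


reply, ticks = asyncio.run(handler("hello"))
print(reply, "| heartbeat ticks during the call:", ticks)
```

Calling slow_api_call(prompt) directly inside handler would leave the tick count at zero, because the loop never gets a chance to run the heartbeat. The SDK's anthropic.AsyncAnthropic client is the other common way to avoid blocking.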

Slash commands with discord.py app_commands

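import asyncio
from discord import app_commands

# commands.Bot already owns a CommandTree; constructing a second one raises ClientException
tree = bot.tree

@tree.command(name="claude", description="Ask Claude a question")
@app_commands.describe(question="What to ask Claude")
async def claude_slash(interaction: discord.Interaction, question: str):
    await interaction.response.defer()           # acknowledge within Discord's 3-second window
    # Run the blocking SDK call in a worker thread so the event loop stays responsive
    response = await asyncio.to_thread(
        client_ai.messages.create,
        model="claude-haiku-4-5-20251001",
        max_tokens=1024,
        messages=[{"role": "user", "content": question}],
    )
    await interaction.followup.send(response.content[0].text[:2000])  # 2000-char cap

@bot.event
async def on_ready():
    # This replaces any on_ready registered earlier; only one handler per event name is kept
    await tree.sync()                            # register slash commands globally
    print(f"Logged in as {bot.user}; slash commands synced")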

Thread-aware conversation history per channel

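import asyncio
from collections import defaultdict

# In-memory history keyed by channel ID. A Discord thread has its own channel ID,
# so each thread gets a separate conversation. History is lost on restart.
histories: dict[int, list[dict]] = defaultdict(list)
MAX_TURNS = 20  # one turn = one user message plus one assistant reply

@bot.command(name="chat")
async def chat(ctx, *, message: str):
    history = histories[ctx.channel.id]

    history.append({"role": "user", "content": message})
    # Keep an odd number of messages so the window still opens with a user message;
    # the Messages API expects the first message to come from the user.
    history[:] = history[-(MAX_TURNS * 2 - 1):]

    async with ctx.typing():
        try:
            # Run the blocking SDK call in a worker thread
            response = await asyncio.to_thread(
                client_ai.messages.create,
                model="claude-sonnet-4-6",
                max_tokens=1024,
                system="You are a helpful Discord assistant. Be concise and friendly.",
                messages=history,
            )
        except anthropic.APIError:
            history.pop()  # drop the unanswered user message so roles keep alternating
            await ctx.send("Claude is unavailable right now. Please try again.")
            return

    reply = response.content[0].text
    history.append({"role": "assistant", "content": reply})
    await ctx.send(reply[:2000])  # Discord's 2000-character cap

@bot.command(name="reset")
async def reset(ctx):
    histories[ctx.channel.id].clear()
    await ctx.send("Conversation history cleared.")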
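
The trimming step can also be factored into a pure function, which makes it easy to unit test away from Discord. One detail worth guarding: slicing a history that ends with a fresh user message can leave an assistant message at the front of the window. The helper below is a sketch, and trim_history is a name chosen for illustration. It drops any leading non-user messages after slicing.

```python
def trim_history(history: list[dict], max_turns: int) -> list[dict]:
    """Keep at most max_turns * 2 messages and make sure the window opens with a user message."""
    if max_turns <= 0:
        return []
    trimmed = history[-max_turns * 2:]
    # Drop leading assistant messages left over from cutting a turn in half
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed


# Two full turns plus a new, unanswered user message
example = [
    {"role": "user", "content": "u0"},
    {"role": "assistant", "content": "a0"},
    {"role": "user", "content": "u1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u2"},
]
window = trim_history(example, max_turns=2)
print([m["content"] for m in window])
```

With max_turns=2 the slice keeps the last four messages, which would start with a0; the guard removes it so the window begins at u1.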

Streaming responses via message edit

import asyncio

def collect_stream(question: str) -> str:
    """Blocking helper: consume Claude's stream and return the full text."""
    with client_ai.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": question}],
    ) as stream:
        return "".join(stream.text_stream)

@bot.command(name="stream")
async def stream_reply(ctx, *, question: str):
    """Post a placeholder, stream internally, then edit in the full reply."""
    placeholder = await ctx.send("_Thinking…_")
    # Run the blocking stream in a worker thread so the event loop stays free
    full_reply = await asyncio.to_thread(collect_stream, question)
    # Discord messages are capped at 2000 chars; chunk if needed
    chunks = [full_reply[i:i + 2000] for i in range(0, len(full_reply), 2000)]
    if not chunks:
        chunks = ["(empty reply)"]  # Discord rejects an empty message body
    await placeholder.edit(content=chunks[0])
    for chunk in chunks[1:]:
        await ctx.send(chunk)
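
Cutting at fixed 2000-character offsets can split a word or a line in half. A slightly gentler variant, sketched below, prefers the last newline inside each window and falls back to a hard cut when there is none. split_message is a hypothetical helper name, and the 2000 default mirrors the cap used above.

```python
def split_message(text: str, limit: int = 2000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring newline boundaries."""
    chunks: list[str] = []
    while len(text) > limit:
        # Prefer the last newline inside the window; fall back to a hard cut
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut])
        # Skip the newline(s) we split on so the next chunk does not start blank
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks


two_paragraphs = "a" * 1500 + "\n" + "b" * 1500
print([len(c) for c in split_message(two_paragraphs)])
```

In a handler, the first chunk would go to placeholder.edit() and the rest to ctx.send(), one call per chunk. This sketch does not try to keep Markdown code fences intact across chunks.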

Rate-limit guard (per-user cooldown)

import asyncio
import math
import time

user_last_call: dict[int, float] = {}
COOLDOWN_SECONDS = 10

@bot.command(name="safe_ask")
async def safe_ask(ctx, *, question: str):
    now = time.monotonic()  # monotonic clock: unaffected by system clock changes
    last = user_last_call.get(ctx.author.id)
    if last is not None and now - last < COOLDOWN_SECONDS:
        # Round up so the message never says "wait 0s"
        remaining = math.ceil(COOLDOWN_SECONDS - (now - last))
        await ctx.send(f"Please wait {remaining}s before asking again.")
        return
    user_last_call[ctx.author.id] = now

    # Run the blocking SDK call in a worker thread
    response = await asyncio.to_thread(
        client_ai.messages.create,
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{"role": "user", "content": question}],
    )
    await ctx.send(response.content[0].text[:2000])  # Discord's 2000-character cap
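
The cooldown bookkeeping can be pulled out of the handler so it is reusable across commands and testable without waiting in real time. The class below is a sketch under that assumption. UserCooldown and its clock parameter are names invented for this example; the clock is injectable so a test can supply fake timestamps.

```python
import time
from typing import Callable


class UserCooldown:
    """Track the last accepted call per user and report how long they must still wait."""

    def __init__(self, seconds: float, clock: Callable[[], float] = time.monotonic):
        self.seconds = seconds
        self.clock = clock
        self._last: dict[int, float] = {}

    def retry_after(self, user_id: int) -> float:
        """Return 0.0 and record the call if the user may proceed, else the seconds left."""
        now = self.clock()
        last = self._last.get(user_id)
        if last is not None and now - last < self.seconds:
            return self.seconds - (now - last)
        self._last[user_id] = now
        return 0.0


# Fake clock for demonstration: a one-element list we can advance by hand
fake_now = [100.0]
cooldown = UserCooldown(10, clock=lambda: fake_now[0])
first = cooldown.retry_after(1)      # accepted, call is recorded
fake_now[0] = 104.0
second = cooldown.retry_after(1)     # still cooling down
print(first, second)
```

Inside a command, a non-zero return value would be reported to the user and the handler would return early. discord.py also ships a built-in commands.cooldown decorator that covers the common per-user case.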

Claude Discord bot vs alternatives

Approach	Customizability	Setup time	Cost
Claude + discord.py (this guide)	Full: any system prompt, tools	~30 min	Claude API usage only
OpenAI + discord.py	Full: same pattern	~30 min	OpenAI API usage only
Midjourney-style SaaS bot	None: fixed features	Minutes	Monthly subscription
No-code (Zapier/Make)	Low: template-based	~15 min	Zapier or Make plan + Claude API usage

Key non-obvious patterns: (1) Call defer() as the first step of every slash command handler; Discord's 3-second acknowledgement deadline is too tight for an API round trip. (2) Key conversation history by channel rather than by user so group conversations stay coherent. (3) Chunk long Claude responses to fit Discord's 2000-character message cap. For cost estimates before deploying, use the Claude API Cost Calculator. For the Slack equivalent of this pattern, see the Slack bot guide.
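
A back-of-the-envelope cost estimate is plain arithmetic on token counts. The sketch below takes per-million-token prices as parameters instead of hard-coding them. The traffic figures and the prices in the usage lines are placeholders for illustration, not Anthropic's actual rates.

```python
def estimate_cost(
    requests_per_day: int,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    days: int = 30,
) -> float:
    """Return the estimated spend in dollars for `days` days of traffic."""
    total_input = requests_per_day * input_tokens_per_request * days
    total_output = requests_per_day * output_tokens_per_request * days
    return (
        total_input / 1_000_000 * input_price_per_mtok
        + total_output / 1_000_000 * output_price_per_mtok
    )


# Placeholder traffic and prices; substitute real values from the pricing page
monthly = estimate_cost(
    requests_per_day=500,
    input_tokens_per_request=800,
    output_tokens_per_request=300,
    input_price_per_mtok=2.0,
    output_price_per_mtok=10.0,
)
print(f"${monthly:.2f}")  # prints $69.00
```

With per-channel history enabled, the input token count per request grows with the conversation, because the full history is resent on every call; MAX_TURNS effectively caps that growth.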
