Side-by-side comparison of Anthropic Claude API vs Google Gemini API: pricing, context window, Python SDK, tool use, and migration code. With working examples.
Evaluating Claude vs Gemini for your next project? Here's a practical comparison based on what actually matters when building — not benchmark scores.
Quick comparison
| Feature | Claude API (Anthropic) | Gemini API (Google) |
| --- | --- | --- |
| Python SDK | `pip install anthropic` | `pip install google-genai` |
| Best general model | claude-sonnet-4-6 ($3/$15) | gemini-1.5-pro ($1.25–$2.50/$5–$10) |
| Fast/cheap model | claude-haiku-4-5 ($0.80/$4) | gemini-2.0-flash (~$0.10/$0.40) |
| Max context window | 200K tokens | 1M tokens (Gemini 1.5) |
| Prompt caching | Yes (native, up to 90% savings) | Context caching (Vertex AI only) |
| Tool / function calling | Yes (tools param) | Yes (tools param) |
| Vision / image input | Yes (URL or base64) | Yes (URL or base64) |
| Audio input | No (text/image/video frames only) | Yes (native audio) |
| Batch API | Yes (50% discount) | Yes (Vertex AI batch) |
| Cloud integration | AWS Bedrock / standalone | Deep Google Cloud / Vertex AI |

Prices are USD per million input/output tokens.
Gemini → Claude migration (Python)
# BEFORE (Gemini)
import os

from google import genai

# Read the key from the environment instead of passing a literal string
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Summarize this text in 3 bullet points.",
    config={"system_instruction": "You are a helpful assistant."},
)
print(response.text)

# AFTER (Claude)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,  # required on every Claude request
    system="You are a helpful assistant.",  # top-level param, not a message
    messages=[
        {"role": "user", "content": "Summarize this text in 3 bullet points."}
    ],
)
print(response.content[0].text)  # content is a list of blocks
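One detail in the Claude version deserves care: `response.content` is a list of content blocks, so `response.content[0].text` only works when the first block happens to be a text block. A minimal sketch of a more defensive reader follows. `extract_text` is a hypothetical helper (not part of the `anthropic` SDK), and it assumes only that each block exposes a `type` attribute and, for text blocks, a `text` attribute. The stand-in objects let the sketch run without an API key.

```python
from types import SimpleNamespace


def extract_text(blocks):
    """Join the text of every block whose type is "text"; skip all others."""
    return "".join(
        block.text for block in blocks if getattr(block, "type", None) == "text"
    )


# Stand-in objects shaped like SDK content blocks (hypothetical sample data).
fake_content = [
    SimpleNamespace(type="text", text="- point one\n"),
    SimpleNamespace(type="tool_use", name="get_weather", input={"city": "Paris"}),
    SimpleNamespace(type="text", text="- point two\n"),
]

print(extract_text(fake_content))
```

With a real response, the call would be `extract_text(response.content)`.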
Tool use migration
# BEFORE (Gemini tool use)
# `client` here is the genai.Client from the previous Gemini example
from google.genai import types

weather_tool = types.Tool(function_declarations=[
    types.FunctionDeclaration(
        name="get_weather",
        description="Get current weather for a city",
        parameters=types.Schema(
            type="OBJECT",
            properties={"city": types.Schema(type="STRING")},
            required=["city"],
        ),
    )
])
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="What is the weather in Paris?",
    config={"tools": [weather_tool]},
)
# The function call arrives as a part of the first candidate
fn_call = response.candidates[0].content.parts[0].function_call
print(fn_call.name, fn_call.args)

# AFTER (Claude tool use)
# `client` here is the anthropic.Anthropic() client from the previous Claude example
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[{
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {  # plain JSON Schema, lowercase type names
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
)
# Tool calls arrive as "tool_use" blocks inside response.content
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)  # e.g. get_weather {'city': 'Paris'}
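When there are many tools to migrate, the schema rewrite shown above (uppercase Gemini type names to lowercase JSON Schema, `parameters` to `input_schema`) can be automated. The following is a sketch under one assumption: that your Gemini function declarations are available as plain dicts with `name`, `description`, and `parameters` keys. Both function names are hypothetical and belong to neither SDK.

```python
def gemini_schema_to_json_schema(schema):
    """Recursively lowercase Gemini-style type names ("OBJECT" -> "object")."""
    converted = {}
    for key, value in schema.items():
        if key == "type":
            converted["type"] = value.lower()
        elif key == "properties":
            converted["properties"] = {
                name: gemini_schema_to_json_schema(sub) for name, sub in value.items()
            }
        elif key == "items":
            converted["items"] = gemini_schema_to_json_schema(value)
        else:
            converted[key] = value  # e.g. "required", "description", "enum"
    return converted


def gemini_declaration_to_claude_tool(declaration):
    """Map a dict-form Gemini function declaration to a Claude tool dict."""
    parameters = declaration.get("parameters", {"type": "OBJECT", "properties": {}})
    return {
        "name": declaration["name"],
        "description": declaration.get("description", ""),
        "input_schema": gemini_schema_to_json_schema(parameters),
    }


# Dict form of the get_weather declaration used above.
weather_declaration = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
        "type": "OBJECT",
        "properties": {"city": {"type": "STRING"}},
        "required": ["city"],
    },
}

claude_tool = gemini_declaration_to_claude_tool(weather_declaration)
print(claude_tool)
```

The resulting dict can be passed in the `tools` list of `client.messages.create`. Schema features beyond `type`, `properties`, and `items` are copied through unchanged, so check any unusual declarations by hand.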
When to choose Claude over Gemini
Prompt caching: Claude's native caching (available in the standard API, not just Vertex) saves up to 90% on repeated long context — critical for production chatbots and RAG. Gemini's context caching requires Vertex AI.
Cost predictability: Claude has straightforward per-token pricing with no tier multipliers. Gemini 1.5 Pro has a 2× price jump above 128K tokens per request.
Document processing: Claude's PDF API and 200K context handle most real-world documents natively. Claude's safety training tends to be more permissive on complex technical document analysis.
Independence from Google Cloud: Claude runs on Anthropic's API directly, plus AWS Bedrock. No lock-in to Google infrastructure.
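To make the prompt caching point concrete, here is a minimal sketch of a Claude request that marks a long, repeated context as cacheable. It passes `system` as a list of text blocks and attaches `cache_control` to the block that should be cached. `build_cached_request` is a hypothetical helper that only assembles the keyword arguments, so the sketch runs without an API key; the sample context string is a placeholder.

```python
def build_cached_request(model, long_context, question, max_tokens=1024):
    """Assemble kwargs for client.messages.create with a cacheable system block."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": [
            {
                "type": "text",
                "text": long_context,
                # Marks everything up to and including this block as cacheable.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": question}],
    }


request = build_cached_request(
    model="claude-sonnet-4-6",
    long_context="<long product manual or retrieved documents go here>",
    question="What is the warranty period?",
)
print(request["system"][0]["cache_control"])
```

A real call would be `client.messages.create(**request)`. Very short prompts fall below the minimum cacheable length and are not cached, and the usage fields on the response (`cache_creation_input_tokens`, `cache_read_input_tokens`) are the place to confirm whether a cache hit actually occurred.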
When to choose Gemini over Claude
Audio analysis: Gemini accepts audio input natively, which Claude does not. Essential for transcription + analysis pipelines.
1M token context: If you genuinely need to fit an entire large codebase or multi-hour transcript in one call, Gemini 1.5 Pro is the only option of the two at this scale.
Google Cloud ecosystem: Deep Vertex AI integration, BigQuery ML, Cloud Run — if your infra is already Google, Gemini is simpler to operate.
The Claude and Gemini APIs are not drop-in compatible: they use different SDKs, endpoints, and message formats. Gemini uses `google-generativeai` (the older SDK) or `google-genai`; Claude uses `anthropic`. Migration is still straightforward: replace the client, swap the model string, and adjust how you pass the system prompt (Claude takes it as a top-level `system` param; Gemini uses a `system_instruction` field in the request config).
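The three migration steps can be sketched as a small mapping function. This is an illustration, not a complete adapter: `gemini_to_claude_kwargs` is a hypothetical helper, and it assumes `contents` is a plain string and `config` is a dict, as in the Gemini example earlier in this article.

```python
def gemini_to_claude_kwargs(contents, config=None,
                            model="claude-sonnet-4-6", max_tokens=1024):
    """Translate simple generate_content arguments into messages.create kwargs."""
    config = config or {}
    kwargs = {
        "model": model,            # step 2: swap the model string
        "max_tokens": max_tokens,  # Claude requires an explicit output cap
        "messages": [{"role": "user", "content": contents}],
    }
    system = config.get("system_instruction")
    if system:
        kwargs["system"] = system  # step 3: system prompt becomes top-level
    return kwargs


kwargs = gemini_to_claude_kwargs(
    "Summarize this text in 3 bullet points.",
    config={"system_instruction": "You are a helpful assistant."},
)
print(kwargs)
```

Step 1, replacing the client, then amounts to calling `anthropic.Anthropic().messages.create(**kwargs)`.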
How does Claude's pricing compare to Gemini 1.5 Pro / Gemini 2.0 Flash?
As of 2026: claude-sonnet-4-6 costs $3/$15 per million input/output tokens; claude-haiku-4-5 is $0.80/$4. Gemini 2.0 Flash is ~$0.10/$0.40 (very cheap but smaller context). Gemini 1.5 Pro is $1.25/$5 per million tokens (128K) or $2.50/$10 (>128K). Claude's 200K context window often costs less per task than Gemini 1.5 Pro at the 128K+ tier. Always verify current rates on the official pricing pages.
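The per-request arithmetic behind these rates can be sketched as follows. The prices are the ones quoted in this article and may be out of date. Treating the Gemini 1.5 Pro tier boundary as exactly 128,000 input tokens, and applying the higher rate to the whole request once it is crossed, are assumptions made for illustration.

```python
# USD per 1M tokens as (input, output), copied from the figures quoted above.
FLAT_PRICES = {
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-haiku-4-5": (0.80, 4.00),
    "gemini-2.0-flash": (0.10, 0.40),
}
# Gemini 1.5 Pro: (rate at or below the threshold, rate above it).
GEMINI_15_PRO_TIERS = ((1.25, 5.00), (2.50, 10.00))
TIER_THRESHOLD = 128_000  # assumed boundary, in input tokens


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request, ignoring caching and batch discounts."""
    if model == "gemini-1.5-pro":
        above = input_tokens > TIER_THRESHOLD
        input_price, output_price = GEMINI_15_PRO_TIERS[1 if above else 0]
    else:
        input_price, output_price = FLAT_PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for name in ("claude-sonnet-4-6", "claude-haiku-4-5",
             "gemini-2.0-flash", "gemini-1.5-pro"):
    print(name, round(request_cost(name, 150_000, 2_000), 4))
```

These figures are list prices only. Caching and batch discounts change the per-task cost, so substitute your own token counts and discount assumptions before drawing conclusions.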
Which has a longer context window — Claude or Gemini?
Gemini 1.5 Pro and 1.5 Flash offer 1M token context (experimental). Claude Sonnet 4.6 and Opus 4.7 offer 200K tokens. For most developer tasks, 200K is more than sufficient; very few real workloads exceed it. Gemini's 1M window is useful for processing entire large codebases or hour-long audio transcripts in one call.
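A quick way to see which side of the 200K line a workload falls on is to compare its token budget against each window. The sketch below uses the window sizes quoted in this article and makes the conservative assumption that prompt tokens plus the reserved output tokens must fit together; `fits_context` is a hypothetical helper.

```python
# Context window sizes in tokens, as quoted in this article.
CONTEXT_WINDOWS = {
    "claude-sonnet-4-6": 200_000,
    "gemini-1.5-pro": 1_000_000,
}


def fits_context(model, prompt_tokens, max_output_tokens):
    """True if the prompt plus the reserved output budget fits the window."""
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOWS[model]


def models_that_fit(prompt_tokens, max_output_tokens):
    """List the models whose window can hold this request."""
    return [
        model for model in CONTEXT_WINDOWS
        if fits_context(model, prompt_tokens, max_output_tokens)
    ]


print(models_that_fit(150_000, 4_000))
print(models_that_fit(600_000, 4_000))
```

The prompt token count has to come from a real tokenizer. For Claude, `client.messages.count_tokens` in the `anthropic` SDK can supply it before you send the request.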
Does Claude support multimodal inputs like Gemini?
Yes. Both support text, images, and PDFs. Claude also supports video frames (via base64 image encoding) and has a dedicated Files API for large documents. Gemini natively supports audio and video. If your app needs direct audio/video streaming, Gemini has an edge; for document processing and long-text RAG, Claude's context window and caching provide better economics.
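For the image case, a Claude message carries the image as a base64 content block next to the text question. A minimal sketch follows; `image_block` and `image_question` are hypothetical helpers that only build the message structure, and the bytes used here are placeholder data rather than a real image.

```python
import base64


def image_block(image_bytes, media_type="image/png"):
    """Build a base64 image content block for a Claude message."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.standard_b64encode(image_bytes).decode("ascii"),
        },
    }


def image_question(image_bytes, question, media_type="image/png"):
    """Build a messages list pairing one image with one text question."""
    return [{
        "role": "user",
        "content": [
            image_block(image_bytes, media_type),
            {"type": "text", "text": question},
        ],
    }]


placeholder_bytes = b"\x89PNG placeholder, not a real image"
messages = image_question(placeholder_bytes, "What does this chart show?")
print(messages[0]["content"][0]["source"]["media_type"])
```

In practice the bytes would come from something like `open("chart.png", "rb").read()`, and the list would be passed as `messages=` to `client.messages.create`.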
Which API is better for production applications?
Both are production-ready. Claude has native prompt caching (up to 90% cost savings on repeated context) which makes it significantly cheaper for chatbots and RAG with long system prompts. Gemini integrates tightly with Google Cloud (Vertex AI, Cloud Run, BigQuery ML). Choose Claude for cost-sensitive production workloads with long context; choose Gemini if you're already deep in the Google Cloud ecosystem.