Claude API

Pay-as-you-go · 1M context

Official Opus 4.8, Sonnet 4.6, and Haiku 4.5 API billed per token—separate from Claude Pro/Max subscription quotas

Token API

Core models

Claude Opus 4.8Claude Sonnet 4.6Claude Haiku 4.5

Claude Opus 4.8

API flagship (claude-opus-4-8): $5 input / $25 output per MTok, 1M context, adaptive thinking—for complex agents and long-horizon coding.

Claude Sonnet 4.6

Production workhorse (claude-sonnet-4-6): $3 / $15 per MTok, 1M context—the default API choice for daily production.

Claude Haiku 4.5

Fast low-cost tier (claude-haiku-4-5): $1 / $5 per MTok, 200K context—for light and high-concurrency front-end tasks.

Plan details

Claude Opus 4.8

Flagship agentsRecommended

Input

Output

$25

Official

Claude Opus 4.8

Flagship agentsRecommended

Input

Output

$25

Official

Usage

Model id claude-opus-4-8—the strongest API model at $5/M input and $25/M output, 1M context, up to 128K output, with adaptive thinking and effort control.

Models

Positioned for complex reasoning, long-horizon agentic coding, and high-autonomy work—128K sync output on Messages API, up to 300K on Batch with the output beta header.

Highlights

5-minute cache writes $6.25/M on miss, reads $0.50/M on hit—suited to long system prompts, large repo context, and repeated agent pipelines.
Best for production agents, complex refactors, and high-stakes knowledge workflows treating Claude as a core reasoning base.

Best for

Complex agents, long-horizon coding, and production systems needing top reasoning

Claude Sonnet 4.6

Production workhorse

Input

Output

$15

Official

Claude Sonnet 4.6

Production workhorse

Input

Output

$15

Official

Usage

Model id claude-sonnet-4-6—the balanced daily workhorse at $3/M input and $15/M output, 1M context, up to 64K output.

Models

Supports extended and adaptive thinking at lower cost than Opus—suited to sustained coding, docs, PR review, and medium-complexity multi-turn chats.

Highlights

Cache hits at $0.30/M—most developers should default to Sonnet 4.6 in production and escalate hard tasks to Opus 4.8.
Suited to high-frequency API calls, shared team backends, and 1M-context workloads that do not need Opus throughout.

Best for

Daily production API, team backends balancing cost and capability, 1M-context workflows

Claude Haiku 4.5

Fast & low-cost

Input

Output

Official

Claude Haiku 4.5

Fast & low-cost

Input

Output

Official

Usage

Model id claude-haiku-4-5 (snapshot claude-haiku-4-5-20251001): $1/M input and $5/M output, 200K context, lowest latency.

Models

Near-frontier intelligence at the lowest cost—for quick Q&A, routing/classification, formatting, sub-agents, and lightweight pipeline steps.

Highlights

Supports extended thinking (no adaptive thinking); cache hits $0.10/M, with further savings via Batch API.
Suited to tiered architectures—Haiku first pass, Sonnet/Opus for hard tasks—and cost-sensitive batch workloads.

Best for

High-volume light calls, sub-agents, routing, and cost-sensitive batch tasks

Notes

Standard Claude API input/output list prices (USD per 1M tokens); prompt cache hits read at ~10% of input (e.g. Opus 4.8 hit $0.50/M).
Batch API is ~50% off input/output with async completion within 24 hours; Priority Tier adds higher throughput at official Priority multipliers.
From Opus 4.8, inference_geo "us" adds 1.1x on all token categories; Bedrock/Vertex regional and multi-region endpoints may add 10%.
Legacy Opus 4.1, Sonnet 4, and Opus 4 are deprecated with retirement dates; still-billed 4.5–4.7 pricing is in the docs table—new integrations should prefer Opus 4.8, Sonnet 4.6, and Haiku 4.5.

Supported coding tools

Anthropic APIMessages APIClaude CodeCursorClineAmazon BedrockVertex AI

Pricing and model data sourced from official vendor websites

FAQ

General·7

General

7 条