OpenAI API

Pay-as-you-go · 1M context

GPT-5.5 / 5.4 flagship API, billed per token—separate from ChatGPT subscription quotas; requires a platform.openai.com account

Token API

Official

Core models

GPT-5.5, GPT-5.4, GPT-5.4 mini, GPT-Image-2, GPT-Realtime-2, GPT-Realtime-Translate, GPT-Realtime-Whisper

GPT-5.5

API flagship (gpt-5.5): Standard $5 input / $30 output per million tokens, 1M context, xhigh reasoning and full agent tools—the default for complex coding and professional tasks.

GPT-5.4

More affordable professional model (gpt-5.4): $2.50 input / $15 output per million, 1M context—a balanced production workhorse.

GPT-5.4 mini

Strongest mini (gpt-5.4-mini): $0.75 input / $4.50 output per million, 400K context—low latency and cost for sub-agents and light volume.

GPT-Image-2

Latest image generation/editing API billed by image and text modalities (e.g. $30/M image output)—used with images/generations and images/edits endpoints.

GPT-Realtime-2

Current top realtime voice model on openai.com/api/pricing—separate text/audio/image pricing, low-latency voice via v1/realtime.

See the official site for more models

The core models are listed above; the official pricing page has full, current details for these and other models.

Plan details

GPT-5.5

Flagship reasoning · Recommended

Input

$5

Output

$30

Official

Usage

Model id gpt-5.5—the current API flagship at $5/M input and $30/M output (Standard), 1M context, up to 128K output—suited to complex reasoning, professional coding, and multi-step agents.

Models

Supports reasoning.effort (none/low/medium/high/xhigh), Functions, Web search, File search, Computer use, Code interpreter, and more via Chat Completions or the Responses API.
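
As a minimal sketch of how the reasoning.effort setting might be selected per request, the helper below validates the effort level against the values listed above and builds the keyword arguments for a Responses API call. The function name `build_response_request` is hypothetical, and the exact parameter shape should be checked against the current API reference.

```python
# Effort levels as listed in the text above.
VALID_EFFORTS = ("none", "low", "medium", "high", "xhigh")


def build_response_request(prompt: str, effort: str = "medium",
                           model: str = "gpt-5.5") -> dict:
    """Return keyword arguments for a Responses API call (hypothetical helper)."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "model": model,
        "input": prompt,
        # Assumed shape: a nested object carrying the effort level.
        "reasoning": {"effort": effort},
    }


if __name__ == "__main__":
    print(build_response_request("Summarize this changelog.", effort="high"))
```

With the official `openai` Python package, a dictionary like this would typically be passed as `client.responses.create(**kwargs)`; no network call is made in the sketch itself.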

Highlights

Cache-hit input is only $0.50/M; sessions with >272K input tokens bill at higher multipliers—enable prompt caching for repeated prefixes.
Best for production apps treating OpenAI as a core base—Codex-style coding agents, long-document analysis, and high-stakes knowledge workflows.
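
To make the caching and long-context effects concrete, here is a rough cost estimator for a single GPT-5.5 request. It uses the rates quoted on this page and the 2x input / 1.5x output multipliers from the Notes section. Applying the input multiplier to cache-hit tokens as well is an assumption, and the function name is hypothetical.

```python
PRICE_INPUT = 5.00          # USD per 1M uncached input tokens (from this page)
PRICE_CACHED_INPUT = 0.50   # USD per 1M cache-hit input tokens (from this page)
PRICE_OUTPUT = 30.00        # USD per 1M output tokens (from this page)
LONG_CONTEXT_THRESHOLD = 272_000
LONG_INPUT_MULT = 2.0       # per the Notes section
LONG_OUTPUT_MULT = 1.5      # per the Notes section


def gpt55_request_cost(input_tokens: int, cached_tokens: int,
                       output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_tokens
    input_cost = (uncached * PRICE_INPUT
                  + cached_tokens * PRICE_CACHED_INPUT) / 1_000_000
    output_cost = output_tokens * PRICE_OUTPUT / 1_000_000
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        # Assumption: the multiplier covers cached and uncached input alike.
        input_cost *= LONG_INPUT_MULT
        output_cost *= LONG_OUTPUT_MULT
    return input_cost + output_cost


if __name__ == "__main__":
    print(gpt55_request_cost(100_000, 0, 10_000))       # no cache hits
    print(gpt55_request_cost(100_000, 80_000, 10_000))  # 80% of prefix cached
    print(gpt55_request_cost(300_000, 0, 10_000))       # over the 272K threshold
```

Comparing the first two calls shows why a stable, cacheable prefix matters for repeated system prompts.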

Best for

Complex reasoning and coding, production agent systems, and developers needing the strongest API tier

GPT-5.4

Balanced flagship

Input

$2.50

Output

$15

Official

Usage

Model id gpt-5.4—a more affordable professional workhorse at $2.50/M input and $15/M output with 1M context, sitting between 5.5 and mini on capability and cost.

Models

Also supports Functions, Web search, File search, Computer use, and similar tools—good for high-volume production where you need strong capability at lower unit cost.

Highlights

Cache-hit input is $0.25/M; Batch API adds ~50% savings—suited to offline batch and delay-tolerant jobs.
When GPT-5.5 unit cost is too high but mini is not enough, 5.4 is usually the API sweet spot.
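
For delay-tolerant jobs, the Batch saving can be projected before committing to a run. The sketch below assumes a flat 50% discount on both input and output (the page says "~50%", so treat the result as approximate); the function name and the sample workload are hypothetical.

```python
GPT54_INPUT = 2.50    # USD per 1M input tokens (from this page)
GPT54_OUTPUT = 15.00  # USD per 1M output tokens (from this page)
BATCH_DISCOUNT = 0.5  # assumption: exactly 50% off input and output


def gpt54_job_cost(num_requests: int, avg_input_tokens: int,
                   avg_output_tokens: int) -> dict:
    """Compare Standard and Batch cost for a whole job (hypothetical helper)."""
    total_in = num_requests * avg_input_tokens
    total_out = num_requests * avg_output_tokens
    standard = (total_in * GPT54_INPUT + total_out * GPT54_OUTPUT) / 1_000_000
    batch = standard * (1 - BATCH_DISCOUNT)
    return {"standard": standard, "batch": batch, "savings": standard - batch}


if __name__ == "__main__":
    # Hypothetical workload: 10,000 documents, 2,000 tokens in, 500 tokens out.
    print(gpt54_job_cost(10_000, 2_000, 500))
```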

Best for

Mid-complexity production calls needing 1M context at lower cost than 5.5

GPT-5.4 mini

Fast & low-cost

Input

$0.75

Output

$4.50

Official

Usage

Model id gpt-5.4-mini—the strongest mini tier at $0.75/M input and $4.50/M output, 400K context—for coding, computer use, and sub-agents.

Models

Lower latency and unit cost—suited to high-frequency light completion, routing/classification, batch formatting, and cost-sensitive volume.

Highlights

Cache-hit input as low as $0.075/M; docs also point to mini or nano variants when optimizing latency and cost.
Good for sub-agents, intermediate pipeline steps, and tiered architectures that start on mini and escalate hard tasks to 5.5.
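
One possible shape for the tiered architecture described above: try the mini model first and escalate only when a caller-supplied check rejects the answer. Everything here is illustrative. `run_tiered`, `call_model`, and `accept` are hypothetical names, and the model call is injected so the sketch runs without network access.

```python
from typing import Callable, Tuple

# Cheapest tier first, strongest tier last (model ids from this page).
TIERS = ("gpt-5.4-mini", "gpt-5.5")


def run_tiered(prompt: str,
               call_model: Callable[[str, str], str],
               accept: Callable[[str], bool]) -> Tuple[str, str]:
    """Return (model_used, answer), escalating until `accept` passes.

    If no tier produces an accepted answer, the last tier's answer is returned.
    """
    model, answer = TIERS[0], ""
    for model in TIERS:
        answer = call_model(model, prompt)
        if accept(answer):
            break
    return model, answer


if __name__ == "__main__":
    # Stand-in for a real API call: the mini tier "fails" on this task.
    def fake_call(model: str, prompt: str) -> str:
        return "" if model == "gpt-5.4-mini" else f"answer to: {prompt}"

    print(run_tiered("refactor this module", fake_call, accept=bool))
```

In practice the `accept` check might be a schema validation, a unit test run, or a confidence score; the right choice depends on the task.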

Best for

High-volume batch calls, sub-agents, cost-sensitive production completion, and light reasoning

GPT-Image-2

Image generation

Input

$8 (image)

Output

$30 (image)

Official

Usage

GPT-Image-2 is the latest image generation and editing model—image modality at $8/M input and $30/M output, text input at $5/M (each with cache-hit rates).

Models

Called via v1/images/generations and v1/images/edits—suited to in-app image generation and multimodal workflows; do not assume pure chat input/output rates.

Highlights

Images are tokenized for billing—per-image cost depends on resolution and prompt complexity; estimate with Playground or small tests before launch.
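
A small-test estimate could look like the sketch below: take the image output token counts reported for a few trial generations and project them onto the planned volume. The rates come from this page; the sample token counts, the function name, and the choice to ignore image input tokens are all assumptions for illustration.

```python
IMAGE_OUTPUT_RATE = 30.00  # USD per 1M image output tokens (from this page)
TEXT_INPUT_RATE = 5.00     # USD per 1M text input tokens (from this page)


def project_image_cost(sample_output_tokens: list,
                       planned_images: int,
                       avg_prompt_text_tokens: int = 0) -> float:
    """Project total USD cost from a few measured generations (hypothetical)."""
    if not sample_output_tokens:
        raise ValueError("need at least one measured sample")
    avg_out = sum(sample_output_tokens) / len(sample_output_tokens)
    image_cost = avg_out * planned_images * IMAGE_OUTPUT_RATE / 1_000_000
    text_cost = (avg_prompt_text_tokens * planned_images
                 * TEXT_INPUT_RATE / 1_000_000)
    return image_cost + text_cost


if __name__ == "__main__":
    # Hypothetical token counts from three trial images at the target resolution.
    samples = [4000, 4200, 3800]
    print(project_image_cost(samples, planned_images=10_000,
                             avg_prompt_text_tokens=100))
```

Because per-image tokens vary with resolution, the samples should be generated at the same settings planned for production.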

Best for

Products and creative workflows needing official image generation/editing APIs

GPT-Realtime-2

Realtime voice

Input

$4 (text)

Output

$24 (text)

Official

Usage

GPT-Realtime-2 targets realtime voice—text at $4/M input and $24/M output; audio at $32/M input and $64/M output; image input at $5/M (each with cache rates).

Models

Integrates via v1/realtime sessions—for voice assistants, support bots, and low-latency conversational products; cost depends on audio duration and text mix.

Highlights

Billing differs from pure text chat—compare total cost against a Transcribe + chat + TTS pipeline when choosing architecture.
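
Because one realtime session mixes modalities, a per-modality rate table is a convenient way to total it. The sketch below uses the GPT-Realtime-2 rates quoted above; cache-hit rates are left out, and the token counts in the example are hypothetical placeholders for what a usage report might show.

```python
# USD per 1M tokens, keyed by (modality, direction); values from this page.
REALTIME2_RATES = {
    ("text", "input"): 4.00,
    ("text", "output"): 24.00,
    ("audio", "input"): 32.00,
    ("audio", "output"): 64.00,
    ("image", "input"): 5.00,
}


def realtime_session_cost(usage: dict) -> float:
    """Sum USD cost over a {(modality, direction): tokens} mapping (hypothetical)."""
    total = 0.0
    for key, tokens in usage.items():
        if key not in REALTIME2_RATES:
            raise ValueError(f"no rate listed for {key}")
        total += tokens * REALTIME2_RATES[key] / 1_000_000
    return total


if __name__ == "__main__":
    # Hypothetical session: mostly audio with a little text instruction.
    session = {
        ("text", "input"): 10_000,
        ("audio", "input"): 50_000,
        ("audio", "output"): 30_000,
    }
    print(realtime_session_cost(session))
```

The same total can then be set against the summed cost of a transcription, chat, and TTS pipeline for the same conversation.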

Best for

Realtime voice products, low-latency dialogue, and multimodal voice assistant developers

GPT-Realtime-Translate

Live translation

Price

$0.034 / min

Official

Usage

GPT-Realtime-Translate offers live speech translation at $0.034/min ($0.00057/sec)—suited to meetings, streaming, and multilingual support.

Models

Billed by audio duration rather than input/output token rates—estimate minute-level cost for long-running sessions.
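
Duration-based billing is simple to project. This sketch assumes usage is prorated by the second at the quoted $0.034/min rate; whether billing actually rounds to seconds or minutes is not stated here, so treat the result as an estimate. The function name is hypothetical.

```python
TRANSLATE_RATE_PER_MIN = 0.034  # USD per minute of audio (from this page)


def translate_cost(audio_seconds: float,
                   rate_per_min: float = TRANSLATE_RATE_PER_MIN) -> float:
    """Estimate USD cost for a stretch of audio (hypothetical helper).

    Assumption: usage is prorated per second rather than rounded up.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds / 60 * rate_per_min


if __name__ == "__main__":
    print(translate_cost(90 * 60))            # one 90-minute meeting
    print(translate_cost(8 * 3600) * 30)      # 8 hours a day for 30 days
```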

Best for

Live interpretation, cross-language meetings, and streaming translation products

GPT-Realtime-Whisper

Streaming STT

Price

$0.017 / min

Official

Usage

GPT-Realtime-Whisper streams speech to text at $0.017/min ($0.00028/sec) as the speaker talks.

Models

Suited to live captions, meeting notes, and voice input pipelines—different from batch models like GPT-4o Transcribe; pick by latency needs.

Best for

Live captions, meeting dictation, and low-latency speech input apps

Notes

Prices listed here are Standard processing rates for context under 270K; the Batch API is about 50% off input and output; data residency adds 10% for GPT-5.5. GPT-5.5 prompts over 272K input tokens bill at 2x input and 1.5x output for the full session.
Prompt cache-hit input: GPT-5.5 $0.50/M, GPT-5.4 $0.25/M, GPT-5.4 mini $0.075/M—design caching for repeated system prompts and long prefixes.
Web Search tool is $10 per 1k calls (search content tokens free); Containers bill by container size (switching to 20-minute sessions from 2026-03-31).
ChatGPT Plus/Business/Enterprise subscriptions do not include standard API usage; Playground and production API share balance and bill per dashboard usage reports.
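
The add-ons in these notes can be folded into a monthly estimate roughly as follows. The sketch assumes the 10% data residency uplift applies only to token charges and not to Web Search calls, which the notes do not spell out; container charges are omitted, and the function name is hypothetical.

```python
WEB_SEARCH_PER_1K_CALLS = 10.00  # USD per 1,000 Web Search calls (from the notes)
DATA_RESIDENCY_UPLIFT = 0.10     # +10% for GPT-5.5 (from the notes)


def monthly_total(token_cost: float, web_search_calls: int,
                  data_residency: bool = False) -> float:
    """Combine token charges with add-ons into one USD figure (hypothetical)."""
    if data_residency:
        # Assumption: the uplift applies to token charges only.
        token_cost *= 1 + DATA_RESIDENCY_UPLIFT
    search_cost = web_search_calls / 1000 * WEB_SEARCH_PER_1K_CALLS
    return token_cost + search_cost


if __name__ == "__main__":
    # Hypothetical month: $100 of GPT-5.5 tokens and 2,500 searches.
    print(monthly_total(100.0, 2_500, data_residency=True))
```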

Supported coding tools

OpenAI API, Responses API, Chat Completions, Codex CLI, Cursor, Cline

Pricing and model data sourced from official vendor websites
