Gemini API

Pay-as-you-go · 1M context

Official Gemini 3.5 Flash and 3.1 Pro API billed per token—separate from gemini.google consumer subscription quotas

Token API

Core models

Gemini 3.5 Flash · Gemini 3.1 Pro · Gemini 3.1 Flash-Lite · Gemini 2.5 Pro · Gemini 2.5 Flash · Gemini 3.1 Flash Image · Gemini 3.1 Flash Live

Gemini 3.5 Flash

Fast API flagship (gemini-3.5-flash): $1.50 input / $9 output per MTok, frontier intelligence + search grounding—Free tier for trials.

Gemini 3.1 Pro

Strongest Pro preview (gemini-3.1-pro-preview): $2/$12 per MTok (≤200K), multimodal agents and vibe-coding—Paid only.

Gemini 3.1 Flash-Lite

Best value (gemini-3.1-flash-lite): $0.25/$1.50 per MTok—first choice for high-volume agents and translation.

Gemini 2.5 Pro

Prior Pro (gemini-2.5-pro): $1.25/$10 per MTok (≤200K), coding and complex reasoning—prefer 3.x for new work.

Gemini 3.1 Flash Image

Image generation API (gemini-3.1-flash-image): $0.50/M text input, image output priced by resolution (~$0.045–$0.151/image)—separate from chat billing.

See the official site for more models

Core models not summarized here are still listed by name above; the latest official page has full details.

Plan details

Gemini 3.5 Flash

Flagship speed · Recommended

Input: $1.50 per MTok
Output: $9 per MTok
Source: Official

Usage

Model id gemini-3.5-flash—the current fast API flagship at $1.50/M input and $9/M output (Standard), combining frontier intelligence with superior search and grounding.

Models

Free tier callable at no token charge in AI Studio (rate-limited); Paid unlocks higher RPS, context caching, and Grounding with Google Search/Maps.

Highlights

Batch API at $0.75/M input and $4.50/M output; Flex at $0.75/$4.50 per MTok for latency-tolerant workloads and Priority at $2.70/$16.20 per MTok for peak-throughput workloads.
Default for search-grounded fast agent loops and daily production API; escalate to 3.1 Pro for complex multimodal agents.
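To make the processing-mode prices above concrete, here is a minimal sketch that compares what one workload would cost under each mode. The rates are the ones quoted on this page; the function name and the monthly token volumes are hypothetical and only illustrate the arithmetic.

```python
# Per-MTok rates in USD (input, output) for gemini-3.5-flash, as quoted on this page.
RATES = {
    "standard": (1.50, 9.00),
    "batch": (0.75, 4.50),
    "flex": (0.75, 4.50),
    "priority": (2.70, 16.20),
}


def cost_usd(mode: str, input_tokens: int, output_tokens: int) -> float:
    """Return the token cost in USD for one processing mode."""
    in_rate, out_rate = RATES[mode]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical workload: 40M input tokens and 5M output tokens per month.
for mode in RATES:
    print(f"{mode:>8}: ${cost_usd(mode, 40_000_000, 5_000_000):,.2f}")
```

With these illustrative volumes, Batch and Flex come out at half the Standard bill, while Priority costs 1.8 times Standard.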

Best for

Search-grounded agents, fast production API, everyday intelligent apps

Gemini 3.1 Pro

Flagship Pro

Input: $2 per MTok (≤200K)
Output: $12 per MTok (≤200K)
Source: Official

Usage

Model id gemini-3.1-pro-preview (and customtools variant)—among the strongest for multimodal understanding plus agentic and vibe-coding workloads.

Models

Standard pricing: $2 input / $12 output per MTok for prompts ≤200K tokens; $4 / $18 per MTok above 200K.

Highlights

Paid tier only (not on Free); supports context caching, 50% Batch savings, and grounding—for complex repo analysis, long-document agents, and high-autonomy workflows.
Same model family as gemini.google subscription 3.1 Pro, but API bills per token independently of Plus/Pro/Ultra usage multipliers.
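The tiered pricing above can be sketched as a small function. This is an illustration under one stated assumption: the tier is chosen by prompt size and the selected rates apply to every token in the request, not only to the tokens beyond 200K. Check the official billing rules before relying on that. The function name and example sizes are hypothetical.

```python
# Tier boundary and per-MTok rates (USD) for gemini-3.1-pro-preview, as quoted on this page.
TIER_THRESHOLD = 200_000
RATES_UP_TO_200K = (2.0, 12.0)   # (input, output)
RATES_ABOVE_200K = (4.0, 18.0)


def pro_request_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: the whole request is billed at the tier selected by prompt size.
    """
    if prompt_tokens <= TIER_THRESHOLD:
        in_rate, out_rate = RATES_UP_TO_200K
    else:
        in_rate, out_rate = RATES_ABOVE_200K
    return (prompt_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical requests on either side of the boundary, each producing 4K output tokens.
print(f"150K prompt: ${pro_request_cost(150_000, 4_000):.3f}")
print(f"250K prompt: ${pro_request_cost(250_000, 4_000):.3f}")
```

Under this assumption, crossing the 200K boundary doubles the input rate for the entire prompt, so trimming a prompt to just under the threshold can matter more than the token count alone suggests.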

Best for

Complex agents, long-context multimodal, vibe-coding, and repo-scale analysis

Gemini 3.1 Flash-Lite

Best value

Input: $0.25 per MTok
Output: $1.50 per MTok
Source: Official

Usage

Model id gemini-3.1-flash-lite—Google's most cost-efficient tier at $0.25/M text/image/video input and $1.50/M output (Standard).

Models

Optimized for high-volume agent tasks, translation, and simple data processing—Free tier at no charge; Paid adds caching and higher RPS.

Highlights

Batch API at $0.125/M input and $0.75/M output—suited to routing layers, batch classification, and cost-sensitive scale.
Audio input Standard $0.50/M, Batch $0.25/M—estimate separately for speech pipelines.

Best for

High-concurrency light tasks, translation, routing, and sub-agents

Gemini 2.5 Pro

Prior-gen Pro

Input: $1.25 per MTok (≤200K)
Output: $10 per MTok (≤200K)
Source: Official

Usage

Model id gemini-2.5-pro—the prior multipurpose flagship strong at coding and complex reasoning—prefer 3.5 Flash / 3.1 Pro for new integrations.

Models

Standard: $1.25 input / $10 output per MTok for prompts ≤200K; $2.50 / $15 per MTok above 200K.

Highlights

Free tier available with limits; Grounding with Google Search is free for the first 1,500 requests per day (RPD), then $35 per 1,000 grounded prompts.
Existing 2.5 Pro integrations continue to bill at these rates; migration plans should weigh the multimodal agent gains of 3.1 Pro.
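The grounding charge described above is easy to underestimate at volume, so a short sketch may help. It assumes the free allowance resets daily and that prompts beyond it are billed pro rata rather than in blocks of 1,000; both are assumptions, and the function name is hypothetical.

```python
# Grounding with Google Search figures for gemini-2.5-pro, as quoted on this page.
FREE_GROUNDED_PROMPTS_PER_DAY = 1_500
USD_PER_1K_GROUNDED_PROMPTS = 35.0


def grounding_cost_per_day(grounded_prompts: int) -> float:
    """Estimate the daily grounding charge in USD (pro-rata billing assumed)."""
    billable = max(0, grounded_prompts - FREE_GROUNDED_PROMPTS_PER_DAY)
    return billable * USD_PER_1K_GROUNDED_PROMPTS / 1_000


# Hypothetical traffic levels.
for prompts in (1_000, 1_500, 10_000):
    print(f"{prompts:>6} grounded prompts/day: ${grounding_cost_per_day(prompts):.2f}")
```

This charge is separate from token costs, so a search-heavy agent can spend more on grounding than on tokens.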

Best for

Legacy 2.5 Pro integrations, coding and complex reasoning (migrating)

Gemini 2.5 Flash

Hybrid reasoning

Input: $0.30 per MTok
Output: $2.50 per MTok
Source: Official

Usage

Model id gemini-2.5-flash—the first hybrid reasoning Flash with thinking budgets and a 1M-token context window.

Models

Standard text/image/video input $0.30/M tokens and $2.50/M output; Free tier available at no token charge.

Highlights

Grounding with Google Search shares a free allowance of 500 RPD (Free tier) or 1,500 RPD (Paid tier) with Flash-Lite.
Suited to production needing controllable thinking depth and 1M context without 3.5 Flash pricing.

Best for

1M context, controllable thinking, mid-cost production API

Gemini 3.1 Flash Image

Image generation

Input: $0.50 per MTok
Output: ~$0.067 per image (1K)
Source: Official

Usage

Model id gemini-3.1-flash-image—for fast interactive image generation and editing; Standard text/image input $0.50/M tokens.

Models

Image output billed per token ($60/M image tokens): ~$0.045 at 0.5K, $0.067 at 1K, $0.101 at 2K, $0.151 at 4K—different from plain chat API billing.

Highlights

Batch API at $0.25/M input and $30/M image output (~half price); Paid tier only—suited to high-throughput visual generation pipelines.
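Because image output is billed per token but quoted per image, it can help to see how the two relate. The sketch below back-calculates the token count implied by each per-image price at $60/M and estimates the output cost of a job. The per-image prices are the approximate figures from this page, the implied token counts are derived here rather than taken from official documentation, text input cost is left out, and the function names are hypothetical.

```python
# Approximate per-image output prices (USD) by resolution, as quoted on this page.
PRICE_PER_IMAGE = {"0.5K": 0.045, "1K": 0.067, "2K": 0.101, "4K": 0.151}
STANDARD_USD_PER_M_IMAGE_TOKENS = 60.0
BATCH_USD_PER_M_IMAGE_TOKENS = 30.0


def implied_tokens(resolution: str) -> int:
    """Back-calculate image tokens per image from the quoted price (derived, not official)."""
    return round(PRICE_PER_IMAGE[resolution] / STANDARD_USD_PER_M_IMAGE_TOKENS * 1_000_000)


def image_output_cost(resolution: str, n_images: int, batch: bool = False) -> float:
    """Estimate image output cost in USD; text input tokens are not included."""
    scale = BATCH_USD_PER_M_IMAGE_TOKENS / STANDARD_USD_PER_M_IMAGE_TOKENS if batch else 1.0
    return PRICE_PER_IMAGE[resolution] * n_images * scale


for res in PRICE_PER_IMAGE:
    print(f"{res:>4}: ~{implied_tokens(res)} image tokens per image")

# Hypothetical job: 10,000 images at 1K resolution.
print(f"Standard: ${image_output_cost('1K', 10_000):,.2f}")
print(f"Batch:    ${image_output_cost('1K', 10_000, batch=True):,.2f}")
```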

Best for

Image generation, editing, and high-throughput visual API

Gemini 3.1 Flash Live

Real-time dialogue

Input: $0.75 per MTok (text)
Output: $4.50 per MTok (text)
Source: Official

Usage

Model id gemini-3.1-flash-live-preview—low-latency audio-to-audio real-time dialogue with acoustic nuance, numeric precision, and multimodal awareness.

Models

Paid Standard: text $0.75 input / $4.50 output per MTok; audio $3 per MTok (~$0.005/min) input and $12 per MTok (~$0.018/min) output; image/video $1 per MTok (~$0.002/min) input.

Highlights

Free tier available with rate limits—suited to voice assistants, real-time translation, and voice-first agents; estimate audio/video minute costs before integration.
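As a rough aid for the minute-based estimate suggested above, the sketch below prices a single Live session from stream durations. It uses the approximate per-minute figures quoted on this page and assumes each stream is billed by its own duration; the function name and the session lengths are hypothetical.

```python
# Approximate Paid Standard per-minute rates (USD), as quoted on this page.
RATE_PER_MIN = {
    "audio_in": 0.005,
    "audio_out": 0.018,
    "video_in": 0.002,
}


def live_session_cost(audio_in_min: float, audio_out_min: float, video_in_min: float = 0.0) -> float:
    """Estimate one session's cost in USD from the duration of each stream."""
    return (
        audio_in_min * RATE_PER_MIN["audio_in"]
        + audio_out_min * RATE_PER_MIN["audio_out"]
        + video_in_min * RATE_PER_MIN["video_in"]
    )


# Hypothetical session: 10 minutes of user audio, 6 minutes of model speech.
print(f"Voice only: ${live_session_cost(10, 6):.3f}")
print(f"With video: ${live_session_cost(10, 6, video_in_min=10):.3f}")
```

Model speech is the most expensive stream per minute at these rates, so shorter spoken responses reduce cost more than shorter user turns.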

Best for

Real-time voice dialogue, Live API, and voice-first apps

Notes

Prices on this page are Paid tier Standard processing rates in USD per 1M tokens; the Free tier in AI Studio offers free input/output on select models (submitted content may be used to improve products), so upgrade to Paid for production.
Gemini 3.1 Pro and 2.5 Pro use tiered pricing at ≤200K vs >200K prompt tokens (e.g. 3.1 Pro Standard: $2/$12 vs $4/$18 per MTok).
Batch API is ~50% off input/output; context caching adds storage fees (typically $0.50–$4.50 per 1M tokens/hour, model-dependent).
Image generation (3.1 Flash Image, etc.) and Live API audio/video use different billing from plain text chat—estimate per use case before integration.
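The context caching storage fee noted above scales with both cache size and retention time, which a short sketch can illustrate. It uses only the $0.50 to $4.50 per 1M tokens per hour range quoted on this page, since the exact rate depends on the model. The function name and the example cache size are hypothetical, and any separate charge for reading cached tokens is not modeled.

```python
# Storage fee range (USD per 1M cached tokens per hour), as quoted on this page.
LOW_RATE = 0.50
HIGH_RATE = 4.50


def cache_storage_cost(cached_tokens: int, hours: float, rate_per_mtok_hour: float) -> float:
    """Estimate the storage fee in USD for keeping a cache alive."""
    return cached_tokens / 1_000_000 * rate_per_mtok_hour * hours


# Hypothetical cache: 500K tokens kept for 24 hours.
low = cache_storage_cost(500_000, 24, LOW_RATE)
high = cache_storage_cost(500_000, 24, HIGH_RATE)
print(f"Storage for 24h: ${low:.2f} to ${high:.2f}, depending on the model")
```

Because the fee accrues hourly, a short time-to-live on rarely reused caches keeps storage from outweighing the savings on repeated input.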

Supported coding tools

Gemini API · Google AI Studio · Vertex AI · Context Caching · Batch API · Grounding · Live API

Pricing and model data sourced from official vendor websites
