
Gemini API

Pay-as-you-go · 1M context

Official Gemini 3.5 Flash and 3.1 Pro API, billed per token and separate from gemini.google consumer subscription quotas

Token API · Official

Core models

Gemini 3.5 Flash · Gemini 3.1 Pro · Gemini 3.1 Flash-Lite · Gemini 2.5 Pro · Gemini 2.5 Flash · Gemini 3.1 Flash Image · Gemini 3.1 Flash Live
Gemini 3.5 Flash

Fast API flagship (gemini-3.5-flash): $1.50 input / $9 output per MTok, frontier intelligence + search grounding—Free tier for trials.

Gemini 3.1 Pro

Strongest Pro preview (gemini-3.1-pro-preview): $2/$12 per MTok (≤200K), multimodal agents and vibe-coding—Paid only.

Gemini 3.1 Flash-Lite

Best value (gemini-3.1-flash-lite): $0.25/$1.50 per MTok—first choice for high-volume agents and translation.

Gemini 2.5 Pro

Prior Pro (gemini-2.5-pro): $1.25/$10 per MTok (≤200K), coding and complex reasoning—prefer 3.x for new work.

Gemini 3.1 Flash Image

Image generation API (gemini-3.1-flash-image): $0.50/M text input, image output priced by resolution (~$0.045–$0.151/image)—separate from chat billing.

See the official site for more models

Gemini 2.5 Flash and Gemini 3.1 Flash Live are named above without a summary; full details for every model are on the latest official page.

Plan details

Gemini 3.5 Flash

Flagship speed · Recommended
Input: $1.50 per MTok
Output: $9 per MTok
Official
Usage
Model id gemini-3.5-flash—the current fast API flagship at $1.50/M input and $9/M output (Standard), combining frontier intelligence with superior search and grounding.
Models
Free tier callable at no token charge in AI Studio (rate-limited); Paid unlocks higher RPS, context caching, and Grounding with Google Search/Maps.
Highlights
Batch API: $0.75/M input and $4.50/M output. Flex: $0.75/$4.50 per MTok for latency-tolerant workloads. Priority: $2.70/$16.20 per MTok for peak-throughput workloads.
Default for search-grounded fast agent loops and daily production API; escalate to 3.1 Pro for complex multimodal agents.
Best for
Search-grounded agents, fast production API, everyday intelligent apps
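
To compare the processing modes listed for Gemini 3.5 Flash, the arithmetic can be sketched as below. This is a minimal illustration, not an official calculator: the rates are copied from this card, while the function name `flash_cost` and the example token volumes are assumptions made for the sketch.

```python
# Rates in USD per 1M tokens (input, output), copied from the Gemini 3.5 Flash card.
RATES_PER_MTOK = {
    "standard": (1.50, 9.00),
    "batch": (0.75, 4.50),
    "flex": (0.75, 4.50),
    "priority": (2.70, 16.20),
}


def flash_cost(input_tokens: int, output_tokens: int, mode: str = "standard") -> float:
    """Return the estimated USD cost for a token volume in a given mode.

    Hypothetical helper: it only multiplies token counts by the listed rates
    and ignores caching, grounding, and Free tier usage.
    """
    in_rate, out_rate = RATES_PER_MTOK[mode]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Example workload (an assumption): 2M input tokens and 500K output tokens.
    for mode in RATES_PER_MTOK:
        print(f"{mode:>8}: ${flash_cost(2_000_000, 500_000, mode):.2f}")
```

For that example workload, Standard comes to $7.50, Batch and Flex to $3.75, and Priority to $13.50.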

Gemini 3.1 Pro

Flagship Pro
Input: $2 per MTok (prompts ≤200K)
Output: $12 per MTok (prompts ≤200K)
Official
Usage
Model id gemini-3.1-pro-preview (and customtools variant)—among the strongest for multimodal understanding plus agentic and vibe-coding workloads.
Models
Standard pricing: $2 input / $12 output per MTok for prompts ≤200K tokens; $4 / $18 per MTok above 200K.
Highlights
Paid tier only (not on Free); supports context caching, 50% Batch savings, and grounding—for complex repo analysis, long-document agents, and high-autonomy workflows.
Same model family as gemini.google subscription 3.1 Pro, but API bills per token independently of Plus/Pro/Ultra usage multipliers.
Best for
Complex agents, long-context multimodal, vibe-coding, and repo-scale analysis
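
The ≤200K / >200K split for Gemini 3.1 Pro can be sketched as a small function. It assumes that the tier is chosen by the prompt size and that the whole request (input and output) is then billed at that tier's rates, rather than only the tokens beyond 200K; the card does not spell this out, so check the official page before relying on it. The name `pro_cost` is hypothetical.

```python
# Gemini 3.1 Pro Standard rates in USD per 1M tokens, copied from the card above.
TIER_THRESHOLD_TOKENS = 200_000
SHORT_PROMPT_RATES = (2.00, 12.00)   # prompts <= 200K tokens
LONG_PROMPT_RATES = (4.00, 18.00)    # prompts > 200K tokens


def pro_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: the prompt size selects the tier, and that tier's rates
    apply to every token in the request.
    """
    if prompt_tokens <= TIER_THRESHOLD_TOKENS:
        in_rate, out_rate = SHORT_PROMPT_RATES
    else:
        in_rate, out_rate = LONG_PROMPT_RATES
    return (prompt_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    print(f"150K prompt: ${pro_cost(150_000, 4_000):.3f}")
    print(f"400K prompt: ${pro_cost(400_000, 4_000):.3f}")
```

Under that assumption, a 150K-token prompt with 4K output tokens costs about $0.348, while a 400K-token prompt with the same output costs about $1.672.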

Gemini 3.1 Flash-Lite

Best value
Input: $0.25 per MTok
Output: $1.50 per MTok
Official
Usage
Model id gemini-3.1-flash-lite—Google's most cost-efficient tier at $0.25/M text/image/video input and $1.50/M output (Standard).
Models
Optimized for high-volume agent tasks, translation, and simple data processing—Free tier at no charge; Paid adds caching and higher RPS.
Highlights
Batch API at $0.125/M input and $0.75/M output—suited to routing layers, batch classification, and cost-sensitive scale.
Audio input Standard $0.50/M, Batch $0.25/M—estimate separately for speech pipelines.
Best for
High-concurrency light tasks, translation, routing, and sub-agents

Gemini 2.5 Pro

Prior-gen Pro
Input: $1.25 per MTok (prompts ≤200K)
Output: $10 per MTok (prompts ≤200K)
Official
Usage
Model id gemini-2.5-pro: the prior multipurpose flagship, strong at coding and complex reasoning. Prefer 3.5 Flash or 3.1 Pro for new integrations.
Models
Standard: $1.25 input / $10 output per MTok for prompts ≤200K; $2.50 / $15 per MTok above 200K.
Highlights
Free tier available with limits; Grounding with Google Search is free for the first 1,500 requests per day (RPD), then $35 per 1,000 grounded prompts.
Existing 2.5 Pro integrations continue to be billed at these rates; migration plans should weigh the multimodal agent gains of 3.1 Pro.
Best for
Legacy 2.5 Pro integrations, coding and complex reasoning (migrating)
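
The grounding charge quoted for Gemini 2.5 Pro (1,500 RPD free, then $35 per 1k grounded prompts) can be estimated with a sketch like the following. It assumes the overage is prorated per prompt rather than charged in whole blocks of 1,000, which the card does not state; `grounding_cost_per_day` is a hypothetical helper name.

```python
def grounding_cost_per_day(
    grounded_prompts: int,
    free_per_day: int = 1_500,
    usd_per_1k: float = 35.0,
) -> float:
    """Estimate the daily USD charge for Grounding with Google Search.

    Assumption: prompts beyond the free daily allowance are billed
    proportionally at usd_per_1k / 1000 each.
    """
    billable = max(0, grounded_prompts - free_per_day)
    return billable * usd_per_1k / 1_000


if __name__ == "__main__":
    for prompts in (1_000, 1_500, 5_000):
        print(f"{prompts} grounded prompts/day: ${grounding_cost_per_day(prompts):.2f}")
```

With these figures, 5,000 grounded prompts in a day leave 3,500 billable prompts, or about $122.50, on top of token charges.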

Gemini 2.5 Flash

Hybrid reasoning
Input: $0.30 per MTok
Output: $2.50 per MTok
Official
Usage
Model id gemini-2.5-flash—the first hybrid reasoning Flash with thinking budgets and a 1M-token context window.
Models
Standard text/image/video input $0.30/M tokens and $2.50/M output; Free tier available at no token charge.
Highlights
Grounding with Google Search shares a free allowance with Flash-Lite: 500 RPD on Free, 1,500 RPD on Paid.
Suited to production needing controllable thinking depth and 1M context without 3.5 Flash pricing.
Best for
1M context, controllable thinking, mid-cost production API

Gemini 3.1 Flash Image

Image generation
Input: $0.50 per MTok (text/image)
Output: ~$0.067 per image (1K)
Official
Usage
Model id gemini-3.1-flash-image—for fast interactive image generation and editing; Standard text/image input $0.50/M tokens.
Models
Image output billed per token ($60/M image tokens): ~$0.045 at 0.5K, $0.067 at 1K, $0.101 at 2K, $0.151 at 4K—different from plain chat API billing.
Highlights
Batch API at $0.25/M input and $30/M image output (~half price); Paid tier only—suited to high-throughput visual generation pipelines.
Best for
Image generation, editing, and high-throughput visual API
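
Because image output is billed per token, the per-image prices above imply an approximate token count per resolution. The sketch below back-calculates those counts from the listed prices and the $60/M rate; the resulting token figures are derived estimates, not official numbers, and the function names are hypothetical.

```python
# Rates and per-image prices copied from the Gemini 3.1 Flash Image card.
IMAGE_OUTPUT_USD_PER_MTOK = 60.0
BATCH_IMAGE_OUTPUT_USD_PER_MTOK = 30.0
LISTED_PRICE_PER_IMAGE = {"0.5K": 0.045, "1K": 0.067, "2K": 0.101, "4K": 0.151}


def implied_image_tokens(resolution: str) -> int:
    """Back-calculate image tokens per image from the listed Standard price.

    This is an estimate derived from rounded prices, not an official figure.
    """
    price = LISTED_PRICE_PER_IMAGE[resolution]
    return round(price / IMAGE_OUTPUT_USD_PER_MTOK * 1_000_000)


def image_job_cost(resolution: str, images: int, batch: bool = False) -> float:
    """Estimate the USD output cost for a number of images (input tokens excluded)."""
    rate = BATCH_IMAGE_OUTPUT_USD_PER_MTOK if batch else IMAGE_OUTPUT_USD_PER_MTOK
    return implied_image_tokens(resolution) * images * rate / 1_000_000


if __name__ == "__main__":
    for res in LISTED_PRICE_PER_IMAGE:
        print(res, implied_image_tokens(res), "tokens/image (estimated)")
    print(f"1,000 images at 1K, Standard: ${image_job_cost('1K', 1_000):.2f}")
    print(f"1,000 images at 1K, Batch:    ${image_job_cost('1K', 1_000, batch=True):.2f}")
```

Text and image input tokens are billed separately at the input rate and are not included in this estimate.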

Gemini 3.1 Flash Live

Real-time dialogue
Input: $0.75 per MTok (text)
Output: $4.50 per MTok (text)
Official
Usage
Model id gemini-3.1-flash-live-preview—low-latency audio-to-audio real-time dialogue with acoustic nuance, numeric precision, and multimodal awareness.
Models
Paid Standard: text $0.75 input / $4.50 output per MTok; audio $3 per MTok (~$0.005/min) input and $12 per MTok (~$0.018/min) output; image/video $1 per MTok (~$0.002/min) input.
Highlights
Free tier available with rate limits—suited to voice assistants, real-time translation, and voice-first agents; estimate audio/video minute costs before integration.
Best for
Real-time voice dialogue, Live API, and voice-first apps
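
For a first estimate of Live API session costs, the approximate per-minute figures above can be combined as in this sketch. The per-minute rates are the rounded values from this card; the function name `live_session_cost` and the example session lengths are assumptions, and text tokens are left out.

```python
# Approximate Paid Standard per-minute rates in USD, copied from the card above.
AUDIO_INPUT_USD_PER_MIN = 0.005
AUDIO_OUTPUT_USD_PER_MIN = 0.018
VIDEO_INPUT_USD_PER_MIN = 0.002


def live_session_cost(
    audio_in_min: float,
    audio_out_min: float,
    video_in_min: float = 0.0,
) -> float:
    """Estimate the USD cost of one Live session from minutes of media.

    Rough sketch only: uses the rounded per-minute figures and ignores
    any text tokens exchanged during the session.
    """
    return (
        audio_in_min * AUDIO_INPUT_USD_PER_MIN
        + audio_out_min * AUDIO_OUTPUT_USD_PER_MIN
        + video_in_min * VIDEO_INPUT_USD_PER_MIN
    )


if __name__ == "__main__":
    # Example (an assumption): 10 minutes of streamed input audio, 4 minutes of model speech.
    print(f"${live_session_cost(10, 4):.3f} per session")
```

In that example the session costs roughly $0.122, most of it from audio output.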

Notes

  • Prices above are Paid tier Standard processing in USD per 1M tokens; the Free tier in AI Studio offers free input/output on select models (submitted content may be used to improve products). Upgrade to Paid for production.
  • Gemini 3.1 Pro and 2.5 Pro use tiered pricing at ≤200K vs >200K prompt tokens (e.g. 3.1 Pro Standard: $2/$12 vs $4/$18 per MTok).
  • Batch API is ~50% off input/output; context caching adds storage fees (typically $0.50–$4.50 per 1M tokens/hour, model-dependent).
  • Image generation (3.1 Flash Image, etc.) and Live API audio/video use different billing from plain text chat—estimate per use case before integration.
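
The context caching storage fee in the notes can be bounded with a short calculation. The sketch below uses only the $0.50 to $4.50 per 1M tokens/hour range quoted above; the actual rate for a given model is not stated here, so the result is a range rather than a price, and both function names are hypothetical.

```python
# Storage fee range in USD per 1M cached tokens per hour, from the notes above.
CACHE_STORAGE_LOW = 0.50
CACHE_STORAGE_HIGH = 4.50


def cache_storage_cost(cached_tokens: int, hours: float, usd_per_mtok_hour: float) -> float:
    """Return the storage fee in USD for keeping a cache alive for `hours`."""
    return cached_tokens / 1_000_000 * hours * usd_per_mtok_hour


def cache_storage_range(cached_tokens: int, hours: float) -> tuple[float, float]:
    """Return (low, high) storage fee bounds across the quoted rate range."""
    return (
        cache_storage_cost(cached_tokens, hours, CACHE_STORAGE_LOW),
        cache_storage_cost(cached_tokens, hours, CACHE_STORAGE_HIGH),
    )


if __name__ == "__main__":
    # Example (an assumption): a 500K-token cache kept for 24 hours.
    low, high = cache_storage_range(500_000, 24)
    print(f"Storage fee: ${low:.2f} to ${high:.2f}")
```

For a 500K-token cache held for 24 hours, the storage fee falls between $6.00 and $54.00, before any token charges for reading the cached content.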

Supported coding tools

Gemini API · Google AI Studio · Vertex AI · Context Caching · Batch API · Grounding · Live API

Pricing and model data sourced from official vendor websites
