MiMo API

From ¥1 per 1M tokens

MiMo pay-as-you-go API with separate pricing for V2.5 text, ASR, TTS, and web search, independent from Token Plan quota

Token API

Core models

mimo-v2.5-pro, mimo-v2.5, mimo-v2.5-asr, mimo-v2.5-tts, mimo-v2.5-tts-voiceclone, mimo-v2.5-tts-voicedesign

mimo-v2.5-pro

This is the current flagship pay-as-you-go text model in the official docs. Per 1M tokens it costs ¥0.025 for cache hits, ¥3 for uncached input, and ¥6 for output domestically, or $0.0036, $0.435, and $0.87 overseas. At that price it is best suited to complex engineering, critical content generation, and high-value flows where per-call quality matters more than cost.

mimo-v2.5

This is the current mainline standard pay-as-you-go text model, priced domestically at ¥0.02 for cache hits, ¥1 for uncached input, and ¥2 for output per 1M tokens, or $0.0028, $0.14, and $0.28 overseas. Compared with the flagship tier, it is better suited to day-to-day large-scale text usage, knowledge assistants, and routine coding support.

mimo-v2.5-asr

The docs list this as the current ASR model, and unlike text models it is billed by input audio duration at ¥0.5/hour domestically or $0.074/hour overseas. For meeting notes, voice input, and support-call analysis, it acts as the key speech-ingestion capability inside the MiMo platform.

mimo-v2.5-tts-voiceclone

VoiceClone is one of the TTS-series models explicitly listed in the official docs and is temporarily free. It is suited to turning text results into speech that feels personalized, character-driven, or close to a specific vocal identity, which makes it valuable for voice assistants, companion-style interactions, and dubbing workflows.

mimo-v2.5-tts-voicedesign

VoiceDesign is also a currently supported and temporarily free TTS-series model. It is better suited to scenarios that need speech-style design, narrated product experiences, or voice-product prototypes, showing that MiMo’s pay-as-you-go platform covers not only text and ASR but also the speech-output layer.

See the official site for more models

The list above includes further core model names; full details are on the latest official page.

Plan details

mimo-v2.5

Main text model · Recommended

Input

¥1

Output

¥2

Official

Usage

The official docs position mimo-v2.5 as one of the current main pay-as-you-go text models, priced domestically at ¥0.02 for cache hits, ¥1 for uncached input, and ¥2 for output per 1M tokens, making it the more scalable entry point for cost-sensitive, high-frequency text workloads.
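As a rough illustration of how the three rates combine, the sketch below estimates a bill for mimo-v2.5 at the domestic prices quoted above. The helper name and the example token counts are hypothetical, and it assumes cache-hit input, uncached input, and output tokens are metered separately and billed linearly per 1M tokens, with no rounding or minimum charge.

```python
from decimal import Decimal

# Domestic mimo-v2.5 prices in CNY per 1M tokens, as quoted above.
PRICE_CACHE_HIT = Decimal("0.02")
PRICE_UNCACHED_INPUT = Decimal("1")
PRICE_OUTPUT = Decimal("2")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_cny(cache_hit_tokens, uncached_input_tokens, output_tokens):
    """Hypothetical helper: linear per-token billing, no rounding or minimums."""
    total = (
        PRICE_CACHE_HIT * cache_hit_tokens
        + PRICE_UNCACHED_INPUT * uncached_input_tokens
        + PRICE_OUTPUT * output_tokens
    )
    return total / TOKENS_PER_UNIT


# Illustrative month: 200M cached input, 50M uncached input, 20M output tokens.
monthly = estimate_cost_cny(200_000_000, 50_000_000, 20_000_000)
print(f"Estimated monthly cost: ¥{monthly}")
```

With these example volumes the cached input contributes ¥4, the uncached input ¥50, and the output ¥40, so output and uncached input dominate the bill even though cached tokens are the largest share of traffic.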

Models

Its overseas pricing of $0.0028, $0.14, and $0.28 per 1M tokens keeps it meaningfully below the flagship tier across regions, which makes it more suitable for knowledge assistants, general chat, content generation, and day-to-day coding support that must run stably at volume.

Highlights

The docs also emphasize that the normal API is fully separate from Token Plan, so what you get here is the standard open-platform pay-as-you-go capability, better suited to backend services, custom apps, and automated workflows than the Credits-based coding subscription.

Best for

API teams that need a low-cost main text model

mimo-v2.5-pro

Flagship text model

Input

¥3

Output

¥6

Official

Usage

The official docs list mimo-v2.5-pro as the current flagship pay-as-you-go text model, priced domestically at ¥0.025 for cache hits, ¥3 for uncached input, and ¥6 for output per 1M tokens, which is clearly above the standard model and better suited to smaller volumes of high-value workloads.

Models

Its overseas pricing of $0.0036, $0.435, and $0.87 per 1M tokens preserves a clear gap from mimo-v2.5, which makes it more reasonable for complex software engineering, critical result generation, and multi-step agent work where per-call quality matters more.
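To put a number on that gap, the following sketch computes the flagship-to-standard price ratio for each billing component in both regions, using only the prices quoted in this section. The table layout and function name are illustrative choices, not part of any official SDK.

```python
from fractions import Fraction

# Prices per 1M tokens as quoted in this section:
# (cache hit, uncached input, output).
PRICES = {
    "domestic_cny": {
        "mimo-v2.5": ("0.02", "1", "2"),
        "mimo-v2.5-pro": ("0.025", "3", "6"),
    },
    "overseas_usd": {
        "mimo-v2.5": ("0.0028", "0.14", "0.28"),
        "mimo-v2.5-pro": ("0.0036", "0.435", "0.87"),
    },
}
LABELS = ("cache hit", "uncached input", "output")


def pro_to_standard_ratios(region):
    """Return how many times more mimo-v2.5-pro costs than mimo-v2.5."""
    standard = PRICES[region]["mimo-v2.5"]
    pro = PRICES[region]["mimo-v2.5-pro"]
    return {
        label: Fraction(p) / Fraction(s)
        for label, p, s in zip(LABELS, pro, standard)
    }


for region in PRICES:
    for label, ratio in pro_to_standard_ratios(region).items():
        print(f"{region:13s} {label:15s} {float(ratio):.2f}x")
```

On these figures the flagship costs about three times the standard model for uncached input and output in both regions, while the cache-hit rates differ by much less.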

Highlights

Because the official docs also publish cache-hit pricing, the flagship model does not have to be limited to ultra-low-frequency use if your workload can reliably benefit from Prompt Cache and reduce some of the cost pressure through reuse.
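The effect of cache reuse can be sketched as a blended input price: the average cost per 1M input tokens at a given cache-hit rate. The function below is a hypothetical helper built on the domestic mimo-v2.5-pro prices quoted above, and it assumes that cache writes add no cost and that billing is linear in tokens.

```python
from fractions import Fraction

# Domestic mimo-v2.5-pro input prices in CNY per 1M tokens, as quoted above.
CACHE_HIT_PRICE = Fraction("0.025")
UNCACHED_PRICE = Fraction("3")


def blended_input_price(hit_rate):
    """Average input price per 1M tokens at a cache-hit rate between 0 and 1.

    Assumption: cache writes are free and billing is linear in tokens.
    """
    rate = Fraction(str(hit_rate))
    if not 0 <= rate <= 1:
        raise ValueError("hit_rate must be between 0 and 1")
    return rate * CACHE_HIT_PRICE + (1 - rate) * UNCACHED_PRICE


for rate in (0.0, 0.5, 0.8, 0.95):
    price = float(blended_input_price(rate))
    print(f"hit rate {rate:.0%}: ¥{price:.3f} per 1M input tokens")
```

At an 80% hit rate the blended input price is ¥0.62 per 1M tokens, well below the ¥3 uncached rate. Output tokens are unaffected by caching and still cost ¥6 per 1M.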

Best for

Complex generation and engineering scenarios with higher quality demands

mimo-v2.5-asr

ASR

Input audio

¥0.5 per hour

Official

Usage

The official pricing for mimo-v2.5-asr is explicit: ¥0.5/hour domestically and $0.074/hour overseas, metered by the second and prorated. It therefore follows a different cost model from text generation and is easier to budget separately as a voice-input layer.
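A minimal sketch of duration-based billing follows, assuming cost is strictly proportional to whole seconds of input audio. Any rounding or minimum-charge rule the platform applies is not stated here, so none is modeled; the function name and region keys are hypothetical.

```python
from fractions import Fraction

# Hourly ASR rates as quoted above.
RATES_PER_HOUR = {
    "domestic_cny": Fraction("0.5"),
    "overseas_usd": Fraction("0.074"),
}
SECONDS_PER_HOUR = 3600


def asr_cost(audio_seconds, region="domestic_cny"):
    """Prorated cost for a whole number of seconds of input audio.

    Assumption: linear proration by the second, with no rounding applied.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return RATES_PER_HOUR[region] * Fraction(audio_seconds, SECONDS_PER_HOUR)


# Illustrative workload: one 90-minute meeting recording.
meeting_seconds = 90 * 60
print(f"Domestic: ¥{float(asr_cost(meeting_seconds)):.3f}")
print(f"Overseas: ${float(asr_cost(meeting_seconds, 'overseas_usd')):.3f}")
```

Because the only input is audio duration, the estimate needs no assumptions about prompt length or output size, unlike a token-based estimate.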

Models

Since it is not token-billed, workloads such as meeting transcription, support-call analysis, and voice-input understanding become easier to estimate than standard LLM usage, and it is also simpler to separate ASR costs from downstream text-generation costs.

Highlights

If your product first converts audio into text and then hands it to MiMo text models, this ASR tier becomes the key bridge that brings voice workflows into the same open platform without needing a completely separate speech stack.

Best for

API scenarios that need voice input or transcription

TTS series

Temporarily free

Includes mimo-v2.5-tts, mimo-v2.5-tts-voiceclone, and mimo-v2.5-tts-voicedesign

Official

Usage

The platform currently marks mimo-v2.5-tts, mimo-v2.5-tts-voiceclone, and mimo-v2.5-tts-voicedesign as temporarily free, which gives MiMo’s pay-as-you-go platform a notably favorable window for trying and integrating voice output.

Models

Free does not mean secondary: TTS, VoiceClone, and VoiceDesign are explicitly listed as available models, and the free period makes now a good time to validate narration, dubbing, voice-assistant feedback, and multimodal interaction experiences.

Highlights

For teams building products that combine text generation with voice output, the value of this tier is that you can connect the speech layer first and then decide on longer-term usage later, instead of forcing an estimate through the cost logic of text models from the start.

Best for

Teams needing narration, dubbing, or voice-output capabilities

Notes

Domestic pricing is in CNY per 1M tokens and overseas pricing is in USD per 1M tokens. Web search plugins cost ¥16 per 1,000 calls domestically and $5 per 1,000 calls overseas, billed separately from token pricing.
The TTS family and cache writes are temporarily free, with no published end date. ASR costs ¥0.5/hour domestically and $0.074/hour overseas. The MiMo-V2.5 price cuts took effect on 2026-05-27.
New integrations should use the V2.5 family directly. V2 migration: `mimo-v2-pro` and `mimo-v2-omni` have auto-routed to V2.5 pricing since 2026-06-01, `mimo-v2-flash` and `mimo-v2-tts` follow from 2026-06-18, and the V2 family retires on 2026-06-30.
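The migration dates in the last note can be captured in a small lookup, for example to flag legacy model names in a configuration audit. This is a sketch under stated assumptions: the status strings and function name are hypothetical, and the dates are treated as inclusive, which the note does not spell out.

```python
from datetime import date

# Dates taken from the migration note above.
V25_PRICING_FROM = {
    "mimo-v2-pro": date(2026, 6, 1),
    "mimo-v2-omni": date(2026, 6, 1),
    "mimo-v2-flash": date(2026, 6, 18),
    "mimo-v2-tts": date(2026, 6, 18),
}
V2_RETIREMENT = date(2026, 6, 30)


def v2_status(model, on):
    """Classify a V2 model name on a given date.

    Assumption: dates are inclusive, so routing applies on the start date
    itself and the model counts as retired on 2026-06-30.
    """
    start = V25_PRICING_FROM.get(model)
    if start is None:
        return "not a listed V2 model"
    if on >= V2_RETIREMENT:
        return "retired"
    if on >= start:
        return "routed to V2.5 pricing"
    return "not yet routed"


check_date = date(2026, 6, 10)
for name in V25_PRICING_FROM:
    print(f"{name}: {v2_status(name, check_date)}")
```

A check like this only reports where a legacy name stands in the schedule; it does not say which V2.5 model a V2 name maps to, since the note does not list that mapping.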

Supported coding tools

OpenAI API, Anthropic API, MiMo API

Pricing and model data sourced from official vendor websites
