Kimi API
The Kimi open API offers the K2.7 Code coding flagship, the multimodal K2.6, and the classic Moonshot V1 series, all billed per token.
Core models
Current API coding flagship `kimi-k2.7-code`: steadier long-context instruction following and higher coding success rates, reasoning-only. ¥1.30 cache hit, ¥6.50 uncached input, ¥27.00 output per 1M tokens; 262,144-token context.
HighSpeed variant `kimi-k2.7-code-highspeed`: the same model as the standard variant, with ~180 tokens/s output (up to 260 tokens/s in short contexts). ¥2.60 cache hit, ¥13.00 uncached input, ¥54.00 output per 1M tokens; intended for latency-sensitive dev workloads.
Current API flagship `kimi-k2.6`: stronger long-horizon code and agent execution, with vision/video input and reasoning modes. ¥1.10 cache hit, ¥6.50 uncached input, ¥27.00 output per 1M tokens; 262,144-token context.
Mainstream production model `kimi-k2.5`: 256k context with coding, tools, and `$web_search`. ¥0.70 cache hit, ¥4.00 uncached input, ¥21.00 output per 1M tokens; a strong long-term default.
Classic long-text models `moonshot-v1-128k` and `moonshot-v1-128k-vision-preview`: ¥10.00 input, ¥30.00 output per 1M tokens; 131,072-token context. The Vision variant supports image understanding at the same price tier.
Only core models are listed above; see the latest official page for full details.
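As a rough illustration of how the three per-token rates combine, the sketch below estimates the cost of a single request from the prices listed above. The `PRICES` table and the `estimate_cost` helper are hypothetical names for this example and are not part of the Kimi API; the split between cached and uncached input tokens is assumed to be known by the caller.

```python
# Prices in CNY per 1M tokens, copied from the model list above:
# (cache-hit input, uncached input, output).
PRICES = {
    "kimi-k2.7-code": (1.30, 6.50, 27.00),
    "kimi-k2.7-code-highspeed": (2.60, 13.00, 54.00),
    "kimi-k2.6": (1.10, 6.50, 27.00),
    "kimi-k2.5": (0.70, 4.00, 21.00),
}


def estimate_cost(model: str, cached_in: int, uncached_in: int, out: int) -> float:
    """Return the estimated cost in CNY for one request (hypothetical helper)."""
    hit, miss, output = PRICES[model]
    return (cached_in * hit + uncached_in * miss + out * output) / 1_000_000


if __name__ == "__main__":
    # 200,000 input tokens, 150,000 of them served from cache, 4,000 output tokens.
    cost = estimate_cost("kimi-k2.5", 150_000, 50_000, 4_000)
    print(f"CNY {cost:.3f}")
```

With these inputs the K2.5 request comes to roughly ¥0.39, most of it from the 50,000 uncached input tokens, which shows why the cache-hit rate matters for long prompts.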
Plan details
Kimi K2.7 Code
Coding flagship
kimi-k2.7-code is Kimi’s most capable coding model yet, with more reliable instruction following in long contexts and higher programming success rates. It is reasoning-only and suited to complex engineering, multi-step agents, and long-horizon refactors.
kimi-k2.7-code-highspeed shares the same model with faster output (~180 tokens/s, up to 260 tokens/s in short contexts) at double the unit price, which makes it a better fit for latency-sensitive dev interactions.
Uncached input at ¥6.50 and output at ¥27.00 per 1M tokens match K2.6 pricing, but K2.7 Code targets coding: route everyday volume to K2.5 and switch to K2.7 Code for complex code checkpoints.
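The HighSpeed trade-off can be put into numbers. This minimal sketch assumes the quoted ~180 tokens/s holds for the whole response (real throughput will vary) and uses the output prices given above. `highspeed_tradeoff` is a hypothetical helper; the standard variant's speed is not stated here, so no speedup ratio is computed.

```python
STANDARD_OUTPUT_PRICE = 27.00   # CNY per 1M output tokens, kimi-k2.7-code
HIGHSPEED_OUTPUT_PRICE = 54.00  # CNY per 1M output tokens, kimi-k2.7-code-highspeed


def highspeed_tradeoff(output_tokens: int, tokens_per_second: float = 180.0):
    """Return (generation seconds on HighSpeed, standard cost, HighSpeed cost).

    Costs cover output tokens only and are in CNY.
    """
    seconds = output_tokens / tokens_per_second
    standard_cost = output_tokens * STANDARD_OUTPUT_PRICE / 1_000_000
    highspeed_cost = output_tokens * HIGHSPEED_OUTPUT_PRICE / 1_000_000
    return seconds, standard_cost, highspeed_cost


if __name__ == "__main__":
    seconds, standard_cost, highspeed_cost = highspeed_tradeoff(9_000)
    print(f"{seconds:.0f} s, CNY {standard_cost:.3f} vs CNY {highspeed_cost:.3f}")
```

For a 9,000-token response this gives about 50 seconds of generation on HighSpeed, with the output cost doubling from about ¥0.24 to about ¥0.49.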
Kimi K2.6
kimi-k2.6 is the most capable API model today, with stronger and steadier long-horizon coding, better instruction following, and improved self-correction, making it the natural target when routing complex engineering work to a flagship.
Priced above K2.5, it behaves more like a key-path model than a universal default: reserve it for complex, important, low-tolerance tasks and route everyday volume to K2.5.
Kimi K2.5
Production workhorse (recommended)
kimi-k2.5 is the mainstream production model, with balanced agent, coding, vision, and general intelligence performance; cached input at just ¥0.70 per 1M tokens makes it well suited to steady high-volume use.
For a production balance across cost, long-context capability, and feature completeness, K2.5 is usually more practical than sending all traffic to the flagship: route only complex checkpoints to K2.6.
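The routing advice in this section can be sketched as a small selection function. This is only one way to encode it: `choose_model` and its boolean flags are hypothetical, and how a task gets classified as complex, coding-related, or latency-sensitive is left to the caller.

```python
def choose_model(is_complex: bool, is_coding: bool = False,
                 latency_sensitive: bool = False) -> str:
    """Pick a model name following the routing guidance above (illustrative only)."""
    if not is_complex:
        # Everyday volume goes to the production workhorse.
        return "kimi-k2.5"
    if is_coding:
        # Complex code checkpoints go to the coding flagship; the HighSpeed
        # variant costs double and is chosen only when latency matters.
        if latency_sensitive:
            return "kimi-k2.7-code-highspeed"
        return "kimi-k2.7-code"
    # Other complex, low-tolerance tasks go to the general flagship.
    return "kimi-k2.6"


if __name__ == "__main__":
    print(choose_model(is_complex=False))
    print(choose_model(is_complex=True, is_coding=True))
    print(choose_model(is_complex=True))
```

Keeping the default branch on K2.5 means a misclassified task falls back to the cheaper model rather than to a flagship.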
Moonshot V1
Classic generation
moonshot-v1-8k costs ¥2.00 input and ¥10.00 output per 1M tokens for short text and low-cost high-frequency calls; moonshot-v1-32k costs ¥5.00 input and ¥20.00 output for medium-length generation.
moonshot-v1-128k costs ¥10.00 input and ¥30.00 output per 1M tokens with a 131,072-token context for longer documents; Vision variants (8K/32K/128K preview) bill at the same tier as text.
If you only need classic text or image understanding with a clear context-length tier, the V1 family is usually cheaper than the K2 multimodal flagships; complex agents and long-horizon code still fit K2.5 / K2.6 better.
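Because V1 pricing rises with the context tier, a caller would typically pick the smallest tier that fits the request. The sketch below illustrates that choice. The 131,072-token window comes from the text above, while the 8,192 and 32,768 windows for the 8K and 32K tiers are assumptions, as is the hypothetical `pick_v1_tier` helper.

```python
# (model name, context window in tokens, input price, output price);
# prices are CNY per 1M tokens. The 8K and 32K window sizes are assumed.
V1_TIERS = [
    ("moonshot-v1-8k", 8_192, 2.00, 10.00),
    ("moonshot-v1-32k", 32_768, 5.00, 20.00),
    ("moonshot-v1-128k", 131_072, 10.00, 30.00),
]


def pick_v1_tier(prompt_tokens: int, max_output_tokens: int) -> str:
    """Return the cheapest V1 model whose window fits prompt plus output."""
    needed = prompt_tokens + max_output_tokens
    for name, window, _in_price, _out_price in V1_TIERS:  # ordered cheapest first
        if needed <= window:
            return name
    raise ValueError(f"{needed} tokens exceed the largest V1 context window")


if __name__ == "__main__":
    print(pick_v1_tier(6_000, 1_000))
    print(pick_v1_tier(8_000, 1_000))
    print(pick_v1_tier(100_000, 4_000))
```

Requests that exceed the largest window raise an error here; in practice they would be routed to a K2 model with a larger context instead.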
Notes
- Pricing is per 1M tokens, listed as cache hit / uncached input / output: K2.7 Code at ¥1.30 / ¥6.50 / ¥27.00, HighSpeed at ¥2.60 / ¥13.00 / ¥54.00; K2.6 at ¥1.10 / ¥6.50 / ¥27.00; K2.5 at ¥0.70 / ¥4.00 / ¥21.00. Moonshot V1 is priced by context tier (8K/32K/128K), with Vision variants billed at the same tier as text.
- A limited-time recharge bonus runs from 2026-06-12 through 2026-07-02: top up ¥500 or more for a voucher bonus of up to 30%, per platform.kimi.com/docs/pricing/promotion. The end date is subject to the official notice.
- Each successful `$web_search` trigger costs an extra ¥0.03, and search-result tokens are counted on the next `/chat/completions` call—include tool costs when estimating agent flows.
- File extraction and storage APIs are temporarily free, but extracted document content is billed as model input tokens; uploading files alone does not incur file API charges.
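To see how tool fees add to token costs in an agent flow, the following sketch totals a run that triggers `$web_search` several times. It assumes search-result tokens are billed at the model's uncached input rate on the following call (the note above only says they are counted on the next call) and that every search returns the same number of tokens. `agent_flow_cost` and its parameters are hypothetical.

```python
WEB_SEARCH_FEE = 0.03  # CNY per successful $web_search trigger, from the note above


def agent_flow_cost(input_price: float, output_price: float,
                    prompt_tokens: int, output_tokens: int,
                    searches: int, tokens_per_search_result: int) -> float:
    """Estimate the total cost in CNY of an agent run that uses web search.

    Prices are CNY per 1M tokens. Search-result tokens are treated as extra
    uncached input, which is an assumption of this sketch.
    """
    search_fees = searches * WEB_SEARCH_FEE
    input_tokens = prompt_tokens + searches * tokens_per_search_result
    token_cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return search_fees + token_cost


if __name__ == "__main__":
    # K2.5 rates: 4.00 uncached input, 21.00 output; three searches of 5,000 tokens each.
    total = agent_flow_cost(4.00, 21.00, 10_000, 2_000, 3, 5_000)
    print(f"CNY {total:.3f}")
```

In this example the three searches add ¥0.09 in fees plus 15,000 input tokens, so tool usage accounts for well over half of the roughly ¥0.23 total.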
Pricing and model data sourced from official vendor websites