Zhipu (智谱) GLM API
GLM-5.2 flagship pay-as-you-go API with tiered pricing, cache hits, and full multimodal coverage
Core models
New flagship text model with 1M context and open-source SOTA coding—more stable long-horizon execution; supports thinking mode, tools, and MCP.
Previous-generation flagship text model with 200K context, thinking mode, tools, and MCP—still top-tier long-horizon and coding capability.
Text base optimized for complex long tasks and agents with 200K context and strong continuity—list price slightly below GLM-5.2.
Multimodal coding base accepting image/video/file/text with 200K context—for visual agents and frontend replication.
Mainstream high-intelligence text model with 200K context—balanced coding, reasoning, and agents with tiered input/output pricing.
Additional core model names still appear above, with full details on the latest official page.
Plan details
GLM-5.2
New flagship · Recommended
Suited to complex agents, long-horizon coding, project-scale delivery, and production backends that need top-tier reasoning. In Coding Plan, legacy GLM-5.1 and GLM-5 calls are routed here automatically.
Suited to complex agents, long-horizon coding, high-quality doc/PPT production, and production backends needing top reasoning.
A default daily coding model in Coding Plan and a common production default for pay-as-you-go API users.
For workloads with mostly short input and output, actual bills can stay very low over time.
GLM-5V-Turbo
Multimodal coding
Cache-hit input costs ¥1.2 per 1M tokens (<32K band); long multimodal sessions also benefit from caching.
After validation, migrate to paid tiers like GLM-4.7 or GLM-4.5-Air for stable SLA and higher concurrency.
Notes
- `entryPrice` and the tier prices shown here are representative lowest-band quotes, not the final bill for long-context requests. Each request is billed at the band it falls into:
  - GLM-5.2: ¥8/¥28 (cache hit ¥2, 1M context).
  - GLM-5.1: <32K input ¥6/¥24 (cache hit ¥1.3); ≥32K input ¥8/¥28 (cache hit ¥2).
  - GLM-4.7, which also tiers by output ratio: <32K input with <20% output ¥2/¥8; <32K input with ≥20% output ¥3/¥14; 32K–200K input ¥4/¥16.
  - GLM-4.5-Air: <32K low band ¥0.8/¥2; 32K–128K ¥1.2/¥8.
- Free models such as GLM-4.7-Flash, GLM-4-Flash-250414, GLM-4.6V-Flash, GLM-4.1V-Thinking-Flash, GLM-4V-Flash, CogView-3-Flash, and CogVideoX-Flash remain in the API catalog subject to rate and fair-use rules.
- GLM-Image costs ¥0.1/request, CogView-4 ¥0.06/request, and CogVideoX-3 ¥1/request. Speech models such as GLM-TTS and GLM-4-Voice are listed under speech on the pricing page.
- Team Coding Plan overage is billed through team keys at 90% of the API list price; standard Open Platform API keys deduct from the account balance in real time.
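To make the band-based billing in the notes above more concrete, here is a minimal Python sketch that estimates the cost of one GLM-5.1 request. It rests on several assumptions the page does not state: that a quote such as ¥6/¥24 means input/output price per 1M tokens, that "32K" means 32,000 tokens, that the band is chosen by total input length, and that cached input tokens are billed at the cache-hit price in place of the normal input price. The names `Band`, `GLM_5_1_BANDS`, `pick_band`, and `estimate_cost` are hypothetical and not part of any official SDK.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # assumption: prices are quoted per 1M tokens


@dataclass(frozen=True)
class Band:
    max_input_tokens: float  # exclusive upper bound on input length for this band
    input_price: float       # CNY per 1M uncached input tokens
    output_price: float      # CNY per 1M output tokens
    cache_hit_price: float   # CNY per 1M cached input tokens


# GLM-5.1 figures copied from the notes; 32K is assumed to mean 32,000 tokens.
GLM_5_1_BANDS = (
    Band(32_000, 6.0, 24.0, 1.3),
    Band(float("inf"), 8.0, 28.0, 2.0),
)


def pick_band(bands, input_tokens):
    """Return the first band whose upper bound exceeds the input length."""
    for band in bands:
        if input_tokens < band.max_input_tokens:
            return band
    raise ValueError("input length exceeds every configured band")


def estimate_cost(bands, input_tokens, output_tokens, cached_tokens=0, multiplier=1.0):
    """Estimate the cost in CNY of a single request.

    multiplier=0.9 models the 90% rate quoted for Team Coding Plan overage.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    band = pick_band(bands, input_tokens)
    uncached_tokens = input_tokens - cached_tokens
    cost = (
        uncached_tokens * band.input_price
        + cached_tokens * band.cache_hit_price
        + output_tokens * band.output_price
    ) / TOKENS_PER_UNIT
    return cost * multiplier


if __name__ == "__main__":
    # Short request in the <32K band, no cache hits.
    print(round(estimate_cost(GLM_5_1_BANDS, 10_000, 2_000), 4))
    # Same request with 8,000 of the input tokens served from cache.
    print(round(estimate_cost(GLM_5_1_BANDS, 10_000, 2_000, cached_tokens=8_000), 4))
    # Longer request in the >=32K band at the team overage rate.
    print(round(estimate_cost(GLM_5_1_BANDS, 40_000, 1_000, multiplier=0.9), 4))
```

Because the band depends on the request, the same model can bill at different rates from one call to the next, so a table keyed by input length is a more faithful model than a single per-model price. Treat the output as an estimate only and check real charges against the official pricing page.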
Supported coding tools
Pricing and model data sourced from official vendor websites