Kimi API Pricing Calculator & Complete Cost Guide
Calculate Kimi API costs for K2, K2 Thinking, and Moonshot models. OpenAI-compatible API with automatic caching and web search.
Pricing TLDR
- • $5 voucher when you reach $5 cumulative recharge
- • K2 models: $0.60/$2.50 per M tokens (input/output) • K2 Turbo: $1.15/$8.00 (60+ tok/s)
- • Automatic caching: $0.15/M tokens (75% savings) • Web search: $0.005/call
Official pricing:
Moonshot AIKimi API Cost Calculator - Monthly Pricing
Calculate by
Input Tokens
Output Tokens
API Calls / Month
Quick Examples:
Cost Optimization:
Kimi K2 Thinking (kimi-k2-thinking)
Context
Intelligence
Per 1M Tokens
In: $0.60
Out: $2.50
Monthly Cost
Kimi K2 Thinking Turbo (kimi-k2-thinking-turbo)
Context
Intelligence
Per 1M Tokens
In: $1.15
Out: $8.00
Monthly Cost
Kimi K2 0905 (kimi-k2-0905-preview)
Context
Intelligence
Per 1M Tokens
In: $0.60
Out: $2.50
Monthly Cost
Kimi K2 Turbo (kimi-k2-turbo-preview)
Context
Intelligence
Per 1M Tokens
In: $1.15
Out: $8.00
Monthly Cost
Kimi Latest (8K) (kimi-latest)
Context
Intelligence
Per 1M Tokens
In: $0.20
Out: $2.00
Monthly Cost
Moonshot V1 (8K) (moonshot-v1-8k)
Context
Intelligence
Per 1M Tokens
In: $0.20
Out: $2.00
Monthly Cost
Kimi K2 0711 (kimi-k2-0711-preview)
Context
Intelligence
Per 1M Tokens
In: $0.60
Out: $2.50
Monthly Cost
Kimi Latest (32K) (kimi-latest)
Context
Intelligence
Per 1M Tokens
In: $1.00
Out: $3.00
Monthly Cost
Moonshot V1 (32K) (moonshot-v1-32k)
Context
Intelligence
Per 1M Tokens
In: $1.00
Out: $3.00
Monthly Cost
Kimi Latest (128K) (kimi-latest)
Context
Intelligence
Per 1M Tokens
In: $2.00
Out: $5.00
Monthly Cost
Moonshot V1 (128K) (moonshot-v1-128k)
Context
Intelligence
Per 1M Tokens
In: $2.00
Out: $5.00
Monthly Cost
About Kimi API
What is Kimi API?
Kimi API provides access to Moonshot AI's large language models, including the flagship Kimi K2 - a 1 trillion parameter Mixture-of-Experts model with exceptional coding and agentic capabilities. Kimi K2 activates 32 billion parameters per request, achieving frontier performance while maintaining competitive pricing. The API is fully compatible with OpenAI's SDK.
- Frontier Agentic Model: Kimi K2 excels at coding, multi-step reasoning, and tool use. K2 Thinking adds deep reasoning capabilities for complex problem-solving tasks.
- OpenAI-Compatible API: Drop-in replacement for OpenAI API. Use existing SDKs and tools with api.moonshot.ai/v1 endpoint. Supports tool calls, JSON mode, and streaming.
- Automatic Context Caching: Built-in caching reduces input costs by 75%. No configuration needed - the system automatically caches repeated context for cost optimization.
When to Use Kimi API
Choose Kimi K2 for coding and agentic tasks where you need strong performance at budget-friendly prices. Use K2 Turbo for latency-sensitive applications, and K2 Thinking for complex multi-step reasoning.
Ideal for
- AI coding assistants and code generation
- Autonomous AI agents with tool use
- Complex reasoning and analysis tasks
- Cost-effective alternative to premium models
- Applications requiring web search integration
Not ideal for
- Image understanding (use kimi-latest for vision)
- Applications requiring <100ms latency
- Use cases needing very long context (>256K tokens)
- Tasks requiring guaranteed deterministic outputs
Kimi API Pricing Breakdown
Getting Started
Recharge at least $1 to activate your account. When your cumulative recharge reaches $5, you receive a $5 voucher bonus - effectively doubling your first $5 investment.
- Sign up at platform.moonshot.ai
- Recharge minimum $1 to activate API access
- Receive $5 voucher at $5 cumulative recharge
- OpenAI-compatible: use existing SDKs immediately
Cost Optimization Features
Automatic Context Caching (75% Savings)
Cached tokens cost only $0.15/M vs $0.60/M for K2 models. Caching is automatic with no configuration needed. View 'context caching' costs in your console.
Web Search Integration ($0.005/call)
Enable $web_search tool for real-time information. Only charged when search is triggered. Search results count toward token usage in subsequent calls.
K2 Turbo High-Speed Mode
60+ tokens/second output (max 100 tok/s) for latency-sensitive applications. Higher per-token cost but faster completion reduces wait time.
Tiered Rate Limits
Rate limits scale with cumulative recharge: Tier 1 ($10) gets 50 concurrent requests and 200 RPM. Tier 5 ($3000) gets 1000 concurrent and 10,000 RPM.
Kimi API Monthly Cost Estimates
Light Use
$5-30/mo
• Personal projects
• <1K requests/day
• K2 0905 recommended
Medium Use
$30-150/mo
• Small apps
• 1-5K requests/day
• K2 with caching
Heavy Use
$150-500/mo
• Production apps
• 5-20K requests/day
• K2 Turbo for speed
Enterprise
$500+/mo
• Large scale
• 20K+ requests/day
• Contact sales
6 Kimi API Cost Optimization Tips
Leverage Automatic Caching
Kimi's automatic context caching reduces input costs by 75% ($0.15/M vs $0.60/M for K2). Design prompts with consistent system messages and context to maximize cache hits.
Choose the Right K2 Variant
Use K2 0905 ($0.60/$2.50/M) for most tasks. Switch to K2 Turbo ($1.15/$8.00/M) only when speed matters. Use K2 Thinking for complex reasoning that justifies the output cost.
Use Context Length Wisely
kimi-latest auto-selects pricing tier by context: 8K ($0.20/$2.00), 32K ($1.00/$3.00), or 128K ($2.00/$5.00). Keep requests under 8K tokens when possible for lowest cost.
Optimize Web Search Usage
Web search costs $0.005 per call plus token costs for search results. Only enable $web_search when real-time information is needed. Search results add to input tokens in subsequent calls.
Scale Your Rate Limits
Recharge strategically to unlock higher tiers: $10 gets 50 concurrent/200 RPM, $100 gets 200 concurrent/5K RPM. Match your tier to actual throughput needs.
Monitor with CostGoat
Track Kimi API spending per model with CostGoat. Get alerts when cache hit rates drop, when K2 Turbo usage spikes, or when web search costs exceed thresholds.
Kimi Model Selection Guide
Use Case
Code Generation
Recommended Model
Kimi K2 0905
Enhanced Agentic Coding
Monthly Cost (Est.)
~$30-150
Why This Model?
Best coding performance at lowest K2 price
Use Case
AI Agents
Recommended Model
Kimi K2 Turbo
High-Speed
Monthly Cost (Est.)
~$50-200
Why This Model?
Fast responses for interactive agent loops
Use Case
Complex Reasoning
Recommended Model
Kimi K2 Thinking
Deep Reasoning
Monthly Cost (Est.)
~$50-250
Why This Model?
Multi-step reasoning with tool use
Use Case
Image Understanding
Recommended Model
Kimi Latest
Vision + Latest
Monthly Cost (Est.)
~$20-100
Why This Model?
Only model with vision capabilities
Use Case
Simple Chatbot
Recommended Model
Moonshot V1 (8K)
Lowest Cost
Monthly Cost (Est.)
~$10-50
Why This Model?
Budget option for basic conversations
Use Case
Web-Augmented Tasks
Recommended Model
K2 + Web Search
Real-time Info
Monthly Cost (Est.)
~$40-200
Why This Model?
Current information for research tasks
Kimi API Rate Limits & Tiers
User Level
Cumulative Recharge
Concurrency
RPM
TPM
User Level
Cumulative Recharge
Concurrency
RPM
TPM
User Level
Cumulative Recharge
Concurrency
RPM
TPM
User Level
Cumulative Recharge
Concurrency
RPM
TPM
User Level
Cumulative Recharge
Concurrency
RPM
TPM
User Level
Cumulative Recharge
Concurrency
RPM
TPM
RPM: Requests Per Minute | TPM: Tokens Per Minute. Tier 0 has 1.5M tokens/day limit. Tier 1+ has unlimited daily tokens. Vouchers don't count toward cumulative recharge.
Track Your LLM API Costs in Real-Time
Monitor your AI API spending across OpenAI, Anthropic, Google, and other LLM providers. CostGoat is a privacy-first desktop app that tracks usage in real-time. 7-day free trial, then $9/month.
Start Free TrialKimi API Pricing FAQ
Common questions about Kimi API costs, billing, and optimization
