How much does Kimi API cost per month?

Monthly costs depend on model and volume. With Kimi K2.6 (current flagship) at $0.95/$4.00 per M tokens: Light use $5-40/mo, Medium use $40-200/mo, Heavy use $200-800/mo. K2.5 ($0.60/$3.00) is around 30% cheaper for multimodal work. Cache-hit input drops to $0.10-$0.16/M, which can cut monthly costs another 60-80% on workloads with repeated context.

How to get a Kimi API key?

1) Sign up at platform.moonshot.ai, 2) Recharge at least $1 to activate your account, 3) Open the Console and create an API key, 4) Use the OpenAI-compatible endpoint: api.moonshot.ai/v1. Kimi API works with existing OpenAI SDKs — just swap the base URL and key.

Why is Kimi so cheap?

Kimi uses a Mixture-of-Experts (MoE) architecture that activates only a fraction of parameters per request, so Moonshot pays for less compute per token. Automatic context caching saves another 80%+ on repeated content. The result: frontier-quality output at $0.95/$4.00 per M tokens on K2.6 — well below current OpenAI and Anthropic flagships.

Does Kimi API support caching?

Yes, Kimi API has automatic context caching. Cached input tokens are billed at $0.10/M on K2.5, $0.15/M on the legacy K2 series, and $0.16/M on K2.6 — roughly 80-85% off the cache-miss input rate. Caching is automatic with no configuration required; check 'context caching' in your console for per-model cost details.

LAST UPDATED: MAY 21, 2026

Kimi API Pricing Calculator & Complete Cost Guide

Q: Is Kimi K2 API free?

Kimi API is not free but very affordable. You need to recharge at least $1 to start. When your cumulative recharge reaches $5, you receive a $5 voucher bonus — effectively doubling your first $5. At $0.95/$4.00 (K2.6) or $0.60/$3.00 (K2.5) per million tokens, Kimi is one of the cheapest frontier-quality model families on the market.

Calculate Kimi API costs for K2.6 (flagship), K2.5, and the legacy K2 family. OpenAI-compatible API with automatic caching and web search.

Calculator Pricing Guide Examples Save Money FAQ

Pricing TLDR

• $5 voucher when you reach $5 cumulative recharge
• K2.6 (latest): $0.95/$4.00 per M tokens • K2.5: $0.60/$3.00 • Legacy K2 family: $0.60/$2.50 (EOL May 25, 2026)
• Automatic caching: $0.10-$0.16/M tokens (~80-85% savings) • Web search: $0.005/call

Official pricing:

Moonshot AI

•

Quality Scores: Theozard

Kimi API Cost Calculator - Monthly Pricing

Calculate by

TokensWordsCharacters

Input Tokens

Output Tokens

API Calls / Month

Quick Examples:

Cost Optimization:

Context Caching(75% off input)

Web Search($0.005/call)

Kimi K2.6 (kimi-k2.6)

Context

262K

Quality

Per 1M Tokens

In: $0.95

Out: $4.00

Monthly Cost

$2.95

Kimi K2.5 (kimi-k2.5)

Context

262K

Quality

Per 1M Tokens

In: $0.60

Out: $3.00

Monthly Cost

$2.10

Kimi K2 Thinking (kimi-k2-thinking)

Context

262K

Quality

Per 1M Tokens

In: $0.60

Out: $2.50

Monthly Cost

$1.85

Kimi K2 Thinking Turbo (kimi-k2-thinking-turbo)

Context

262K

Quality

Per 1M Tokens

In: $1.15

Out: $8.00

Monthly Cost

$5.15

Moonshot V1 Vision (8K) (moonshot-v1-8k-vision-preview)

Context

Quality

Per 1M Tokens

In: $0.20

Out: $2.00

Monthly Cost

$1.20

Moonshot V1 (8K) (moonshot-v1-8k)

Context

Quality

Per 1M Tokens

In: $0.20

Out: $2.00

Monthly Cost

$1.20

Kimi K2 0905 (kimi-k2-0905-preview)

Context

262K

Quality

Per 1M Tokens

In: $0.60

Out: $2.50

Monthly Cost

$1.85

Moonshot V1 Vision (32K) (moonshot-v1-32k-vision-preview)

Context

32K

Quality

Per 1M Tokens

In: $1.00

Out: $3.00

Monthly Cost

$2.50

Moonshot V1 (32K) (moonshot-v1-32k)

Context

32K

Quality

Per 1M Tokens

In: $1.00

Out: $3.00

Monthly Cost

$2.50

Kimi K2 Turbo (kimi-k2-turbo-preview)

Context

262K

Quality

Per 1M Tokens

In: $1.15

Out: $8.00

Monthly Cost

$5.15

Moonshot V1 Vision (128K) (moonshot-v1-128k-vision-preview)

Context

131K

Quality

Per 1M Tokens

In: $2.00

Out: $5.00

Monthly Cost

$4.50

Moonshot V1 (128K) (moonshot-v1-128k)

Context

131K

Quality

Per 1M Tokens

In: $2.00

Out: $5.00

Monthly Cost

$4.50

Kimi K2 0711 (kimi-k2-0711-preview)

Context

131K

Quality

Per 1M Tokens

In: $0.60

Out: $2.50

Monthly Cost

$1.85

Tired of manually checking your Kimi credits?

Track your Kimi agent quotas and API credits in real-time.

CostGoat desktop app showing AI agent quotas, usage costs, credit balances, and subscriptions

About Kimi API

What is Kimi API?

Kimi API provides access to Moonshot AI's large language models. The current flagship is Kimi K2.6, a multimodal reasoning model with strong agent and coding performance. Kimi K2.5 remains available as a cheaper multimodal alternative, and the legacy K2 family (K2 0905, K2 Turbo, K2 Thinking, K2 0711) is scheduled for end-of-life on May 25, 2026. The API is fully compatible with OpenAI's SDK.

Multimodal Reasoning: Kimi K2.6 supports text and visual input natively with thinking and non-thinking modes. Strong on agent tasks, coding, and visual understanding.
OpenAI-Compatible API: Drop-in replacement for the OpenAI API. Use existing SDKs and tools with the api.moonshot.ai/v1 endpoint. Supports tool calls, JSON mode, and streaming.
Automatic Context Caching: Cached input is billed at $0.10/M on K2.5, $0.15/M on the legacy K2 family, and $0.16/M on K2.6 — roughly 80-85% off cache-miss input rates. No configuration required.

When to Use Kimi API

Choose Kimi K2.6 for the best current quality on agent and coding workloads. Use K2.5 when you want lower per-token cost and don't need K2.6's improvements. Migrate any traffic still on the legacy K2 family before May 25, 2026.

Ideal for

AI coding assistants and code generation
Autonomous AI agents with tool use
Complex multi-step reasoning
Cost-effective alternative to GPT-5.5 and Claude Opus 4.7
Applications requiring built-in web search integration

Not ideal for

Applications requiring <100ms latency
Use cases needing very long context (>262K tokens)
Tasks requiring guaranteed deterministic outputs

Kimi API Pricing Breakdown

Getting Started

Recharge at least $1 to activate your account. When your cumulative recharge reaches $5, you receive a $5 voucher bonus - effectively doubling your first $5 investment.

Sign up at platform.moonshot.ai
Recharge minimum $1 to activate API access
Receive $5 voucher at $5 cumulative recharge
OpenAI-compatible: use existing SDKs immediately

Cost Optimization Features

Automatic Context Caching (~80-85% Savings)

Cached input is billed at $0.10/M on K2.5, $0.15/M on the legacy K2 family, and $0.16/M on K2.6 — versus cache-miss input rates of $0.60-$0.95/M. Caching is automatic with no configuration needed. View 'context caching' costs in your console.

Web Search Integration ($0.005/call)

Enable the $web_search tool for real-time information. Charged only when search is triggered. Search results count toward token usage in subsequent calls.

Multimodal Input

K2.5 and K2.6 accept both text and images natively, with thinking and non-thinking modes. Image tokens follow the same per-million-token rates as text.

Tiered Rate Limits

Rate limits scale with cumulative recharge: Tier 1 ($10) gets 50 concurrent requests and 200 RPM. Tier 5 ($3000) gets 1000 concurrent and 10,000 RPM.

Kimi API Monthly Cost Estimates

Light Use

$5-40/mo

• Personal projects

• <1K requests/day

• K2.5 with caching

Medium Use

$40-200/mo

• Small apps

• 1-5K requests/day

• K2.5 + K2.6 mix

Heavy Use

$200-800/mo

• Production apps

• 5-20K requests/day

• K2.6 with cache hits

Enterprise

$800+/mo

• Large scale

• 20K+ requests/day

• Contact sales for tier 5+

6 Kimi API Cost Optimization Tips

Leverage Automatic Caching

Kimi's automatic context caching drops input cost to $0.10-$0.16/M (roughly 80-85% off cache-miss rates). Design prompts with consistent system messages, retrieval context, and few-shot examples to maximize cache hits.

Choose K2.6 vs K2.5

Pick K2.6 ($0.95/$4.00) when you need the best Kimi quality for agent and coding workloads. Pick K2.5 ($0.60/$3.00) when cost matters more than the K2.6 quality bump and you mostly need multimodal text-and-image work.

Migrate Off the Legacy K2 Family

kimi-k2-0905-preview, kimi-k2-0711-preview, kimi-k2-turbo-preview, kimi-k2-thinking, and kimi-k2-thinking-turbo are scheduled for end-of-life on May 25, 2026. Move production traffic to K2.5 or K2.6 before that date to avoid breaking changes.

Optimize Web Search Usage

Web search costs $0.005 per call plus token costs for search results. Only enable $web_search when real-time information is needed. Search results add to input tokens in subsequent calls.

Scale Your Rate Limits

Recharge strategically to unlock higher tiers: $10 gets 50 concurrent/200 RPM, $100 gets 200 concurrent/5K RPM. Match your tier to actual throughput needs.

Monitor with CostGoat

Track Kimi API spending per model with CostGoat. Get alerts when cache hit rates drop, when K2 Turbo usage spikes, or when web search costs exceed thresholds.

Kimi Model Selection Guide

Use Case

Code Generation & Agents

Recommended Model

Kimi K2.6

Latest Flagship

Monthly Cost (Est.)

~$60-300

Why This Model?

Best Kimi quality for coding, agent loops, and tool use

Use Case

Cost-Optimized Multimodal

Recommended Model

Kimi K2.5

Multimodal + Thinking

Monthly Cost (Est.)

~$30-150

Why This Model?

Cheaper multimodal alternative when K2.6 quality isn't needed

Use Case

Image Understanding

Recommended Model

Kimi K2.6

Native Multimodal

Monthly Cost (Est.)

~$60-300

Why This Model?

Best vision quality in the Kimi lineup

Use Case

Simple Chatbot (cheapest)

Recommended Model

Moonshot V1 (8K)

Lowest Cost

Monthly Cost (Est.)

~$10-50

Why This Model?

Budget option for basic conversations

Use Case

Web-Augmented Tasks

Recommended Model

K2.6 + Web Search

Real-time Info

Monthly Cost (Est.)

~$70-350

Why This Model?

Current information for research tasks ($0.005/call)

Kimi API Rate Limits & Tiers

User Level

Tier 0

Cumulative Recharge

Concurrency

RPM

TPM

500K

User Level

Tier 1

Cumulative Recharge

$10

Concurrency

RPM

200

TPM

User Level

Tier 2

Cumulative Recharge

$20

Concurrency

100

RPM

500

TPM

User Level

Tier 3

Cumulative Recharge

$100

Concurrency

200

RPM

5,000

TPM

User Level

Tier 4

Cumulative Recharge

$1,000

Concurrency

400

RPM

5,000

TPM

User Level

Tier 5

Cumulative Recharge

$3,000

Concurrency

1,000

RPM

10,000

TPM

RPM: Requests Per Minute | TPM: Tokens Per Minute. Tier 0 has 1.5M tokens/day limit. Tier 1+ has unlimited daily tokens. Vouchers don't count toward cumulative recharge.

Start Tracking Your Kimi API Spending

Monitor your Kimi credit balance and K2 agent quotas — see remaining usage and reset countdowns at a glance.

Kimi API Pricing FAQ

Common questions about Kimi API costs, billing, and optimization

Kimi API Pricing Calculator & Complete Cost Guide

Kimi API Cost Calculator - Monthly Pricing

Track your Kimi agent quotas and API credits in real-time.

About Kimi API

What is Kimi API?

When to Use Kimi API

Ideal for

Not ideal for

Kimi API Pricing Breakdown

Getting Started

Cost Optimization Features

Kimi API Monthly Cost Estimates

6 Kimi API Cost Optimization Tips

Kimi Model Selection Guide

Kimi API Rate Limits & Tiers

Start Tracking Your Kimi API Spending

Kimi API Pricing FAQ

Is Kimi K2 API free?

How to get a Kimi API key?

Why is Kimi so cheap?

Does Kimi API support caching?

Related Pricing Calculators