NEW: Real-Time Usage Tracking for AI Agents — track Claude Code, Kimi, Codex & more. Try it free →

CostGoat Logo

CostGoat

LAST UPDATED: MAY 21, 2026

Kimi API Pricing Calculator & Complete Cost Guide

Calculate Kimi API costs for K2.6 (flagship), K2.5, and the legacy K2 family. OpenAI-compatible API with automatic caching and web search.

CalculatorPricing GuideExamplesSave MoneyFAQ

Pricing TLDR

  • $5 voucher when you reach $5 cumulative recharge
  • K2.6 (latest): $0.95/$4.00 per M tokens • K2.5: $0.60/$3.00 • Legacy K2 family: $0.60/$2.50 (EOL May 25, 2026)
  • Automatic caching: $0.10-$0.16/M tokens (~80-85% savings) • Web search: $0.005/call

Official pricing:

Moonshot AI

Quality Scores: Theozard

Kimi API Cost Calculator - Monthly Pricing

Calculate by

Input Tokens

Output Tokens

API Calls / Month

Quick Examples:

Cost Optimization:

Kimi K2.6 (kimi-k2.6)

Context

262K

Quality

89

Per 1M Tokens

In: $0.95

Out: $4.00

Monthly Cost

$2.95

Kimi K2.5 (kimi-k2.5)

Context

262K

Quality

78

Per 1M Tokens

In: $0.60

Out: $3.00

Monthly Cost

$2.10

Kimi K2 Thinking (kimi-k2-thinking)

Context

262K

Quality

68

Per 1M Tokens

In: $0.60

Out: $2.50

Monthly Cost

$1.85

Kimi K2 Thinking Turbo (kimi-k2-thinking-turbo)

Context

262K

Quality

68

Per 1M Tokens

In: $1.15

Out: $8.00

Monthly Cost

$5.15

Moonshot V1 Vision (8K) (moonshot-v1-8k-vision-preview)

Context

8K

Quality

51

Per 1M Tokens

In: $0.20

Out: $2.00

Monthly Cost

$1.20

Moonshot V1 (8K) (moonshot-v1-8k)

Context

8K

Quality

51

Per 1M Tokens

In: $0.20

Out: $2.00

Monthly Cost

$1.20

Kimi K2 0905 (kimi-k2-0905-preview)

Context

262K

Quality

51

Per 1M Tokens

In: $0.60

Out: $2.50

Monthly Cost

$1.85

Moonshot V1 Vision (32K) (moonshot-v1-32k-vision-preview)

Context

32K

Quality

51

Per 1M Tokens

In: $1.00

Out: $3.00

Monthly Cost

$2.50

Moonshot V1 (32K) (moonshot-v1-32k)

Context

32K

Quality

51

Per 1M Tokens

In: $1.00

Out: $3.00

Monthly Cost

$2.50

Kimi K2 Turbo (kimi-k2-turbo-preview)

Context

262K

Quality

51

Per 1M Tokens

In: $1.15

Out: $8.00

Monthly Cost

$5.15

Moonshot V1 Vision (128K) (moonshot-v1-128k-vision-preview)

Context

131K

Quality

51

Per 1M Tokens

In: $2.00

Out: $5.00

Monthly Cost

$4.50

Moonshot V1 (128K) (moonshot-v1-128k)

Context

131K

Quality

51

Per 1M Tokens

In: $2.00

Out: $5.00

Monthly Cost

$4.50

Kimi K2 0711 (kimi-k2-0711-preview)

Context

131K

Quality

44

Per 1M Tokens

In: $0.60

Out: $2.50

Monthly Cost

$1.85

Tired of manually checking your Kimi credits?

Track your Kimi agent quotas and API credits in real-time.

Try free for 7 daysLearn more →

Privacy-first desktop app. No sign-up required.

CostGoat desktop app showing AI agent quotas, usage costs, credit balances, and subscriptions

About Kimi API

What is Kimi API?

Kimi API provides access to Moonshot AI's large language models. The current flagship is Kimi K2.6, a multimodal reasoning model with strong agent and coding performance. Kimi K2.5 remains available as a cheaper multimodal alternative, and the legacy K2 family (K2 0905, K2 Turbo, K2 Thinking, K2 0711) is scheduled for end-of-life on May 25, 2026. The API is fully compatible with OpenAI's SDK.

  • Multimodal Reasoning: Kimi K2.6 supports text and visual input natively with thinking and non-thinking modes. Strong on agent tasks, coding, and visual understanding.
  • OpenAI-Compatible API: Drop-in replacement for the OpenAI API. Use existing SDKs and tools with the api.moonshot.ai/v1 endpoint. Supports tool calls, JSON mode, and streaming.
  • Automatic Context Caching: Cached input is billed at $0.10/M on K2.5, $0.15/M on the legacy K2 family, and $0.16/M on K2.6 — roughly 80-85% off cache-miss input rates. No configuration required.

When to Use Kimi API

Choose Kimi K2.6 for the best current quality on agent and coding workloads. Use K2.5 when you want lower per-token cost and don't need K2.6's improvements. Migrate any traffic still on the legacy K2 family before May 25, 2026.

Ideal for

  • AI coding assistants and code generation
  • Autonomous AI agents with tool use
  • Complex multi-step reasoning
  • Cost-effective alternative to GPT-5.5 and Claude Opus 4.7
  • Applications requiring built-in web search integration

Not ideal for

  • Applications requiring <100ms latency
  • Use cases needing very long context (>262K tokens)
  • Tasks requiring guaranteed deterministic outputs

Kimi API Pricing Breakdown

Getting Started

Recharge at least $1 to activate your account. When your cumulative recharge reaches $5, you receive a $5 voucher bonus - effectively doubling your first $5 investment.

  • Sign up at platform.moonshot.ai
  • Recharge minimum $1 to activate API access
  • Receive $5 voucher at $5 cumulative recharge
  • OpenAI-compatible: use existing SDKs immediately

Cost Optimization Features

Automatic Context Caching (~80-85% Savings)

Cached input is billed at $0.10/M on K2.5, $0.15/M on the legacy K2 family, and $0.16/M on K2.6 — versus cache-miss input rates of $0.60-$0.95/M. Caching is automatic with no configuration needed. View 'context caching' costs in your console.

Web Search Integration ($0.005/call)

Enable the $web_search tool for real-time information. Charged only when search is triggered. Search results count toward token usage in subsequent calls.

Multimodal Input

K2.5 and K2.6 accept both text and images natively, with thinking and non-thinking modes. Image tokens follow the same per-million-token rates as text.

Tiered Rate Limits

Rate limits scale with cumulative recharge: Tier 1 ($10) gets 50 concurrent requests and 200 RPM. Tier 5 ($3000) gets 1000 concurrent and 10,000 RPM.

Kimi API Monthly Cost Estimates

Light Use

$5-40/mo

Personal projects

<1K requests/day

K2.5 with caching

Medium Use

$40-200/mo

Small apps

1-5K requests/day

K2.5 + K2.6 mix

Heavy Use

$200-800/mo

Production apps

5-20K requests/day

K2.6 with cache hits

Enterprise

$800+/mo

Large scale

20K+ requests/day

Contact sales for tier 5+

6 Kimi API Cost Optimization Tips

1

Leverage Automatic Caching

Kimi's automatic context caching drops input cost to $0.10-$0.16/M (roughly 80-85% off cache-miss rates). Design prompts with consistent system messages, retrieval context, and few-shot examples to maximize cache hits.

2

Choose K2.6 vs K2.5

Pick K2.6 ($0.95/$4.00) when you need the best Kimi quality for agent and coding workloads. Pick K2.5 ($0.60/$3.00) when cost matters more than the K2.6 quality bump and you mostly need multimodal text-and-image work.

3

Migrate Off the Legacy K2 Family

kimi-k2-0905-preview, kimi-k2-0711-preview, kimi-k2-turbo-preview, kimi-k2-thinking, and kimi-k2-thinking-turbo are scheduled for end-of-life on May 25, 2026. Move production traffic to K2.5 or K2.6 before that date to avoid breaking changes.

4

Optimize Web Search Usage

Web search costs $0.005 per call plus token costs for search results. Only enable $web_search when real-time information is needed. Search results add to input tokens in subsequent calls.

5

Scale Your Rate Limits

Recharge strategically to unlock higher tiers: $10 gets 50 concurrent/200 RPM, $100 gets 200 concurrent/5K RPM. Match your tier to actual throughput needs.

6

Monitor with CostGoat

Track Kimi API spending per model with CostGoat. Get alerts when cache hit rates drop, when K2 Turbo usage spikes, or when web search costs exceed thresholds.

Kimi Model Selection Guide

Use Case

Code Generation & Agents

Recommended Model

Kimi K2.6

Latest Flagship

Monthly Cost (Est.)

~$60-300

Why This Model?

Best Kimi quality for coding, agent loops, and tool use

Use Case

Cost-Optimized Multimodal

Recommended Model

Kimi K2.5

Multimodal + Thinking

Monthly Cost (Est.)

~$30-150

Why This Model?

Cheaper multimodal alternative when K2.6 quality isn't needed

Use Case

Image Understanding

Recommended Model

Kimi K2.6

Native Multimodal

Monthly Cost (Est.)

~$60-300

Why This Model?

Best vision quality in the Kimi lineup

Use Case

Simple Chatbot (cheapest)

Recommended Model

Moonshot V1 (8K)

Lowest Cost

Monthly Cost (Est.)

~$10-50

Why This Model?

Budget option for basic conversations

Use Case

Web-Augmented Tasks

Recommended Model

K2.6 + Web Search

Real-time Info

Monthly Cost (Est.)

~$70-350

Why This Model?

Current information for research tasks ($0.005/call)

Kimi API Rate Limits & Tiers

User Level

Tier 0

Cumulative Recharge

$1

Concurrency

1

RPM

3

TPM

500K

User Level

Tier 1

Cumulative Recharge

$10

Concurrency

50

RPM

200

TPM

2M

User Level

Tier 2

Cumulative Recharge

$20

Concurrency

100

RPM

500

TPM

3M

User Level

Tier 3

Cumulative Recharge

$100

Concurrency

200

RPM

5,000

TPM

3M

User Level

Tier 4

Cumulative Recharge

$1,000

Concurrency

400

RPM

5,000

TPM

4M

User Level

Tier 5

Cumulative Recharge

$3,000

Concurrency

1,000

RPM

10,000

TPM

5M

RPM: Requests Per Minute | TPM: Tokens Per Minute. Tier 0 has 1.5M tokens/day limit. Tier 1+ has unlimited daily tokens. Vouchers don't count toward cumulative recharge.

Start Tracking Your Kimi API Spending

Monitor your Kimi credit balance and K2 agent quotas — see remaining usage and reset countdowns at a glance.

Try Free for 7 DaysLearn more →

Privacy-first desktop app. 7-day free trial, no credit card required.

CostGoat desktop app showing AI agent quotas, usage costs, credit balances, and subscriptions

Kimi API Pricing FAQ

Common questions about Kimi API costs, billing, and optimization

AI Pricing

Gemini API PricingClaude API PricingGoogle Veo PricingAI Cost CalculatorsReplicate API PricingOpenRouter API PricingOpenRouter Free Models
DownloadsPricingDashboardContactIssuesAffiliatesTermsPrivacy

© 2026 CostGoat. All rights reserved.

Made by Functioncraft: Redis GUI Client · SSH GUI Client

Affiliate disclosure: Some links earn CostGoat a commission or credit when you sign up — no extra cost to you.