CostGoat

LAST UPDATED: DECEMBER 2, 2025

Gemini API Pricing Calculator & Complete Cost Guide

Calculate Gemini API costs for text and chat models per token and per month. Compare 6 models with batch & cache discounts.

Pricing TLDR

  • Generous free tiers for most models (rate-limited)
  • Pay-per-token, input/output per million tokens: 2.5 Flash-Lite ($0.10/$0.40), 2.5 Pro ($1.25/$10), 3 Pro ($2/$12)
  • 50% batch discount, 90% context caching savings, 2x pricing for long context

Official pricing: Google AI

Gemini API Cost Calculator - Monthly Pricing

The monthly cost figures below are consistent with a workload of 1M input tokens and 500K output tokens per month.

  • Gemini 3 Pro Preview (gemini-3-pro-preview): Context 1M, Intelligence 73, In $2.00 / Out $12.00 per 1M tokens, Monthly Cost $8.00
  • Gemini 2.5 Pro (gemini-2.5-pro): Context 1M, Intelligence 60, In $1.25 / Out $10.00 per 1M tokens, Monthly Cost $6.25
  • Gemini 2.5 Flash (gemini-2.5-flash): Context 1M, Intelligence 51, In $0.30 / Out $2.50 per 1M tokens, Monthly Cost $1.55
  • Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite): Context 1M, Intelligence 40, In $0.10 / Out $0.40 per 1M tokens, Monthly Cost $0.30
  • Gemini 2.0 Flash (gemini-2.0-flash): Context 1M, Intelligence 34, In $0.10 / Out $0.40 per 1M tokens, Monthly Cost $0.30
  • Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite): Context 1M, Intelligence 27, In $0.075 / Out $0.30 per 1M tokens, Monthly Cost $0.23
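The arithmetic behind the per-model monthly figures can be sketched in a few lines. This is a minimal illustration, not the calculator's actual code: the `PRICES` table and the `monthly_cost` helper are names made up for this example, and the prices are copied from this page. Running it with 1M input and 500K output tokens per month reproduces the monthly figures shown for each model.

```python
# Prices in USD per 1M tokens (input, output), copied from this page.
# The 2.0 Flash-Lite input price uses the $0.075 figure quoted in the
# Flash-Lite tip on this page.
PRICES = {
    "gemini-3-pro-preview": (2.00, 12.00),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.0-flash": (0.10, 0.40),
    "gemini-2.0-flash-lite": (0.075, 0.30),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly cost in USD for the given monthly token volumes."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Assumed workload: 1M input and 500K output tokens per month.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 1_000_000, 500_000):.2f}")
```

Long-context surcharges, batch discounts, and caching are deliberately left out here; they change the effective per-token price and are covered separately.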

About Gemini API

What is Gemini API?

The Gemini API provides programmatic access to Google's family of multimodal AI models, designed for a wide range of tasks from simple chat to complex reasoning. Gemini models (Pro, Flash, Flash-Lite) offer different capability and price tiers, with generous free tiers and competitive paid pricing.

  • Multiple Model Tiers: Choose from Gemini 3 Pro Preview (most powerful), Gemini 2.5 Pro (best for coding), Gemini 2.5 Flash (hybrid reasoning), Gemini 2.5 Flash-Lite (cost-effective), and Gemini 2.0 Flash/Flash-Lite (balanced/fastest) to match your needs.
  • Massive Context Windows: All Gemini models support 1M token context windows (~750K words). Pro models have tiered pricing: standard rates for ≤200K tokens, 2x rates for >200K tokens.
  • Token-Based Pricing: Pay only for what you use, with separate prices for input and output tokens. For the models listed here, output costs roughly 4-8x as much as input. Generous free tiers are available for most models.

When to Use Gemini API

Start with Gemini 2.0 Flash-Lite or 2.5 Flash-Lite for simple tasks and cost-effectiveness, then upgrade to Pro models as complexity increases. Use batch processing and context caching to significantly reduce costs for production workloads.

Ideal for

  • Customer support chatbots with natural conversation flow
  • Code completion and review in development environments
  • Large document summarization and analysis (1M context)
  • Creative content generation for marketing and blogs
  • Grounded responses using Google Search integration

Not ideal for

  • Real-time applications requiring <100ms latency
  • Simple pattern matching tasks (use cheaper alternatives)
  • Applications needing guaranteed deterministic outputs
  • Tasks requiring specialized domain knowledge without context

Gemini API Pricing Breakdown

Free Tier

Most Gemini models offer generous free tiers with rate limits. On the free tier, Google may use your data to improve its products. No credit card is required to get started.

  • Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite: Free with rate limits
  • Gemini 2.0 Flash, 2.0 Flash-Lite: Free with rate limits
  • Google Search grounding: 500 requests per day (RPD) free (Flash models)
  • Gemini 3 Pro Preview: Paid tier only (no free access)

Cost Optimization Features

Batch API (50% Discount)

Process non-urgent workloads asynchronously at half price. Example: Gemini 2.5 Pro drops to $0.625/$5 per million tokens. Perfect for data processing and content generation. Not available on free tier.
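As a rough illustration of the batch discount, the sketch below halves the standard per-token prices and compares a job at both rates. The helper names (`batch_price`, `job_cost`) and the nightly workload are hypothetical, chosen only for this example.

```python
BATCH_DISCOUNT = 0.50  # 50% off standard rates, as stated above


def batch_price(standard_price: float) -> float:
    """Per-1M-token price after the batch discount."""
    return standard_price * (1 - BATCH_DISCOUNT)


def job_cost(input_tokens: int, output_tokens: int,
             price_in: float, price_out: float, batch: bool = False) -> float:
    """Cost in USD of one job, optionally at batch rates."""
    if batch:
        price_in, price_out = batch_price(price_in), batch_price(price_out)
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical nightly job on Gemini 2.5 Pro: 40M input, 8M output tokens.
standard = job_cost(40_000_000, 8_000_000, 1.25, 10.00)
batched = job_cost(40_000_000, 8_000_000, 1.25, 10.00, batch=True)
print(f"standard ${standard:.2f}, batch ${batched:.2f}")
```

Under these assumptions the job costs $130 at standard rates and $65 through the batch route.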

Context Caching (90% Savings)

Cache frequently used prompts, system messages, or documents. Cache reads cost 10% of base input price. Storage costs $1-4.50 per million tokens per hour depending on model.
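To see how the 10% cache-read rate affects an input bill, here is a minimal sketch. It assumes a fixed fraction of input tokens is served from cache and ignores storage fees entirely; `input_cost_with_cache` and the 100M-token workload are illustrative names and numbers, not part of any API.

```python
def input_cost_with_cache(input_tokens: int, price_in: float,
                          cached_fraction: float,
                          cache_read_ratio: float = 0.10) -> float:
    """Input cost in USD when part of the input is read from cache.

    cached_fraction is the share of input tokens served from cache (0..1).
    cache_read_ratio is the cache-read price relative to the base input
    price (10% per the text above). Storage fees are NOT included.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    return (fresh * price_in + cached * price_in * cache_read_ratio) / 1_000_000


# Hypothetical month: 100M input tokens on Gemini 2.5 Pro ($1.25 per 1M).
uncached = input_cost_with_cache(100_000_000, 1.25, cached_fraction=0.0)
cached = input_cost_with_cache(100_000_000, 1.25, cached_fraction=0.8)
print(f"no cache ${uncached:.2f}, 80% cached ${cached:.2f}")
```

With 80% of input tokens cached, the input bill in this example falls from $125 to $35, a 72% reduction before storage is added back in.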

Grounding with Google Search

Get up-to-date information by grounding responses with Google Search. 500-1,500 RPD free depending on tier, then $35 per 1,000 grounded prompts. Google Maps grounding also available.

Tiered Context Pricing

Pro models (2.5 Pro, 3 Pro) have standard pricing for prompts ≤200K tokens and 2x pricing for prompts >200K tokens. Flash models have flat pricing regardless of context length.
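The tiered rule can be sketched as a per-request cost function. This is a simplification under stated assumptions: it applies the page's 2x figure to the whole request (input and output) once the prompt exceeds 200K tokens, which may not match the official rate table exactly, and `request_cost` is a hypothetical helper.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # tokens, per the text above
LONG_CONTEXT_MULTIPLIER = 2.0     # the page's "2x" figure (assumed to cover the whole request)


def request_cost(prompt_tokens: int, output_tokens: int,
                 price_in: float, price_out: float, tiered: bool) -> float:
    """Cost in USD of a single request.

    tiered=True models a Pro-style model with long-context pricing;
    tiered=False models a Flash-style model with flat pricing.
    """
    multiplier = 1.0
    if tiered and prompt_tokens > LONG_CONTEXT_THRESHOLD:
        multiplier = LONG_CONTEXT_MULTIPLIER
    base = (prompt_tokens * price_in + output_tokens * price_out) / 1_000_000
    return multiplier * base


# Gemini 2.5 Pro, short vs. long prompt, each with 2K output tokens.
short_pro = request_cost(150_000, 2_000, 1.25, 10.00, tiered=True)
long_pro = request_cost(300_000, 2_000, 1.25, 10.00, tiered=True)
# Gemini 2.5 Flash with the same long prompt, flat pricing.
long_flash = request_cost(300_000, 2_000, 0.30, 2.50, tiered=False)
print(f"{short_pro:.4f} {long_pro:.4f} {long_flash:.4f}")
```

In this example, doubling the prompt from 150K to 300K tokens raises the Pro request from about $0.21 to $0.79, while the same 300K prompt on Flash costs about $0.10.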

Gemini API Monthly Cost Estimates

  • Light Use ($0-30/mo): personal projects, <1K requests/day, Flash-Lite free tier
  • Medium Use ($30-150/mo): small apps, 1-5K requests/day, Flash or Flash-Lite
  • Heavy Use ($150-800/mo): production apps, 5-20K requests/day, mixed models
  • Enterprise ($800+/mo): large scale, 20K+ requests/day, Vertex AI available
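To place a workload in one of these bands, one rough approach is to convert requests per day into monthly token volume and price it. The sketch below assumes a 30-day month and fixed average token counts per request; the function names and the sample workload are hypothetical, and the band boundaries are taken from the estimates above.

```python
def monthly_cost_from_requests(requests_per_day: int,
                               input_tokens_per_request: int,
                               output_tokens_per_request: int,
                               price_in: float, price_out: float,
                               days: int = 30) -> float:
    """Monthly cost in USD from daily request volume and average token counts."""
    requests = requests_per_day * days
    input_tokens = requests * input_tokens_per_request
    output_tokens = requests * output_tokens_per_request
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def usage_band(monthly_usd: float) -> str:
    """Map a monthly spend onto the bands listed above."""
    if monthly_usd < 30:
        return "Light"
    if monthly_usd < 150:
        return "Medium"
    if monthly_usd < 800:
        return "Heavy"
    return "Enterprise"


# Hypothetical app: 3,000 requests/day, 1,500 input and 400 output tokens
# per request, on Gemini 2.5 Flash ($0.30 / $2.50 per 1M tokens).
cost = monthly_cost_from_requests(3_000, 1_500, 400, 0.30, 2.50)
print(f"${cost:.2f}/mo -> {usage_band(cost)}")
```

This sample workload comes to $130.50 per month, which lands in the Medium band.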

6 Gemini API Cost Optimization Tips

1. Use Context Caching

Save up to 90% on repeated input by caching frequently used prompts, system messages, or documents. Cache reads cost just 10% of the base input price, and storage costs $1-4.50 per million tokens per hour. Example: across 10K requests on a Pro model with 80% of input tokens served from cache, the input bill falls by about 72% before storage fees.
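Because cached tokens incur an hourly storage fee, caching only pays off above a certain reuse rate. The sketch below estimates that break-even point. It rests on assumptions: `breakeven_reads_per_hour` is a hypothetical helper, and pairing the $4.50 storage rate with a Pro model and the $1.00 rate with a Flash model is a guess, since this page gives only the $1-4.50 range.

```python
def breakeven_reads_per_hour(price_in: float,
                             storage_per_million_hour: float,
                             cache_read_ratio: float = 0.10) -> float:
    """Cache reads per hour at which savings equal the storage fee.

    Each read of the cached content saves (1 - cache_read_ratio) of the
    base input price per 1M tokens; storage costs a flat hourly fee per
    1M tokens. The token count cancels out of the comparison.
    """
    saving_per_read = price_in * (1 - cache_read_ratio)
    return storage_per_million_hour / saving_per_read


# Assumed pairings of input price and storage rate (not stated on this page).
pro = breakeven_reads_per_hour(price_in=1.25, storage_per_million_hour=4.50)
flash = breakeven_reads_per_hour(price_in=0.30, storage_per_million_hour=1.00)
print(f"Pro: {pro:.1f} reads/hour, Flash: {flash:.1f} reads/hour")
```

Under these assumed rates, cached content needs to be reused about four times per hour before caching starts to save money.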

2. Enable Batch API

Get a 50% discount on all paid models by processing non-urgent workloads asynchronously. Gemini 2.5 Pro drops to $0.625/$5 per million tokens (vs. $1.25/$10 standard). Well suited to data processing, content generation, and analysis tasks.

3. Start with Flash-Lite

Use the cheapest models (Gemini 2.0 Flash-Lite at $0.075/$0.30 or 2.5 Flash-Lite at $0.10/$0.40 per M tokens) for classification and routing. Only escalate to expensive Pro models when necessary for complex tasks.
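The payoff from routing can be estimated with a simple blended-cost model. This sketch assumes every request is first handled by the cheap model and a fixed fraction is then escalated to the Pro model with the same token counts; the helper names, token counts, and the 20% escalation rate are illustrative assumptions.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in: float, price_out: float) -> float:
    """Cost in USD of one request at the given per-1M-token prices."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def blended_cost(cheap: float, expensive: float, escalation_rate: float) -> float:
    """Average cost per request when all requests hit the cheap model
    and a fraction (escalation_rate) is re-run on the expensive one."""
    return cheap + escalation_rate * expensive


# Hypothetical request: 1,000 input and 300 output tokens.
lite = request_cost(1_000, 300, 0.10, 0.40)   # Gemini 2.5 Flash-Lite
pro = request_cost(1_000, 300, 1.25, 10.00)   # Gemini 2.5 Pro
routed = blended_cost(lite, pro, escalation_rate=0.20)
print(f"all-Pro ${pro:.5f}, routed ${routed:.5f} per request")
```

With a 20% escalation rate, the average request in this example costs about a quarter of sending everything to Pro.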

4. Leverage Free Tiers

Most Gemini models have generous free tiers with rate limits. Use free tier for development and testing, then upgrade to paid for production. Free tier data may be used to improve Google products.

5. Optimize Context Length

Pro models charge 2x for prompts over 200K tokens. Keep prompts under 200K tokens when possible, or process long documents with Flash models, which have flat pricing regardless of context length.

6. Monitor Gemini API Token Usage

Track Gemini API spending per model with CostGoat's token-level visibility. Get instant alerts when switching from Flash to Pro models, when context caching savings drop unexpectedly, or when batch processing opportunities are missed.
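For a sense of what per-model tracking involves, here is a minimal home-grown sketch. It is not how CostGoat works; `SpendTracker` and its methods are hypothetical, and the token counts are assumed to come from whatever usage data your client library reports for each response.

```python
from collections import defaultdict

# Prices in USD per 1M tokens (input, output), copied from this page.
PRICES = {
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.5-pro": (1.25, 10.00),
}


class SpendTracker:
    """Accumulates estimated spend per model and flags budget overruns."""

    def __init__(self, prices: dict, monthly_budget_usd: float):
        self.prices = prices
        self.monthly_budget_usd = monthly_budget_usd
        self.spend_by_model = defaultdict(float)

    def record(self, model: str, prompt_tokens: int, output_tokens: int) -> float:
        """Add one response's token usage and return its estimated cost."""
        price_in, price_out = self.prices[model]
        cost = (prompt_tokens * price_in + output_tokens * price_out) / 1_000_000
        self.spend_by_model[model] += cost
        return cost

    @property
    def total(self) -> float:
        return sum(self.spend_by_model.values())

    def over_budget(self) -> bool:
        return self.total > self.monthly_budget_usd


tracker = SpendTracker(PRICES, monthly_budget_usd=1.00)
tracker.record("gemini-2.5-flash", prompt_tokens=200_000, output_tokens=50_000)
tracker.record("gemini-2.5-pro", prompt_tokens=100_000, output_tokens=80_000)
print(dict(tracker.spend_by_model), tracker.over_budget())
```

Keeping totals per model makes it easy to spot when traffic shifts from Flash to Pro, which is usually the first sign of a cost jump.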

Gemini Model Selection Guide

  • Customer Support Chat: Gemini 2.0 Flash-Lite (Fastest & Cheapest), est. ~$10-50/mo. Why: fastest, lowest cost, free tier available
  • Code Generation: Gemini 2.5 Pro (Best for Coding), est. ~$80-400/mo. Why: state-of-the-art coding, 1M context, free tier
  • Research & Analysis: Gemini 3 Pro Preview (Highest Intelligence), est. ~$100-500/mo. Why: most powerful, best multimodal understanding
  • Content Writing: Gemini 2.5 Flash (Hybrid Reasoning), est. ~$30-150/mo. Why: good balance of quality and cost, thinking budgets
  • Data Extraction: Gemini 2.5 Flash-Lite + Batch (Best Value), est. ~$15-75/mo. Why: most cost-effective with batch discount
  • Grounded Search: Gemini 2.5 Flash + Google Search, est. ~$50-200/mo. Why: up-to-date info with search grounding

Google Search Grounding Pricing

  • Free Tier (Flash): free quota 500 RPD, paid rate N/A. Shared limit between Flash & Flash-Lite
  • Paid Tier (Flash): free quota 1,500 RPD, paid rate $35 / 1K prompts. Charged after free quota is exceeded
  • Pro Models: free quota N/A (Free) / 1,500 RPD (Paid), paid rate $35 / 1K prompts. Not available on free tier
  • Google Maps Grounding: free quota 500-1,500 RPD, paid rate $25 / 1K prompts. For location-based queries

RPD: Requests Per Day. Each prompt can generate multiple search queries. With dynamic retrieval, only requests containing grounding URLs in the response are charged.
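Grounding charges can be estimated from daily prompt volume and the free quota. The sketch below assumes the free quota resets every day, that every grounded prompt above it is billed, and a 30-day month; it ignores the dynamic-retrieval rule mentioned above. `grounding_cost` and the sample volumes are hypothetical.

```python
def grounding_cost(prompts_per_day: int, free_per_day: int,
                   rate_per_thousand: float = 35.0, days: int = 30) -> float:
    """Monthly grounding charge in USD.

    Prompts up to free_per_day are free each day; the rest are billed
    at rate_per_thousand USD per 1,000 grounded prompts.
    """
    billable_per_day = max(0, prompts_per_day - free_per_day)
    return billable_per_day * days * rate_per_thousand / 1000


# Paid tier, Google Search grounding: 4,000 prompts/day, 1,500 RPD free.
search = grounding_cost(4_000, 1_500)
# Google Maps grounding at $25 per 1K: 2,000 prompts/day, 1,500 RPD free.
maps = grounding_cost(2_000, 1_500, rate_per_thousand=25.0)
print(f"search ${search:.2f}/mo, maps ${maps:.2f}/mo")
```

At these volumes, grounding fees ($2,625 and $375 per month in the two examples) can easily exceed token costs, so they are worth estimating separately.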

© 2025 CostGoat. All rights reserved.