Gemini API Pricing Calculator & Complete Cost Guide
Calculate Gemini API costs for text and chat models per token and per month. Compare 6 models with batch & cache discounts.
Pricing TLDR
- Generous free tiers for most models (rate-limited)
- Pay-per-token, per million tokens (input/output): Flash-Lite ($0.10/$0.40), Pro ($1.25/$10), 3 Pro ($2/$12)
- 50% batch discount, 90% context caching savings, 2x pricing for long context
Official pricing: Google AI
Gemini API Cost Calculator - Monthly Pricing
Prices per 1M tokens (input / output):
- Gemini 3 Pro Preview (gemini-3-pro-preview): $2.00 / $12.00
- Gemini 2.5 Pro (gemini-2.5-pro): $1.25 / $10.00
- Gemini 2.5 Flash (gemini-2.5-flash): $0.30 / $2.50
- Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite): $0.10 / $0.40
- Gemini 2.0 Flash (gemini-2.0-flash): $0.10 / $0.40
- Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite): $0.075 / $0.30
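The calculator's arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual implementation: `PRICES` and `monthly_cost` are hypothetical names, the rates are the per-1M-token figures listed for each model, and free tiers, batch discounts, caching, and long-context surcharges are all ignored.

```python
# USD per 1M tokens as (input, output), taken from the model list in this guide.
PRICES = {
    "gemini-3-pro-preview": (2.00, 12.00),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.0-flash": (0.10, 0.40),
    # Uses the unrounded $0.075 input figure quoted in the optimization tips.
    "gemini-2.0-flash-lite": (0.075, 0.30),
}


def monthly_cost(model, input_tokens, output_tokens, calls_per_month):
    """Estimated monthly USD cost for a fixed per-call token profile."""
    in_price, out_price = PRICES[model]
    per_call = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_call * calls_per_month


# 10,000 calls per month, each with 1,000 input and 500 output tokens.
print(round(monthly_cost("gemini-2.5-flash", 1_000, 500, 10_000), 2))  # 15.5
print(round(monthly_cost("gemini-2.5-pro", 1_000, 500, 10_000), 2))    # 62.5
```

Real traffic rarely has a fixed token profile, so treat the result as a rough average rather than a bill forecast.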
About Gemini API
What is Gemini API?
The Gemini API provides programmatic access to Google's family of multimodal AI models, designed for a wide range of tasks from simple chat to complex reasoning. Gemini models (Pro, Flash, Flash-Lite) offer different capability and price tiers, with generous free tiers and competitive paid pricing.
- Multiple Model Tiers: Choose from Gemini 3 Pro Preview (most powerful), Gemini 2.5 Pro (best for coding), Gemini 2.5 Flash (hybrid reasoning), Gemini 2.5 Flash-Lite (cost-effective), and Gemini 2.0 Flash/Flash-Lite (balanced/fastest) to match your needs.
- Massive Context Windows: All Gemini models support 1M token context windows (~750K words). Pro models have tiered pricing: standard rates for ≤200K tokens, 2x rates for >200K tokens.
- Token-Based Pricing: Pay only for what you use, with separate prices for input and output tokens. For the models listed here, output costs roughly 4-8x more than input, depending on the model. Generous free tiers are available for most models.
When to Use Gemini API
Start with Gemini 2.0 Flash-Lite or 2.5 Flash-Lite for simple tasks and cost-effectiveness, then upgrade to Pro models as complexity increases. Use batch processing and context caching to significantly reduce costs for production workloads.
Ideal for
- Customer support chatbots with natural conversation flow
- Code completion and review in development environments
- Large document summarization and analysis (1M context)
- Creative content generation for marketing and blogs
- Grounded responses using Google Search integration
Not ideal for
- Real-time applications requiring <100ms latency
- Simple pattern matching tasks (use cheaper alternatives)
- Applications needing guaranteed deterministic outputs
- Tasks requiring specialized domain knowledge without context
Gemini API Pricing Breakdown
Free Tier
Most Gemini models offer generous free tiers with rate limits. On the free tier, Google may use your data to improve its products. No credit card is required to get started.
- Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite: Free with rate limits
- Gemini 2.0 Flash, 2.0 Flash-Lite: Free with rate limits
- Google Search grounding: 500 RPD free (Flash models)
- Gemini 3 Pro Preview: Paid tier only (no free access)
Cost Optimization Features
Batch API (50% Discount)
Process non-urgent workloads asynchronously at half price. Example: Gemini 2.5 Pro drops to $0.625/$5 per million tokens. Perfect for data processing and content generation. Not available on free tier.
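As a rough sketch of the batch arithmetic, the snippet below applies the 50% discount to a standard per-million price and to a whole job. The function names and the job size (200M input tokens, 50M output tokens) are assumptions made up for illustration.

```python
BATCH_DISCOUNT = 0.50  # batch requests are billed at half the standard rate


def batch_price(standard_price_per_million):
    """Per-1M-token price after the batch discount."""
    return standard_price_per_million * (1 - BATCH_DISCOUNT)


def job_cost(input_tokens, output_tokens, in_price, out_price, batch=False):
    """USD cost of a job, optionally priced at batch rates."""
    if batch:
        in_price, out_price = batch_price(in_price), batch_price(out_price)
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Gemini 2.5 Pro at $1.25 / $10 per 1M tokens.
print(batch_price(1.25), batch_price(10.00))                            # 0.625 5.0
print(job_cost(200_000_000, 50_000_000, 1.25, 10.00))                   # 750.0
print(job_cost(200_000_000, 50_000_000, 1.25, 10.00, batch=True))       # 375.0
```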
Context Caching (90% Savings)
Cache frequently used prompts, system messages, or documents. Cache reads cost 10% of base input price. Storage costs $1-4.50 per million tokens per hour depending on model.
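Because cached content is billed for storage by the hour, caching only pays off above a certain request rate. The sketch below estimates that break-even point. It is a simplification under stated assumptions: the function names are hypothetical, the $4.50 storage rate is simply the top of the range quoted above (not a confirmed per-model figure), and the cost of the initial cache write is ignored.

```python
CACHE_READ_FRACTION = 0.10  # cache reads cost 10% of the base input price


def hourly_cache_saving(cached_tokens, requests_per_hour, input_price, storage_price):
    """Net USD saved per hour by caching; a negative value means caching costs more.

    input_price is USD per 1M tokens; storage_price is USD per 1M tokens per hour.
    """
    millions = cached_tokens / 1_000_000
    saved_on_reads = millions * requests_per_hour * input_price * (1 - CACHE_READ_FRACTION)
    storage = millions * storage_price
    return saved_on_reads - storage


def break_even_requests_per_hour(input_price, storage_price):
    """Requests per hour at which read savings equal the storage charge."""
    return storage_price / (input_price * (1 - CACHE_READ_FRACTION))


# Assumed: $1.25 input price (Gemini 2.5 Pro) and $4.50 storage per 1M tokens per hour.
print(round(break_even_requests_per_hour(1.25, 4.50), 2))      # 4.0
print(round(hourly_cache_saving(100_000, 20, 1.25, 4.50), 2))  # 1.8
```

Under these assumed rates, a cache that is read fewer than about four times an hour would cost more to store than it saves.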
Grounding with Google Search
Get up-to-date information by grounding responses with Google Search. 500-1,500 RPD free depending on tier, then $35 per 1,000 grounded prompts. Google Maps grounding also available.
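A hedged sketch of the grounding charge: subtract the free daily quota, then bill the remainder at $35 per 1,000 grounded prompts. The function name is hypothetical, and the code assumes the free quota resets every day and that billing is prorated per prompt, which may not match how the charge is actually metered.

```python
def grounding_cost_per_day(grounded_prompts, free_rpd=500, rate_per_1000=35.0):
    """USD charged for one day's grounded prompts after the free quota."""
    billable = max(0, grounded_prompts - free_rpd)
    return billable * rate_per_1000 / 1000


print(grounding_cost_per_day(400))                   # 0.0  (inside the free quota)
print(grounding_cost_per_day(2_000))                 # 52.5 (1,500 billable prompts)
print(grounding_cost_per_day(2_000, free_rpd=1500))  # 17.5 (500 billable prompts)
```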
Tiered Context Pricing
Pro models (2.5 Pro, 3 Pro) have standard pricing for prompts ≤200K tokens and 2x pricing for prompts >200K tokens. Flash models have flat pricing regardless of context length.
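The tiered rule can be sketched as a per-request multiplier that switches on once the prompt exceeds 200K tokens. The names below are hypothetical. The code follows this guide's "2x pricing" wording for both input and output, but the output multiplier is left as a parameter because output rates may scale differently from input rates on the official price sheet.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # prompts above this size are billed at the higher tier


def pro_request_cost(prompt_tokens, output_tokens, in_price, out_price,
                     input_multiplier=2.0, output_multiplier=2.0):
    """USD cost of one Pro request under tiered context pricing.

    The multipliers are assumptions taken from the "2x" wording in this guide.
    """
    if prompt_tokens > LONG_CONTEXT_THRESHOLD:
        in_price *= input_multiplier
        out_price *= output_multiplier
    return (prompt_tokens * in_price + output_tokens * out_price) / 1_000_000


# Gemini 2.5 Pro at $1.25 / $10 per 1M tokens, 2,000 output tokens per request.
print(round(pro_request_cost(150_000, 2_000, 1.25, 10.00), 4))  # 0.2075
print(round(pro_request_cost(300_000, 2_000, 1.25, 10.00), 4))  # 0.79
```

Doubling the prompt from 150K to 300K tokens nearly quadruples the cost here, because the larger prompt is also billed at the higher rate.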
Gemini API Monthly Cost Estimates
- Light Use ($0-30/mo): personal projects, <1K requests/day, Flash-Lite free tier
- Medium Use ($30-150/mo): small apps, 1-5K requests/day, Flash or Flash-Lite
- Heavy Use ($150-800/mo): production apps, 5-20K requests/day, mixed models
- Enterprise ($800+/mo): large scale, 20K+ requests/day, Vertex AI available
6 Gemini API Cost Optimization Tips
Use Context Caching
Save 90% on repeated content by caching frequently used prompts, system messages, or documents. Cache reads cost just 10% of the base input price. Storage costs $1-4.50/M tokens/hour. Example: across 10K requests with 80% cache hits, 80% of the cached input is billed at one-tenth of the base price, which adds up quickly on Pro models.
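To put numbers on the 10K-request example, the sketch below assumes a 50,000-token shared prefix and the Gemini 2.5 Pro input price of $1.25 per 1M tokens. The prefix size and the function name are illustrative assumptions, and storage charges are left out.

```python
CACHE_READ_FRACTION = 0.10  # cache reads cost 10% of the base input price


def cached_input_cost(requests, prefix_tokens, input_price, hit_rate):
    """USD input cost for a shared prefix, given the fraction of requests served from cache."""
    hits = requests * hit_rate
    misses = requests - hits
    per_request = prefix_tokens * input_price / 1_000_000
    return hits * per_request * CACHE_READ_FRACTION + misses * per_request


uncached = cached_input_cost(10_000, 50_000, 1.25, hit_rate=0.0)
cached = cached_input_cost(10_000, 50_000, 1.25, hit_rate=0.8)
print(round(uncached, 2), round(cached, 2))  # 625.0 175.0
print(round(1 - cached / uncached, 2))       # 0.72
```

The effective discount on the cached portion works out to the hit rate times 90%, so an 80% hit rate gives 72% off rather than the headline 90%.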
Enable Batch API
Get 50% discount on all paid models by processing non-urgent workloads asynchronously. Gemini 2.5 Pro drops to $0.625/$5 per M tokens (vs $1.25/$10 standard). Perfect for data processing, content generation, and analysis tasks.
Start with Flash-Lite
Use the cheapest models (Gemini 2.0 Flash-Lite at $0.075/$0.30 or 2.5 Flash-Lite at $0.10/$0.40 per M tokens) for classification and routing. Only escalate to expensive Pro models when necessary for complex tasks.
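The effect of escalating only some requests can be estimated with a blended cost. This is a sketch under assumptions: the function names, the 1,000-input / 500-output token profile, and the 10% escalation rate are invented for illustration, and the cost of any separate classification call is ignored.

```python
def request_cost(input_tokens, output_tokens, in_price, out_price):
    """USD cost of one request at per-1M-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def blended_monthly_cost(requests, escalation_rate, cheap_cost, pro_cost):
    """Monthly USD cost when a fraction of requests is escalated to the Pro model."""
    return requests * ((1 - escalation_rate) * cheap_cost + escalation_rate * pro_cost)


lite = request_cost(1_000, 500, 0.10, 0.40)   # Gemini 2.5 Flash-Lite
pro = request_cost(1_000, 500, 1.25, 10.00)   # Gemini 2.5 Pro

print(round(blended_monthly_cost(100_000, 0.10, lite, pro), 2))  # 89.5
print(round(blended_monthly_cost(100_000, 1.00, lite, pro), 2))  # 625.0
```

With these assumed numbers, sending 10% of traffic to Pro costs about a seventh of sending all of it.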
Leverage Free Tiers
Most Gemini models have generous free tiers with rate limits. Use free tier for development and testing, then upgrade to paid for production. Free tier data may be used to improve Google products.
Optimize Context Length
Pro models charge 2x for prompts >200K tokens. Keep prompts under 200K when possible, or use Flash models which have flat pricing regardless of context length for long document processing.
Monitor Gemini API Token Usage
Track Gemini API spending per model with CostGoat's token-level visibility. Get instant alerts when switching from Flash to Pro models, when context caching savings drop unexpectedly, or when batch processing opportunities are missed.
Gemini Model Selection Guide
- Customer Support Chat: Gemini 2.0 Flash-Lite (Fastest & Cheapest), ~$10-50/mo est. Why: fastest, lowest cost, free tier available.
- Code Generation: Gemini 2.5 Pro (Best for Coding), ~$80-400/mo est. Why: state-of-the-art coding, 1M context, free tier.
- Research & Analysis: Gemini 3 Pro Preview (Highest Intelligence), ~$100-500/mo est. Why: most powerful, best multimodal understanding.
- Content Writing: Gemini 2.5 Flash (Hybrid Reasoning), ~$30-150/mo est. Why: good balance of quality and cost, thinking budgets.
- Data Extraction: Gemini 2.5 Flash-Lite + Batch (Best Value), ~$15-75/mo est. Why: most cost-effective with batch discount.
- Grounded Search: Gemini 2.5 Flash + Google Search, ~$50-200/mo est. Why: up-to-date info with search grounding.
Google Search Grounding Pricing
[Table: Tier, Free Quota, Paid Rate, Notes. Only the Notes column is preserved:]
- Shared limit between Flash & Flash-Lite
- After free quota exceeded
- Not available on free tier
- Location-based queries
RPD: Requests Per Day. Each prompt can generate multiple search queries. With dynamic retrieval, only requests containing grounding URLs in the response are charged.
Track Your Google AI Costs in Real-Time
Whether you're using APIs or subscriptions, CostGoat monitors your actual usage and costs as they happen. Privacy-first desktop app with local credential storage. 7-day free trial, then $9/month.
Gemini API Pricing FAQ
Common questions about Gemini API costs, billing, and optimization
