Gemini API Pricing Calculator & Complete Cost Guide
Calculate Gemini API costs for text and chat models per token and per month. Compare 9 models with batch & cache discounts.
Pricing TLDR
- Generous free tiers for most models (rate-limited)
- Pay-per-token, per 1M tokens (input/output): Flash-Lite ($0.10/$0.40), 3 Flash ($0.50/$3), 2.5 Pro ($1.25/$10), 3 Pro ($2/$12)
- 50% batch discount, 90% context caching savings, audio input roughly 2-7x text pricing depending on model
Gemini API Cost Calculator - Monthly Pricing
Prices per 1M tokens:
- Gemini 3 Pro Preview (gemini-3-pro-preview): $2.00 input, $12.00 output
- Gemini 3 Flash Preview (gemini-3-flash-preview): $0.50 input, $3.00 output
- Gemini 2.5 Pro (gemini-2.5-pro): $1.25 input, $10.00 output
- Gemini 2.5 Flash Preview (gemini-2.5-flash-preview-09-2025): $0.30 input, $2.50 output
- Gemini 2.5 Flash (gemini-2.5-flash): $0.30 input, $2.50 output
- Gemini 2.5 Flash-Lite Preview (gemini-2.5-flash-lite-preview-09-2025): $0.10 input, $0.40 output
- Gemini 2.0 Flash (gemini-2.0-flash): $0.10 input, $0.40 output
- Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite): $0.10 input, $0.40 output
- Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite): $0.075 input, $0.30 output
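The monthly figure the calculator reports is simple arithmetic on the per-token prices. Here is a minimal sketch in Python, assuming a fixed average token count per call. The `PRICES` table is a subset of the models listed above, and the function name and example workload are illustrative, not part of any Google SDK.

```python
# USD per 1M tokens as (input, output), for a subset of the models listed above.
PRICES = {
    "gemini-3-pro-preview": (2.00, 12.00),
    "gemini-3-flash-preview": (0.50, 3.00),
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "gemini-2.0-flash": (0.10, 0.40),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 calls_per_month: int) -> float:
    """Estimated monthly spend in USD at standard (non-batch, uncached) rates."""
    in_price, out_price = PRICES[model]
    # Cost of one call: tokens times price, scaled from "per 1M tokens".
    per_call = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_call * calls_per_month


# Hypothetical workload: 1,000 input + 500 output tokens, 100,000 calls/month.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 1000, 500, 100_000):.2f}")
```

The estimate ignores free-tier quota, batch and caching discounts, and the higher Pro rates for prompts above 200K tokens, so treat it as an upper-level approximation for short prompts.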

About Gemini API
What is Gemini API?
The Gemini API provides programmatic access to Google's family of multimodal AI models, designed for a wide range of tasks from simple chat to complex reasoning. Gemini models (Pro, Flash, Flash-Lite) offer different capability and price tiers, with generous free tiers and competitive paid pricing.
- Multiple Model Tiers: Choose from Gemini 3 Pro Preview (most powerful), Gemini 3 Flash Preview (frontier + speed), Gemini 2.5 Pro (best for coding), Gemini 2.5 Flash (hybrid reasoning), Gemini 2.5 Flash-Lite (cost-effective), and Gemini 2.0 Flash/Flash-Lite (balanced/fastest) to match your needs.
- Massive Context Windows: All Gemini models support 1M token context windows (~750K words). Pro models have tiered pricing: standard rates for ≤200K tokens, 2x rates for >200K tokens.
- Token-Based Pricing: Pay only for what you use, with separate prices for input and output tokens. For the models listed here, output costs roughly 4-8x as much as input. Generous free tiers are available for most models.
When to Use Gemini API
Start with Gemini 2.0 Flash-Lite or 2.5 Flash-Lite for simple tasks and cost-effectiveness, then upgrade to Pro models as complexity increases. Use batch processing and context caching to significantly reduce costs for production workloads.
Ideal for
- Customer support chatbots with natural conversation flow
- Code completion and review in development environments
- Large document summarization and analysis (1M context)
- Creative content generation for marketing and blogs
- Grounded responses using Google Search integration
Not ideal for
- Real-time applications requiring <100ms latency
- Simple pattern matching tasks (use cheaper alternatives)
- Applications needing guaranteed deterministic outputs
- Tasks requiring specialized domain knowledge without context
Gemini API Pricing Breakdown
Free Tier
Most Gemini models offer generous free tiers with rate limits. On the free tier, Google may use your data to improve its products. No credit card is required to get started.
- Gemini 3 Flash Preview, 2.5 Pro, 2.5 Flash, 2.5 Flash-Lite: Free with rate limits
- Gemini 2.0 Flash, 2.0 Flash-Lite: Free with rate limits
- Google Search grounding: 500 RPD free (Flash models)
- Gemini 3 Pro Preview: Paid tier only (no free access)
Cost Optimization Features
Batch API (50% Discount)
Process non-urgent workloads asynchronously at half price. Example: Gemini 2.5 Pro drops to $0.625/$5 per million tokens. Perfect for data processing and content generation. Not available on free tier.
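As a quick check of the batch arithmetic, the sketch below applies the 50% discount to a per-1M-token price and estimates a bill when only part of a workload can wait for batch processing. The function names and the 40% share in the example are hypothetical.

```python
def batch_price(standard_price: float, discount: float = 0.50) -> float:
    """Per-1M-token price after the batch discount."""
    return standard_price * (1 - discount)


def blended_bill(standard_bill: float, batch_share: float,
                 discount: float = 0.50) -> float:
    """Bill when `batch_share` (0..1) of the spend moves to the Batch API."""
    if not 0 <= batch_share <= 1:
        raise ValueError("batch_share must be between 0 and 1")
    return standard_bill * (1 - batch_share * discount)


# Gemini 2.5 Pro: $1.25 / $10 standard becomes $0.625 / $5 in batch.
print(batch_price(1.25), batch_price(10.00))

# Hypothetical: a $1,000/month bill with 40% of traffic moved to batch.
print(round(blended_bill(1000.0, 0.40), 2))
```

Moving a share of the traffic to batch reduces the total bill by that share times the discount, so shifting 40% of spend saves 20% overall.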
Context Caching (90% Savings)
Cache frequently used prompts, system messages, or documents. Cache reads cost 10% of base input price. Storage costs $1-4.50 per million tokens per hour depending on model.
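Because cached tokens also incur an hourly storage fee, caching only pays off when the cache is read often enough. The sketch below estimates the net hourly saving and the break-even read rate. It assumes the storage fee is charged per cached token per hour, ignores the cost of the initial cache write, and pairs Gemini 2.5 Pro's $1.25 input price with the top of the quoted storage range ($4.50) purely for illustration.

```python
def cache_net_saving_per_hour(cached_tokens: int, reads_per_hour: float,
                              input_price: float, storage_price: float,
                              read_fraction: float = 0.10) -> float:
    """Net USD saved per hour by caching `cached_tokens`.

    input_price and storage_price are USD per 1M tokens (storage: per hour).
    read_fraction is the share of the base input price charged for a cache read.
    """
    saved_on_reads = (reads_per_hour * cached_tokens * input_price
                      * (1 - read_fraction) / 1_000_000)
    storage = cached_tokens * storage_price / 1_000_000
    return saved_on_reads - storage


def break_even_reads_per_hour(input_price: float, storage_price: float,
                              read_fraction: float = 0.10) -> float:
    """Reads per hour at which read savings exactly cover storage."""
    return storage_price / (input_price * (1 - read_fraction))


# Illustrative: 100K cached tokens, 20 reads/hour, $1.25 input, $4.50 storage.
print(round(cache_net_saving_per_hour(100_000, 20, 1.25, 4.50), 2))
print(round(break_even_reads_per_hour(1.25, 4.50), 2))
```

Under these assumptions the break-even point does not depend on how many tokens are cached, only on the ratio of storage price to input price.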
Grounding with Google Search
Get up-to-date information by grounding responses with Google Search. 500-1,500 RPD free depending on tier. Gemini 3 models: $14/1K queries. Gemini 2.x models: $35/1K prompts. Google Maps grounding: $25/1K prompts.
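Since Gemini 3 models are billed per search query and Gemini 2.x models per grounded prompt, the same traffic can cost quite different amounts. The sketch below is one way to estimate a daily grounding bill. It assumes the free quota resets daily and that only usage above it is billed; the three-queries-per-prompt figure and the absence of a free quota in the Gemini 3 example are hypothetical.

```python
def daily_grounding_cost(units_per_day: int, rate_per_1k: float,
                         free_per_day: int = 0) -> float:
    """USD per day for grounding.

    `units_per_day` counts search queries (Gemini 3) or grounded
    prompts (Gemini 2.x, Google Maps), matching how each is billed.
    """
    billable = max(0, units_per_day - free_per_day)
    return billable * rate_per_1k / 1000


# Gemini 2.x: 2,000 grounded prompts/day, 1,500 free, $35 per 1K prompts.
print(daily_grounding_cost(2000, 35.0, free_per_day=1500))

# Gemini 3: the same 2,000 prompts, assuming each triggers 3 search
# queries (hypothetical) and no free quota, at $14 per 1K queries.
print(daily_grounding_cost(2000 * 3, 14.0))
```

The per-query model is cheaper per unit but scales with how many searches each prompt triggers, which the caller does not directly control.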
Tiered Context Pricing
Pro models (2.5 Pro, 3 Pro) have standard pricing for prompts ≤200K tokens and 2x pricing for prompts >200K tokens. Flash models have flat pricing regardless of context length.
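The 200K threshold creates a step in the price curve. The sketch below illustrates it for input tokens, assuming the whole prompt is billed at the higher rate once it exceeds the threshold (rather than only the excess tokens) and that the multiplier is the 2x stated above. Both are assumptions to verify against the official price list.

```python
def pro_input_cost(prompt_tokens: int, base_price: float,
                   threshold: int = 200_000, multiplier: float = 2.0) -> float:
    """USD input cost for one prompt on a tiered-price Pro model.

    base_price is USD per 1M input tokens for prompts at or under the threshold.
    """
    price = base_price * multiplier if prompt_tokens > threshold else base_price
    return prompt_tokens * price / 1_000_000


# Gemini 2.5 Pro at $1.25 per 1M input tokens.
for tokens in (150_000, 200_000, 200_001, 400_000):
    print(tokens, round(pro_input_cost(tokens, 1.25), 4))
```

Under this assumption a prompt one token over the threshold costs about twice as much as one exactly at it, which is why trimming a prompt to just under 200K can be worthwhile.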
Audio Input Pricing Premiums
Audio input costs 2-7x more than text input depending on the model. Gemini 3 Flash: 2x ($1.00 vs $0.50). Gemini 2.5 Flash: 3.33x ($1.00 vs $0.30). Gemini 2.5 Flash-Lite: 3x ($0.30 vs $0.10). Gemini 2.0 Flash: 7x ($0.70 vs $0.10). Audio output pricing remains the same as text output.
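The multipliers quoted above follow directly from the text and audio input prices. A small sketch that recomputes them (the `AUDIO_RATES` table name is illustrative; the prices are the ones given in the paragraph):

```python
# USD per 1M input tokens as (text, audio), from the figures above.
AUDIO_RATES = {
    "Gemini 3 Flash": (0.50, 1.00),
    "Gemini 2.5 Flash": (0.30, 1.00),
    "Gemini 2.5 Flash-Lite": (0.10, 0.30),
    "Gemini 2.0 Flash": (0.10, 0.70),
}


def audio_multiplier(model: str) -> float:
    """How many times more audio input costs than text input."""
    text_price, audio_price = AUDIO_RATES[model]
    return audio_price / text_price


for name in AUDIO_RATES:
    print(f"{name}: {audio_multiplier(name):.2f}x")
```

Note that the model with the lowest audio price in absolute terms (Gemini 2.5 Flash-Lite at $0.30) is not the one with the lowest multiplier, so compare absolute prices when budgeting audio workloads.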
Gemini API Monthly Cost Estimates
- Light Use ($0-30/mo): personal projects, <1K requests/day, Flash-Lite free tier
- Medium Use ($30-150/mo): small apps, 1-5K requests/day, Flash or Flash-Lite
- Heavy Use ($150-800/mo): production apps, 5-20K requests/day, mixed models
- Enterprise ($800+/mo): large scale, 20K+ requests/day, Vertex AI available
6 Gemini API Cost Optimization Tips
Use Context Caching
Save up to 90% on repeated content by caching frequently used prompts, system messages, or documents. Cache reads cost just 10% of the base input price; storage costs $1-4.50 per million tokens per hour. Example: across 10K requests with an 80% cache hit rate, most input tokens are billed at one-tenth of the base rate, which matters most at Pro-model prices.
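To put rough numbers on the 10K-request example, the sketch below treats the 80% hit rate as the share of input tokens served from cache and assumes 50,000 input tokens per request on Gemini 2.5 Pro ($1.25 per 1M input tokens). The per-request token count is hypothetical, and storage fees are left out.

```python
def input_cost_with_cache(requests: int, tokens_per_request: int,
                          input_price: float, hit_rate: float,
                          read_fraction: float = 0.10) -> tuple[float, float]:
    """Return (uncached_cost, cached_cost) in USD for input tokens only.

    hit_rate is the share of input tokens read from cache (0..1);
    cache reads are billed at read_fraction of the base input price.
    """
    total_millions = requests * tokens_per_request / 1_000_000
    uncached = total_millions * input_price
    # Misses pay full price; hits pay the reduced cache-read price.
    effective_share = (1 - hit_rate) + hit_rate * read_fraction
    return uncached, uncached * effective_share


uncached, cached = input_cost_with_cache(10_000, 50_000, 1.25, 0.80)
print(f"without cache: ${uncached:.2f}")
print(f"with cache:    ${cached:.2f}")
print(f"saved:         {100 * (1 - cached / uncached):.0f}%")
```

The overall input saving is the hit rate times the 90% read discount, so an 80% hit rate yields about 72% off input costs before storage is counted, not the full 90%.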
Enable Batch API
Get a 50% discount on all paid models by processing non-urgent workloads asynchronously. Gemini 2.5 Pro drops to $0.625/$5 per M tokens (vs. $1.25/$10 standard). Well suited to data processing, content generation, and analysis tasks.
Start with Flash-Lite
Use the cheapest models (Gemini 2.0 Flash-Lite at $0.075/$0.30 or 2.5 Flash-Lite at $0.10/$0.40 per M tokens) for classification and routing. Only escalate to expensive Pro models when necessary for complex tasks.
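The saving from this routing pattern depends on how often requests escalate. The sketch below assumes every request is first handled by Gemini 2.5 Flash-Lite and a fraction is then re-sent to Gemini 2.5 Pro. The token counts and the 10% escalation rate are hypothetical, and the function names are illustrative.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """USD cost of one call given per-1M-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def routed_cost(cheap_cost: float, pro_cost: float,
                escalation_rate: float) -> float:
    """Average USD per request when all requests hit the cheap model
    first and `escalation_rate` (0..1) of them are re-sent to Pro."""
    return cheap_cost + escalation_rate * pro_cost


# Hypothetical request shape: 1,000 input + 200 output tokens.
lite = request_cost(1000, 200, 0.10, 0.40)   # Gemini 2.5 Flash-Lite
pro = request_cost(1000, 200, 1.25, 10.00)   # Gemini 2.5 Pro

print(f"all Pro:      ${pro:.6f} per request")
print(f"routed (10%): ${routed_cost(lite, pro, 0.10):.6f} per request")
```

Routing stops paying off once the escalation rate is high enough that the extra Flash-Lite call on every request outweighs the Pro calls avoided.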
Leverage Free Tiers
Most Gemini models have generous free tiers with rate limits. Use free tier for development and testing, then upgrade to paid for production. Free tier data may be used to improve Google products.
Optimize Context Length
Pro models charge 2x for prompts >200K tokens. Keep prompts under 200K when possible, or use Flash models which have flat pricing regardless of context length for long document processing.
Monitor Gemini API Token Usage
Track Gemini API spending per model with CostGoat's token-level visibility. Get instant alerts when switching from Flash to Pro models, when context caching savings drop unexpectedly, or when batch processing opportunities are missed.
Gemini Model Selection Guide
- Customer Support Chat: Gemini 2.0 Flash-Lite (Fastest & Cheapest). Est. monthly cost: ~$10-50. Why: fastest, lowest cost, free tier available.
- Code Generation: Gemini 2.5 Pro (Best for Coding). Est. monthly cost: ~$80-400. Why: state-of-the-art coding, 1M context, free tier.
- Research & Analysis: Gemini 3 Pro Preview (Highest Quality). Est. monthly cost: ~$100-500. Why: most powerful, best multimodal understanding.
- Content Writing: Gemini 2.5 Flash (Hybrid Reasoning). Est. monthly cost: ~$30-150. Why: good balance of quality and cost, thinking budgets.
- Data Extraction: Gemini 2.5 Flash-Lite + Batch (Best Value). Est. monthly cost: ~$15-75. Why: most cost-effective with batch discount.
- Grounded Search: Gemini 2.5 Flash + Google Search. Est. monthly cost: ~$50-200. Why: up-to-date info with search grounding.
Google Search Grounding Pricing
Notes by model:
- Gemini 3 Pro/Flash: coming soon for Gemini 3 models
- Gemini 2.5 Pro: not available on free tier
- Gemini 2.5 Flash/Lite: shared limit between Flash & Flash-Lite
- Gemini 2.0 Flash: paid rate applies after free quota exceeded
- Google Maps Grounding: location-based queries
RPD: Requests Per Day. Each prompt can generate multiple search queries. Gemini 3 models use per-query pricing while Gemini 2.x models use per-prompt pricing.

Gemini API Pricing FAQ
Common questions about Gemini API costs, billing, and optimization
