OpenAI Embeddings Pricing Calculator & Cost Guide
Calculate OpenAI Embeddings API costs for semantic search, RAG, and document indexing. Compare all 3 embedding models with batch discounts.
Pricing TLDR
- $5 free credits for new users (enough to embed ~250M tokens with the cheapest model)
- Pay-per-token: text-embedding-3-small ($0.02) • ada-002 ($0.10) • 3-large ($0.13) per 1M tokens
- Batch API 50% off • No output tokens (only input is charged)
Official pricing: OpenAI

OpenAI Embeddings Cost Calculator
Estimate cost from your document count and average length (1 word ≈ 1.33 tokens, so a 600-word document is ≈ 798 tokens):

| Model | Dimensions | Price per 1M Tokens (Standard) |
| --- | --- | --- |
| text-embedding-3-small | 1536 | $0.02 |
| text-embedding-3-large | 3072 | $0.13 |
| text-embedding-ada-002 | 1536 | $0.10 |
Note: Embeddings are billed on input tokens only; unlike chat models, there are no output tokens. Costs shown are for one-time indexing. If you re-index documents, multiply by frequency (e.g., monthly re-indexing makes the annual cost 12× the one-time cost).
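The arithmetic in the note above can be wrapped in a small estimator. A minimal sketch, with prices hard-coded from this page (`embedding_cost` is our own helper name, not part of any API):

```python
# Rough embedding-cost estimator. Prices are USD per 1M input tokens,
# keyed by tier, as listed on this page.
PRICES = {
    "text-embedding-3-small": {"standard": 0.02, "batch": 0.01},
    "text-embedding-3-large": {"standard": 0.13, "batch": 0.065},
    "text-embedding-ada-002": {"standard": 0.10, "batch": 0.05},
}

def embedding_cost(total_tokens: int, model: str, tier: str = "standard",
                   reindexes_per_year: int = 1) -> float:
    """One indexing pass times how many times per year you (re)index."""
    price_per_million = PRICES[model][tier]
    return total_tokens / 1_000_000 * price_per_million * reindexes_per_year

# 10,000 docs x 500 tokens = 5M tokens, re-indexed monthly, Batch tier:
print(embedding_cost(5_000_000, "text-embedding-3-small", "batch", 12))
```

With `reindexes_per_year=12` the Batch-tier annual cost works out to about $0.60; a single Standard-tier pass over the same 5M tokens is the $0.10 figure used in the examples below.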
About OpenAI Embeddings
What is OpenAI Embeddings?
OpenAI Embeddings API converts text into high-dimensional vector representations (1536 or 3072 numbers) that capture semantic meaning. These embeddings enable applications to understand text similarity, search by meaning rather than keywords, and build context-aware AI systems. The API offers three models: text-embedding-3-small (most cost-effective), text-embedding-3-large (highest quality), and text-embedding-ada-002 (legacy). Embeddings are the foundation for RAG (Retrieval Augmented Generation), semantic search, recommendation systems, and clustering.
- Three Models for Different Needs: text-embedding-3-small at $0.02/$0.01 per 1M tokens (Standard/Batch), 1536 dimensions, the best cost-to-performance ratio for most use cases; text-embedding-3-large at $0.13/$0.065, 3072 dimensions, highest quality for complex semantic tasks; text-embedding-ada-002 at $0.10/$0.05, a legacy model (use 3-small instead).
- Batch API for 50% Savings: Process embeddings within 24 hours at half price using Batch API. Perfect for indexing large document stores, one-time migrations, or periodic updates. Standard tier for real-time embedding generation, Batch tier for background processing.
- Common Use Cases: Semantic search: Find similar documents by meaning. RAG systems: Retrieve relevant context for LLM prompts. Recommendations: Suggest similar products/content. Clustering: Group similar documents. Classification: Categorize text by semantic similarity. Anomaly detection: Identify outliers in text data.
When to Use OpenAI Embeddings
Use text-embedding-3-small for 95% of use cases - it offers the best cost-to-performance ratio. Only upgrade to text-embedding-3-large when semantic quality is critical (legal documents, research, complex domain-specific content). Use Batch API whenever real-time processing isn't required to save 50%.
Ideal for
- RAG systems with text-embedding-3-small + Batch API for initial indexing
- Semantic search across documentation, knowledge bases, or content libraries
- Product recommendations based on description similarity
- Customer support ticket routing and similar issue detection
- Code search and duplicate code detection
Not ideal for
- Image embeddings (these models are text-only; you would need a multimodal embedding model such as CLIP)
- Real-time embedding generation at scale (consider caching strategies)
- Multilingual use cases requiring language-specific models
- Tasks requiring fine-tuned domain-specific embeddings
OpenAI Embeddings Pricing Breakdown
Free Tier
New OpenAI API users receive $5 in free credits (no credit card required) that can be used for embeddings. These credits can generate approximately 250M tokens using text-embedding-3-small, enough to embed ~500,000 documents at 500 tokens each.
- Sign up at platform.openai.com - no credit card required
- Receive $5 free credits instantly upon registration
- Credits expire after 3 months from grant date
- Works across all embedding models
- Sufficient for testing and small-scale projects
Model Comparison
text-embedding-3-small (Recommended)
Best cost-to-performance ratio. $0.02 per 1M tokens (Standard) or $0.01 (Batch). 1536 dimensions. Excellent quality for semantic search, RAG, and most production use cases. Example: embedding 10,000 documents (500 tokens each = 5M tokens) costs $0.10 Standard or $0.05 Batch.
text-embedding-3-large (Premium)
Highest quality embeddings. $0.13 per 1M tokens (Standard) or $0.065 (Batch). 3072 dimensions for maximum semantic precision. Use only when quality is critical (legal, research, complex domains). 6.5x more expensive than 3-small with marginal quality improvement for most tasks.
text-embedding-ada-002 (Legacy)
Legacy model. $0.10 per 1M tokens (Standard) or $0.05 (Batch). 1536 dimensions. Replaced by text-embedding-3-small which offers better performance at 5x lower cost. Only use if you need backward compatibility with existing embeddings.
Pricing Tiers
Standard Tier (Real-time)
Default tier for immediate embedding generation. All models available. Use for real-time semantic search, live RAG queries, and interactive applications. No volume commitments or delays.
Batch Tier (50% Discount)
Process embeddings within 24 hours at half price. Perfect for initial document indexing, periodic re-indexing, or migration projects. Example: indexing 100,000 docs (50M tokens) with text-embedding-3-small costs $1.00 Standard vs $0.50 Batch.
Technical Specifications
Token Limits & Performance
Max input: 8,191 tokens per API call. Dimensions: 1536 (small/ada) or 3072 (large). Shortening dimensions supported via API parameter. Normalized vectors (cosine similarity ready). Processing speed: ~1000 embeddings/minute on Standard tier.
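Because the returned vectors are already L2-normalized, cosine similarity reduces to a plain dot product. A minimal pure-Python sketch (no API call; the example vectors are made-up stand-ins for real embeddings):

```python
import math

def normalize(v: list[float]) -> list[float]:
    """L2-normalize a vector, as the API does before returning embeddings."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """For unit-length vectors, cosine similarity is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

a = normalize([1.0, 2.0, 3.0])
b = normalize([3.0, 2.0, 1.0])
print(cosine_similarity(a, a))  # identical vectors: ~1.0
print(cosine_similarity(a, b))  # different vectors: something less than 1
```

Skipping the per-query norm computation matters at scale: most vector databases exploit exactly this property when you configure them for dot-product similarity.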
Cost Components
Only input tokens are charged (no output cost). One-time indexing cost + optional re-indexing costs (if data changes). Storage costs NOT included (use your own vector database). No minimum spend or subscription fees - pure pay-per-token.
Integration & Storage
Works with popular vector databases: Pinecone, Weaviate, Qdrant, Milvus, pgvector, ChromaDB. Use with LangChain, LlamaIndex for RAG. API supports batching up to 2048 embeddings per request for efficiency.
7 OpenAI Embeddings Cost Optimization Tips
Always Use text-embedding-3-small Unless You Have a Specific Reason Not To
text-embedding-3-small costs $0.02/$0.01 per 1M tokens vs text-embedding-3-large at $0.13/$0.065 (6.5x difference). For 95% of use cases, the quality difference doesn't justify the cost. Start with 3-small and only upgrade to 3-large if your retrieval metrics prove you need it. Avoid text-embedding-ada-002 entirely - it costs 5x more than 3-small with worse performance.
Use Batch API for Initial Indexing (50% Savings)
Process large document collections with Batch API to save 50%. Example: indexing 100,000 documents (50M tokens) costs $1.00 Standard vs $0.50 Batch with text-embedding-3-small. Perfect for one-time migrations, periodic re-indexing, or non-urgent updates. Only use Standard tier for real-time embedding generation.
Optimize Token Usage with Chunking Strategy
Don't embed entire documents - chunk into 300-500 token segments for better retrieval and lower costs. Example: a 3000-token document as 1 embedding = 3000 tokens. Split into 6×500-token chunks = same 3000 tokens but better search precision. Use overlap (50-100 tokens) between chunks for context continuity without significant cost increase.
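The chunking strategy above can be sketched roughly as follows. This sketch approximates tokens by whitespace-separated words (1 word ≈ 1.33 tokens); a real pipeline would count with a tokenizer such as tiktoken:

```python
def chunk_words(text: str, chunk_size: int = 500, overlap: int = 75) -> list[str]:
    """Split text into overlapping word-based chunks.

    chunk_size and overlap are in words, a rough proxy for tokens.
    Each chunk starts (chunk_size - overlap) words after the previous one,
    so consecutive chunks share `overlap` words for context continuity.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

doc = "word " * 3000  # a ~3000-word document
print(len(chunk_words(doc)))  # 7 overlapping ~500-word chunks
```

The overlap adds some duplicate tokens (7×500 = 3,500 words embedded instead of 3,000 here), which is the "without significant cost increase" trade-off mentioned above.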
Cache Embeddings to Avoid Re-Generation
Store embeddings in vector database to avoid regenerating for the same content. Use content hashing to detect duplicates before API calls. Implement delta indexing to only embed new/changed documents. For static content, one-time embedding generation = zero ongoing costs.
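Delta indexing with content hashing can be sketched like this. The in-memory dict stands in for your vector database, and `embed` is a hypothetical stand-in for the real API call:

```python
import hashlib

index: dict[str, list[float]] = {}  # content hash -> cached embedding

def embed(text: str) -> list[float]:
    """Stand-in for the real embeddings API call (returns a fake vector)."""
    return [float(len(text))]

def embed_with_cache(text: str) -> tuple[list[float], bool]:
    """Return (embedding, was_cached). Skips the API for unchanged content."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in index:
        return index[key], True
    vector = embed(text)
    index[key] = vector
    return vector, False

_, cached = embed_with_cache("hello world")
print(cached)  # False: first sighting, one paid API call
_, cached = embed_with_cache("hello world")
print(cached)  # True: identical content, no API call, zero cost
```

Hashing the content (rather than a document ID) means a re-crawled but unchanged document is detected as a duplicate before any tokens are billed.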
Reduce Dimensions for Lower Storage Costs
Both text-embedding-3 models support dimension reduction via the API's `dimensions` parameter (e.g., 3072→1536 for 3-large) without re-training. Lower dimensions = smaller vector database storage costs and faster similarity search. API call cost stays the same, but you save on storage and compute. Test whether reduced dimensions still meet your quality requirements.
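Per OpenAI's docs, shortening amounts to keeping the first n dimensions and re-normalizing, so it can also be applied client-side to vectors you have already stored. A minimal sketch (the example vector is made up):

```python
import math

def shorten(embedding: list[float], dims: int) -> list[float]:
    """Truncate an embedding to its first `dims` values and L2-renormalize,
    so dot-product cosine similarity keeps working on the shorter vector."""
    truncated = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    return [x / norm for x in truncated]

print(shorten([3.0, 4.0, 1.0, 2.0], dims=2))  # -> [0.6, 0.8]
```

Note that all vectors in an index must be shortened to the same length; you cannot compare a 1536-dim query against 3072-dim stored vectors.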
Batch API Requests for Better Throughput
API supports batching up to 2048 embeddings per request. Reduces API overhead and improves processing speed. Example: embedding 10,000 docs as 5 batch requests (2000 each) vs 10,000 individual calls. Same cost, but much faster and fewer rate limit issues.
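Grouping inputs into request-sized batches can be sketched as follows (2048 is the per-request input cap mentioned above; `batched` is our own helper name, not an SDK function):

```python
def batched(items: list[str], size: int = 2048) -> list[list[str]]:
    """Group inputs into batches of up to `size` texts, one API call each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]

docs = [f"doc {i}" for i in range(10_000)]
batches = batched(docs, 2000)
print(len(batches))  # 10,000 docs as 5 requests of 2,000 each
```

Each batch would then be passed as the `input` list of a single embeddings request; token cost is identical either way, but 5 requests instead of 10,000 avoids most rate-limit friction.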
Monitor Embeddings Costs in Real-Time
CostGoat tracks embedding generation costs per model in real-time. Get alerts when switching from 3-small to 3-large unexpectedly. Identify opportunities to migrate batch workloads from Standard to Batch tier (50% savings). Visualize token usage patterns to optimize chunking strategy.
OpenAI Embeddings Model Selection Guide
| Use Case | Recommended Model | Config | Est. Cost (10K docs) | Why This Model? |
| --- | --- | --- | --- | --- |
| RAG System | text-embedding-3-small | 1536 dims, Standard | ~$0.10 | Best cost-to-performance for semantic retrieval |
| Semantic Search | text-embedding-3-small | 1536 dims, Batch | ~$0.05 | Batch tier for initial indexing (50% savings) |
| Legal Documents | text-embedding-3-large | 3072 dims, Standard | ~$0.65 | Highest quality for critical semantic precision |
| Product Recommendations | text-embedding-3-small | 1536 dims, Batch | ~$0.05 | Batch processing for catalog indexing |
| Code Search | text-embedding-3-small | 1536 dims, Standard | ~$0.10 | Cost-effective for code snippet similarity |
| Research Papers | text-embedding-3-large | 3072 dims, Batch | ~$0.33 | High quality needed, batch discount for archives |
Track Your OpenAI API Costs in Real-Time
Monitor your OpenAI API usage and spending across all models - GPT, DALL-E, Whisper, and more. CostGoat runs on your desktop with privacy-first local monitoring. 7-day free trial, then $9/month.
Start Free Trial

OpenAI Embeddings Pricing FAQ
Common questions about OpenAI embeddings costs, models, and optimization
