Inference Pricing

BETA

Our pricing scraper is still in beta. Please double-check prices before making any decisions based on this information.

Example: AI Assistant Chat (10-message thread) × (users)

Sample Conversation
5 user messages + 5 AI responses
User: I need to build a simple website for my small business. What's the best approach?
User: I sell handmade leather goods like wallets, belts, and ...
Total tokens: 0
Input Tokens: 0 × 10 = 0 (user messages are typically charged at a lower rate)
Output Tokens: 0 × 10 = 0 (AI responses are typically charged at a higher rate)

Showing 43 models

Models

Google: Gemma 3n 2B (free)*

Google | Code Generation | Experimental | Text
Parameters: N/A
Context Window: 8,192
Max Output Tokens: 2,048
Input ($/1M): $0.000
Output ($/1M): $0.000
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 2.5 Pro Experimental*

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 65,535
Input ($/1M): $0.000
Output ($/1M): $0.000
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 2.0 Flash Experimental (free)*

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 8,192
Input ($/1M): $0.000
Output ($/1M): $0.000
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 0528 (free)*

DeepSeek | Reasoning | Text
Parameters: N/A
Context Window: 163,840
Max Output Tokens: N/A
Input ($/1M): $0.000
Output ($/1M): $0.000
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: DeepSeek V3 0324 (free)*

DeepSeek | Conversational | Experimental | Text
Parameters: N/A
Context Window: 32,768
Max Output Tokens: 16,384
Input ($/1M): $0.000
Output ($/1M): $0.000
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemma 2 9B

Google | Code Generation | Experimental | Text
Parameters: N/A
Context Window: 8,192
Max Output Tokens: 8,192
Input ($/1M): $0.004
Output ($/1M): $0.004
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: Deepseek R1 0528 Qwen3 8B

DeepSeek | Code Generation | Text
Parameters: N/A
Context Window: 32,000
Max Output Tokens: N/A
Input ($/1M): $0.010
Output ($/1M): $0.020
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemma 3n 4B

Google | Code Generation | Experimental | Text
Parameters: N/A
Context Window: 32,768
Max Output Tokens: N/A
Input ($/1M): $0.020
Output ($/1M): $0.040
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemma 3 4B

Google | Multimodal | Text
Parameters: N/A
Context Window: 131,072
Max Output Tokens: N/A
Input ($/1M): $0.020
Output ($/1M): $0.040
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemma 3 12B

Google | Multimodal | Text
Parameters: N/A
Context Window: 96,000
Max Output Tokens: 8,192
Input ($/1M): $0.030
Output ($/1M): $0.030
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 1.5 Flash 8B

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,000,000
Max Output Tokens: 8,192
Input ($/1M): $0.037
Output ($/1M): $0.150
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 Distill Llama 8B

DeepSeek | Code Generation | Text
Parameters: N/A
Context Window: 32,000
Max Output Tokens: 32,000
Input ($/1M): $0.040
Output ($/1M): $0.040
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 Distill Llama 70B

DeepSeek | Code Generation | Text
Parameters: N/A
Context Window: 131,072
Max Output Tokens: N/A
Input ($/1M): $0.050
Output ($/1M): $0.050
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 1.5 Flash

Google | Multimodal | Frontier | Text
Parameters: N/A
Context Window: 1,000,000
Max Output Tokens: 8,192
Input ($/1M): $0.075
Output ($/1M): $0.300
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 2.0 Flash Lite

Google | General | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 8,192
Input ($/1M): $0.075
Output ($/1M): $0.300
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 Distill Qwen 32B

DeepSeek | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 131,072
Max Output Tokens: 16,384
Input ($/1M): $0.075
Output ($/1M): $0.150
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemma 3 27B

Google | Multimodal | Experimental | Text
Parameters: N/A
Context Window: 131,072
Max Output Tokens: 16,384
Input ($/1M): $0.090
Output ($/1M): $0.170
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 2.5 Flash Lite Preview 06-17

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 65,535
Input ($/1M): $0.100
Output ($/1M): $0.400
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 2.5 Flash Lite

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 65,535
Input ($/1M): $0.100
Output ($/1M): $0.400
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 2.0 Flash

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 8,192
Input ($/1M): $0.100
Output ($/1M): $0.400
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 Distill Qwen 7B

DeepSeek | Code Generation | Text
Parameters: N/A
Context Window: 131,072
Max Output Tokens: N/A
Input ($/1M): $0.100
Output ($/1M): $0.200
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 Distill Qwen 14B

DeepSeek | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 64,000
Max Output Tokens: 32,000
Input ($/1M): $0.150
Output ($/1M): $0.150
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 Distill Qwen 1.5B

DeepSeek | Reasoning | Frontier | Text
Parameters: N/A
Context Window: 131,072
Max Output Tokens: 32,768
Input ($/1M): $0.180
Output ($/1M): $0.180
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: DeepSeek V3 0324

DeepSeek | Conversational | Experimental | Text
Parameters: N/A
Context Window: 163,840
Max Output Tokens: 163,840
Input ($/1M): $0.250
Output ($/1M): $0.850
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Anthropic: Claude 3 Haiku

Anthropic | Multimodal | Experimental | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 4,096
Input ($/1M): $0.250
Output ($/1M): $1.250
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1 0528

DeepSeek | Reasoning | Text
Parameters: N/A
Context Window: 163,840
Max Output Tokens: N/A
Input ($/1M): $0.272
Output ($/1M): $0.272
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: DeepSeek V3

DeepSeek | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 163,840
Max Output Tokens: N/A
Input ($/1M): $0.272
Output ($/1M): $0.272
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemini 2.5 Flash

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 65,535
Input ($/1M): $0.300
Output ($/1M): $2.500
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: DeepSeek V3 Base

DeepSeek | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 163,840
Max Output Tokens: N/A
Input ($/1M): $0.302
Output ($/1M): $0.302
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: R1

DeepSeek | Reasoning | Text
Parameters: N/A
Context Window: 163,840
Max Output Tokens: 163,840
Input ($/1M): $0.400
Output ($/1M): $2.000
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

DeepSeek: DeepSeek Prover V2

DeepSeek | Reasoning | Text
Parameters: N/A
Context Window: 163,840
Max Output Tokens: N/A
Input ($/1M): $0.500
Output ($/1M): $2.180
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Google: Gemma 2 27B

Google | Code Generation | Experimental | Text
Parameters: N/A
Context Window: 8,192
Max Output Tokens: N/A
Input ($/1M): $0.650
Output ($/1M): $0.650
Example Cost: $0.000 (In: $0.000 | Out: $0.000)

Anthropic: Claude 3.5 Haiku

Anthropic | Code Generation | Experimental | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 8,192
Input ($/1M): $0.800
Output ($/1M): $4.000
Example Cost: $0.001 (In: $0.000 | Out: $0.001)

Google: Gemini 1.5 Pro

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 2,000,000
Max Output Tokens: 8,192
Input ($/1M): $1.250
Output ($/1M): $5.000
Example Cost: $0.001 (In: $0.000 | Out: $0.001)

Google: Gemini 2.5 Pro Preview 05-06

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 65,535
Input ($/1M): $1.250
Output ($/1M): $10.000
Example Cost: $0.002 (In: $0.000 | Out: $0.002)

Google: Gemini 2.5 Pro Preview 06-05

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 65,536
Input ($/1M): $1.250
Output ($/1M): $10.000
Example Cost: $0.002 (In: $0.000 | Out: $0.002)

Google: Gemini 2.5 Pro

Google | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 1,048,576
Max Output Tokens: 65,536
Input ($/1M): $1.250
Output ($/1M): $10.000
Example Cost: $0.002 (In: $0.000 | Out: $0.002)

Anthropic: Claude Sonnet 4

Anthropic | Code Generation | Experimental | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 64,000
Input ($/1M): $3.000
Output ($/1M): $15.000
Example Cost: $0.003 (In: $0.000 | Out: $0.003)

Anthropic: Claude 3.7 Sonnet

Anthropic | Code Generation | Experimental | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 64,000
Input ($/1M): $3.000
Output ($/1M): $15.000
Example Cost: $0.003 (In: $0.000 | Out: $0.003)

Anthropic: Claude 3.5 Sonnet

Anthropic | Code Generation | Frontier | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 8,192
Input ($/1M): $3.000
Output ($/1M): $15.000
Example Cost: $0.003 (In: $0.000 | Out: $0.003)

Anthropic: Claude 3 Sonnet

Anthropic | Multimodal | Production | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 4,096
Input ($/1M): $3.000
Output ($/1M): $15.000
Example Cost: $0.003 (In: $0.000 | Out: $0.003)

Anthropic: Claude Opus 4

Anthropic | Code Generation | Production | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 32,000
Input ($/1M): $15.000
Output ($/1M): $75.000
Example Cost: $0.015 (In: $0.002 | Out: $0.013)

Anthropic: Claude 3 Opus

Anthropic | Multimodal | Frontier | Text
Parameters: N/A
Context Window: 200,000
Max Output Tokens: 4,096
Input ($/1M): $15.000
Output ($/1M): $75.000
Example Cost: $0.015 (In: $0.002 | Out: $0.013)

Pricing shown per 1M tokens. Example costs are estimates only and may vary based on actual tokenization.
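
As a rough sketch of how a per-1M-token price turns into a dollar estimate: divide each token count by one million and multiply by the matching rate. The helper name `example_cost` and the token counts below are assumptions made for illustration. They are not taken from the tool, and real counts depend on each model's tokenizer.

```python
# Hypothetical helper: the name and signature are illustrative, not part of the tool.
def example_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return (input_cost, output_cost, total) in dollars for per-1M-token prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost, output_cost, input_cost + output_cost


# Token counts are assumed for illustration only.
# Prices are the Claude Sonnet 4 rates listed above ($3.000 in, $15.000 out per 1M).
in_cost, out_cost, total = example_cost(1_000, 2_000, 3.00, 15.00)
print(f"In: ${in_cost:.3f} | Out: ${out_cost:.3f} | Total: ${total:.3f}")
```

Because output is usually priced higher than input, the output side tends to dominate the total whenever responses are longer than prompts.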

Last updated: 9/13/2025

Download all available models and pricing data instantly.

Navigate the complex landscape of AI models with ease. This tool helps developers find the right model for their projects by comparing pricing across providers. We are also working on improving our model categorizations. If you have suggestions for how we can improve our grouping or tags/categories, please create an issue in our GitHub repo.
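
To illustrate the kind of cross-provider comparison described here, the following sketch ranks a few of the listed models by estimated cost for one workload. The prices are copied from the listing above. The workload size and the function name `rank_by_cost` are assumptions for illustration, not part of the tool.

```python
# Prices ($ per 1M tokens, input then output) copied from the listing above.
MODELS = {
    "Google: Gemini 2.0 Flash": (0.100, 0.400),
    "DeepSeek: R1": (0.400, 2.000),
    "Anthropic: Claude 3.5 Haiku": (0.800, 4.000),
    "Anthropic: Claude Sonnet 4": (3.000, 15.000),
}


def rank_by_cost(models, input_tokens, output_tokens):
    """Return (name, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        for name, (in_price, out_price) in models.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Assumed workload: 500k input tokens and 100k output tokens.
ranking = rank_by_cost(MODELS, input_tokens=500_000, output_tokens=100_000)
for name, cost in ranking:
    print(f"{name}: ${cost:.2f}")
```

Price is only one axis. Context window and maximum output tokens from the listing may rule a model out before cost matters.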