Inference Pricing

BETA

We're still working on our pricing scraper. While we're in beta, please double-check prices before making any decisions based on this information.

Example: AI Assistant Chat (10-message thread) × (users)

Total tokens: 0

Sample Conversation

Input: 0 tokens × 10

Output: 0 tokens × 10

User: I need to build a simple website for my small business. What's the best approach?
User: I sell handmade leather goods like wallets, belts, and ...

5 user messages + 5 AI responses

Input Tokens: 0 × 10 = 0

User messages are typically charged at a lower rate

Output Tokens: 0 × 10 = 0

AI responses are typically charged at a higher rate
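
The arithmetic behind an example cost like this can be sketched as follows. This is a minimal sketch under stated assumptions, not the site's actual calculator: the function name `example_cost`, the `multiplier` parameter (standing in for the × (users) control), and the token counts in the usage lines are hypothetical. The prices used are the Claude Opus 4 rates listed in the table on this page.

```python
from typing import NamedTuple

TOKENS_PER_MILLION = 1_000_000


class Cost(NamedTuple):
    input_cost: float   # USD spent on input (user message) tokens
    output_cost: float  # USD spent on output (AI response) tokens
    total: float        # input_cost + output_cost


def example_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
    multiplier: int = 1,
) -> Cost:
    """Estimate the cost of a thread from token counts and $/1M prices.

    `multiplier` is a hypothetical stand-in for the number of users
    (or repeated threads) the estimate is scaled by.
    """
    if min(input_tokens, output_tokens, multiplier) < 0:
        raise ValueError("token counts and multiplier must be non-negative")
    input_cost = input_tokens * input_price_per_million / TOKENS_PER_MILLION
    output_cost = output_tokens * output_price_per_million / TOKENS_PER_MILLION
    return Cost(
        input_cost * multiplier,
        output_cost * multiplier,
        (input_cost + output_cost) * multiplier,
    )


if __name__ == "__main__":
    # Hypothetical token counts; prices are $15.000 in / $75.000 out per 1M.
    cost = example_cost(100, 200, 15.0, 75.0)
    print(f"In: ${cost.input_cost:.4f} | Out: ${cost.output_cost:.4f}")
    print(f"Total: ${cost.total:.4f}")
```

Because output tokens are usually priced higher than input tokens, the output side tends to dominate the total even when both counts are similar. Real chat APIs may also bill earlier messages again as input on every turn, which this simple model does not account for.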

All Models

Showing 43 models

Model Name	Tags	Provider	Context Window (tokens)	Input ($/1M tokens)	Output ($/1M tokens)	Example Cost
Google: Gemma 3n 2B (free)*	Experimental, Text	Google	8,192	$0.000	$0.000	$0.000
Google: Gemini 2.5 Pro Experimental*	Frontier, Text	Google	1,048,576	$0.000	$0.000	$0.000
Google: Gemini 2.0 Flash Experimental (free)*	Frontier, Text	Google	1,048,576	$0.000	$0.000	$0.000
DeepSeek: R1 0528 (free)*	Text	Deepseek	163,840	$0.000	$0.000	$0.000
DeepSeek: DeepSeek V3 0324 (free)*	Experimental, Text	Deepseek	32,768	$0.000	$0.000	$0.000
Google: Gemma 2 9B	Experimental, Text	Google	8,192	$0.004	$0.004	$0.000
DeepSeek: Deepseek R1 0528 Qwen3 8B	Text	Deepseek	32,000	$0.010	$0.020	$0.000
Google: Gemma 3n 4B	Experimental, Text	Google	32,768	$0.020	$0.040	$0.000
Google: Gemma 3 4B	Text	Google	131,072	$0.020	$0.040	$0.000
Google: Gemma 3 12B	Text	Google	96,000	$0.030	$0.030	$0.000
Google: Gemini 1.5 Flash 8B	Frontier, Text	Google	1,000,000	$0.037	$0.150	$0.000
DeepSeek: R1 Distill Llama 8B	Text	Deepseek	32,000	$0.040	$0.040	$0.000
DeepSeek: R1 Distill Llama 70B	Text	Deepseek	131,072	$0.050	$0.050	$0.000
Google: Gemini 1.5 Flash	Frontier, Text	Google	1,000,000	$0.075	$0.300	$0.000
Google: Gemini 2.0 Flash Lite	Frontier, Text	Google	1,048,576	$0.075	$0.300	$0.000
DeepSeek: R1 Distill Qwen 32B	Frontier, Text	Deepseek	131,072	$0.075	$0.150	$0.000
Google: Gemma 3 27B	Experimental, Text	Google	131,072	$0.090	$0.170	$0.000
Google: Gemini 2.5 Flash Lite Preview 06-17	Frontier, Text	Google	1,048,576	$0.100	$0.400	$0.000
Google: Gemini 2.5 Flash Lite	Frontier, Text	Google	1,048,576	$0.100	$0.400	$0.000
Google: Gemini 2.0 Flash	Frontier, Text	Google	1,048,576	$0.100	$0.400	$0.000
DeepSeek: R1 Distill Qwen 7B	Text	Deepseek	131,072	$0.100	$0.200	$0.000
DeepSeek: R1 Distill Qwen 14B	Frontier, Text	Deepseek	64,000	$0.150	$0.150	$0.000
DeepSeek: R1 Distill Qwen 1.5B	Frontier, Text	Deepseek	131,072	$0.180	$0.180	$0.000
DeepSeek: DeepSeek V3 0324	Experimental, Text	Deepseek	163,840	$0.250	$0.850	$0.000
Anthropic: Claude 3 Haiku	Experimental, Text	Anthropic	200,000	$0.250	$1.250	$0.000
DeepSeek: R1 0528	Text	Deepseek	163,840	$0.272	$0.272	$0.000
DeepSeek: DeepSeek V3	Frontier, Text	Deepseek	163,840	$0.272	$0.272	$0.000
Google: Gemini 2.5 Flash	Frontier, Text	Google	1,048,576	$0.300	$2.500	$0.000
DeepSeek: DeepSeek V3 Base	Frontier, Text	Deepseek	163,840	$0.302	$0.302	$0.000
DeepSeek: R1	Text	Deepseek	163,840	$0.400	$2.000	$0.000
DeepSeek: DeepSeek Prover V2	Text	Deepseek	163,840	$0.500	$2.180	$0.000
Google: Gemma 2 27B	Experimental, Text	Google	8,192	$0.650	$0.650	$0.000
Anthropic: Claude 3.5 Haiku	Experimental, Text	Anthropic	200,000	$0.800	$4.000	$0.001
Google: Gemini 1.5 Pro	Frontier, Text	Google	2,000,000	$1.250	$5.000	$0.001
Google: Gemini 2.5 Pro Preview 05-06	Frontier, Text	Google	1,048,576	$1.250	$10.000	$0.002
Google: Gemini 2.5 Pro Preview 06-05	Frontier, Text	Google	1,048,576	$1.250	$10.000	$0.002
Google: Gemini 2.5 Pro	Frontier, Text	Google	1,048,576	$1.250	$10.000	$0.002
Anthropic: Claude Sonnet 4	Experimental, Text	Anthropic	200,000	$3.000	$15.000	$0.003
Anthropic: Claude 3.7 Sonnet	Experimental, Text	Anthropic	200,000	$3.000	$15.000	$0.003
Anthropic: Claude 3.5 Sonnet	Frontier, Text	Anthropic	200,000	$3.000	$15.000	$0.003
Anthropic: Claude 3 Sonnet	Production, Text	Anthropic	200,000	$3.000	$15.000	$0.003
Anthropic: Claude Opus 4	Production, Text	Anthropic	200,000	$15.000	$75.000	$0.015
Anthropic: Claude 3 Opus	Frontier, Text	Anthropic	200,000	$15.000	$75.000	$0.015

Pricing shown per 1M tokens. Example costs are estimates only and may vary based on actual tokenization.
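
To compare models for a particular workload rather than by list price alone, the per-1M prices can be combined with expected token volumes. The sketch below is illustrative only: the `Model` type and the functions `workload_cost` and `rank_by_cost` are hypothetical names, the token volumes are made up, and the five rows are copied from the table above rather than fetched from any API.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    context_window: int        # maximum context length in tokens
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# A handful of rows copied from the pricing table (not the full list).
MODELS = [
    Model("Google: Gemini 2.0 Flash", 1_048_576, 0.100, 0.400),
    Model("DeepSeek: R1", 163_840, 0.400, 2.000),
    Model("Anthropic: Claude 3.5 Haiku", 200_000, 0.800, 4.000),
    Model("Google: Gemini 2.5 Pro", 1_048_576, 1.250, 10.000),
    Model("Anthropic: Claude Sonnet 4", 200_000, 3.000, 15.000),
]


def workload_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a workload at the model's listed prices."""
    return (
        input_tokens * model.input_per_million
        + output_tokens * model.output_per_million
    ) / 1_000_000


def rank_by_cost(
    models: List[Model],
    input_tokens: int,
    output_tokens: int,
    min_context: int = 0,
) -> List[Model]:
    """Return models with a large enough context window, cheapest first."""
    eligible = [m for m in models if m.context_window >= min_context]
    return sorted(
        eligible, key=lambda m: workload_cost(m, input_tokens, output_tokens)
    )


if __name__ == "__main__":
    # Hypothetical monthly volume: 2M input tokens, 500K output tokens.
    for m in rank_by_cost(MODELS, 2_000_000, 500_000, min_context=200_000):
        print(f"{m.name}: ${workload_cost(m, 2_000_000, 500_000):.2f}")
```

Filtering on context window before sorting matters because the cheapest model overall may not fit the prompt sizes the workload needs.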

Last updated: 9/13/2025

Models

Google: Gemma 3n 2B (free)*

Google | Code Generation | Experimental | Text

Parameters:N/A

Context Window:8,192

Max Output Tokens:2,048

Input ($/1M):$0.000

Output ($/1M):$0.000

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 2.5 Pro Experimental*

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:65,535

Input ($/1M):$0.000

Output ($/1M):$0.000

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 2.0 Flash Experimental (free)*

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:8,192

Input ($/1M):$0.000

Output ($/1M):$0.000

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 0528 (free)*

Deepseek | Reasoning | Text

Parameters:N/A

Context Window:163,840

Max Output Tokens:N/A

Input ($/1M):$0.000

Output ($/1M):$0.000

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: DeepSeek V3 0324 (free)*

Deepseek | Conversational | Experimental | Text

Parameters:N/A

Context Window:32,768

Max Output Tokens:16,384

Input ($/1M):$0.000

Output ($/1M):$0.000

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemma 2 9B

Google | Code Generation | Experimental | Text

Parameters:N/A

Context Window:8,192

Max Output Tokens:8,192

Input ($/1M):$0.004

Output ($/1M):$0.004

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: Deepseek R1 0528 Qwen3 8B

Deepseek | Code Generation | Text

Parameters:N/A

Context Window:32,000

Max Output Tokens:N/A

Input ($/1M):$0.010

Output ($/1M):$0.020

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemma 3n 4B

Google | Code Generation | Experimental | Text

Parameters:N/A

Context Window:32,768

Max Output Tokens:N/A

Input ($/1M):$0.020

Output ($/1M):$0.040

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemma 3 4B

Google | Multimodal | Text

Parameters:N/A

Context Window:131,072

Max Output Tokens:N/A

Input ($/1M):$0.020

Output ($/1M):$0.040

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemma 3 12B

Google | Multimodal | Text

Parameters:N/A

Context Window:96,000

Max Output Tokens:8,192

Input ($/1M):$0.030

Output ($/1M):$0.030

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 1.5 Flash 8B

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,000,000

Max Output Tokens:8,192

Input ($/1M):$0.037

Output ($/1M):$0.150

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 Distill Llama 8B

Deepseek | Code Generation | Text

Parameters:N/A

Context Window:32,000

Max Output Tokens:32,000

Input ($/1M):$0.040

Output ($/1M):$0.040

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 Distill Llama 70B

Deepseek | Code Generation | Text

Parameters:N/A

Context Window:131,072

Max Output Tokens:N/A

Input ($/1M):$0.050

Output ($/1M):$0.050

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 1.5 Flash

Google | Multimodal | Frontier | Text

Parameters:N/A

Context Window:1,000,000

Max Output Tokens:8,192

Input ($/1M):$0.075

Output ($/1M):$0.300

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 2.0 Flash Lite

Google | General | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:8,192

Input ($/1M):$0.075

Output ($/1M):$0.300

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 Distill Qwen 32B

Deepseek | Code Generation | Frontier | Text

Parameters:N/A

Context Window:131,072

Max Output Tokens:16,384

Input ($/1M):$0.075

Output ($/1M):$0.150

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemma 3 27B

Google | Multimodal | Experimental | Text

Parameters:N/A

Context Window:131,072

Max Output Tokens:16,384

Input ($/1M):$0.090

Output ($/1M):$0.170

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 2.5 Flash Lite Preview 06-17

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:65,535

Input ($/1M):$0.100

Output ($/1M):$0.400

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 2.5 Flash Lite

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:65,535

Input ($/1M):$0.100

Output ($/1M):$0.400

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 2.0 Flash

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:8,192

Input ($/1M):$0.100

Output ($/1M):$0.400

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 Distill Qwen 7B

Deepseek | Code Generation | Text

Parameters:N/A

Context Window:131,072

Max Output Tokens:N/A

Input ($/1M):$0.100

Output ($/1M):$0.200

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 Distill Qwen 14B

Deepseek | Code Generation | Frontier | Text

Parameters:N/A

Context Window:64,000

Max Output Tokens:32,000

Input ($/1M):$0.150

Output ($/1M):$0.150

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 Distill Qwen 1.5B

Deepseek | Reasoning | Frontier | Text

Parameters:N/A

Context Window:131,072

Max Output Tokens:32,768

Input ($/1M):$0.180

Output ($/1M):$0.180

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: DeepSeek V3 0324

Deepseek | Conversational | Experimental | Text

Parameters:N/A

Context Window:163,840

Max Output Tokens:163,840

Input ($/1M):$0.250

Output ($/1M):$0.850

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Anthropic: Claude 3 Haiku

Anthropic | Multimodal | Experimental | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:4,096

Input ($/1M):$0.250

Output ($/1M):$1.250

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1 0528

Deepseek | Reasoning | Text

Parameters:N/A

Context Window:163,840

Max Output Tokens:N/A

Input ($/1M):$0.272

Output ($/1M):$0.272

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: DeepSeek V3

Deepseek | Code Generation | Frontier | Text

Parameters:N/A

Context Window:163,840

Max Output Tokens:N/A

Input ($/1M):$0.272

Output ($/1M):$0.272

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemini 2.5 Flash

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:65,535

Input ($/1M):$0.300

Output ($/1M):$2.500

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: DeepSeek V3 Base

Deepseek | Code Generation | Frontier | Text

Parameters:N/A

Context Window:163,840

Max Output Tokens:N/A

Input ($/1M):$0.302

Output ($/1M):$0.302

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: R1

Deepseek | Reasoning | Text

Parameters:N/A

Context Window:163,840

Max Output Tokens:163,840

Input ($/1M):$0.400

Output ($/1M):$2.000

Example Cost:

$0.000

In: $0.000 | Out: $0.000

DeepSeek: DeepSeek Prover V2

Deepseek | Reasoning | Text

Parameters:N/A

Context Window:163,840

Max Output Tokens:N/A

Input ($/1M):$0.500

Output ($/1M):$2.180

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Google: Gemma 2 27B

Google | Code Generation | Experimental | Text

Parameters:N/A

Context Window:8,192

Max Output Tokens:N/A

Input ($/1M):$0.650

Output ($/1M):$0.650

Example Cost:

$0.000

In: $0.000 | Out: $0.000

Anthropic: Claude 3.5 Haiku

Anthropic | Code Generation | Experimental | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:8,192

Input ($/1M):$0.800

Output ($/1M):$4.000

Example Cost:

$0.001

In: $0.000 | Out: $0.001

Google: Gemini 1.5 Pro

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:2,000,000

Max Output Tokens:8,192

Input ($/1M):$1.250

Output ($/1M):$5.000

Example Cost:

$0.001

In: $0.000 | Out: $0.001

Google: Gemini 2.5 Pro Preview 05-06

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:65,535

Input ($/1M):$1.250

Output ($/1M):$10.000

Example Cost:

$0.002

In: $0.000 | Out: $0.002

Google: Gemini 2.5 Pro Preview 06-05

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:65,536

Input ($/1M):$1.250

Output ($/1M):$10.000

Example Cost:

$0.002

In: $0.000 | Out: $0.002

Google: Gemini 2.5 Pro

Google | Code Generation | Frontier | Text

Parameters:N/A

Context Window:1,048,576

Max Output Tokens:65,536

Input ($/1M):$1.250

Output ($/1M):$10.000

Example Cost:

$0.002

In: $0.000 | Out: $0.002

Anthropic: Claude Sonnet 4

Anthropic | Code Generation | Experimental | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:64,000

Input ($/1M):$3.000

Output ($/1M):$15.000

Example Cost:

$0.003

In: $0.000 | Out: $0.003

Anthropic: Claude 3.7 Sonnet

Anthropic | Code Generation | Experimental | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:64,000

Input ($/1M):$3.000

Output ($/1M):$15.000

Example Cost:

$0.003

In: $0.000 | Out: $0.003

Anthropic: Claude 3.5 Sonnet

Anthropic | Code Generation | Frontier | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:8,192

Input ($/1M):$3.000

Output ($/1M):$15.000

Example Cost:

$0.003

In: $0.000 | Out: $0.003

Anthropic: Claude 3 Sonnet

Anthropic | Multimodal | Production | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:4,096

Input ($/1M):$3.000

Output ($/1M):$15.000

Example Cost:

$0.003

In: $0.000 | Out: $0.003

Anthropic: Claude Opus 4

Anthropic | Code Generation | Production | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:32,000

Input ($/1M):$15.000

Output ($/1M):$75.000

Example Cost:

$0.015

In: $0.002 | Out: $0.013

Anthropic: Claude 3 Opus

Anthropic | Multimodal | Frontier | Text

Parameters:N/A

Context Window:200,000

Max Output Tokens:4,096

Input ($/1M):$15.000

Output ($/1M):$75.000

Example Cost:

$0.015

In: $0.002 | Out: $0.013

Download all available models and pricing data instantly.

Navigate the complex landscape of AI models with ease. This tool helps developers find the right model for their projects by comparing pricing across providers. We are also working on improving our model categorizations. If you have suggestions for how we can improve our grouping or tags/categories, please create an issue in our GitHub repo.