Pricing

Pay only for what you use. No subscription lock-ins.

Pricing catalog

Pay-as-you-go AI model pricing.

EmpirioLabs pricing varies by model and unit: tokens, images, seconds of audio or video, messages, 3D assets, and search requests. The interactive pricing table loads the current catalog, while these representative rates remain readable without client JavaScript.

Prices are listed and billed in USD. Checkout may show local payment methods when eligible.

Kling 3.0 Turbo

Kling AI | Released Jun 17, 2026 | Video Generation
Proprietary Endpoint

Text-to-video and image-to-video with synchronized native audio, at 720p or 1080p for 3 to 15 seconds, with aspect ratio and prompt control.

Type | Spec | Rate
720p | per second | $0.18
1080p | per second | $0.225
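
As a rough illustration of per-second billing, the sketch below multiplies clip length by the rate for the chosen resolution. The helper name `kling_clip_cost` is hypothetical, and the assumption that the charge is exactly duration times rate (no rounding, no minimum charge) is not confirmed by this page.

```python
# Per-second rates copied from the Kling 3.0 Turbo table (USD).
RATE_PER_SECOND = {"720p": 0.18, "1080p": 0.225}


def kling_clip_cost(resolution: str, seconds: float) -> float:
    """Hypothetical helper: estimated cost of one generated clip.

    ASSUMPTION: cost = seconds * rate, with no rounding or minimum charge.
    """
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution}")
    if not 3 <= seconds <= 15:
        # The model description lists clip lengths of 3 to 15 seconds.
        raise ValueError("duration must be between 3 and 15 seconds")
    return seconds * RATE_PER_SECOND[resolution]


# Example: a 10-second clip at 1080p.
print(f"${kling_clip_cost('1080p', 10):.2f}")
```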

GLM 5.2

Z.ai | Singapore | Released Jun 16, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Reasoning and coding model with a 1M token context, 128K output, adjustable reasoning effort, native web search, and tool calling.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.40
Output | per 1M generated tokens | $4.40
Web Search | per request | $0.033
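
To make the token-based rates concrete, here is a minimal sketch that estimates the cost of one request from the GLM 5.2 (Singapore) figures above. The function name `estimate_cost` is hypothetical, and it assumes billing is linear in token counts with each web search request charged separately; the page does not spell out rounding or minimum charges.

```python
# Rates copied from the GLM 5.2 (Singapore) table (USD).
INPUT_PER_M = 1.40    # per 1M prompt tokens
OUTPUT_PER_M = 4.40   # per 1M generated tokens
WEB_SEARCH = 0.033    # per web search request


def estimate_cost(prompt_tokens: int, generated_tokens: int, searches: int = 0) -> float:
    """Hypothetical helper: linear estimate from token counts and search requests."""
    token_cost = (prompt_tokens * INPUT_PER_M + generated_tokens * OUTPUT_PER_M) / 1_000_000
    return token_cost + searches * WEB_SEARCH


# Example: 200K prompt tokens, 10K generated tokens, 2 web searches.
print(f"${estimate_cost(200_000, 10_000, searches=2):.4f}")
```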

GLM 5.2

Z.ai | Germany | Released Jun 16, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 21%

Reasoning and coding model with a 1M token context, 128K output, adjustable reasoning effort, native web search, and tool calling.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.10 (list $1.40)
Output | per 1M generated tokens | $3.851 (list $4.40)
Implicit cache read | per 1M cached input tokens | $0.275
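
The sketch below checks the "Save up to 21%" badge against the two figures shown for each rate, reading the first as the list price and the second as the discounted price (an interpretation, not something the page states). It also estimates a request cost under the assumption that cached prompt tokens are billed at the implicit cache read rate instead of the input rate. Both helper names are hypothetical.

```python
# GLM 5.2 (Germany) figures, read as (list price, discounted price) in USD per 1M tokens.
RATES = {"input": (1.40, 1.10), "output": (4.40, 3.851)}
CACHE_READ_PER_M = 0.275  # per 1M cached input tokens


def savings_pct(list_price: float, discounted: float) -> float:
    """Percentage saved relative to the list price."""
    return (1 - discounted / list_price) * 100


def request_cost(uncached_prompt: int, cached_prompt: int, generated: int) -> float:
    """ASSUMPTION: cached prompt tokens are billed at the cache read rate only."""
    total = (
        uncached_prompt * RATES["input"][1]
        + cached_prompt * CACHE_READ_PER_M
        + generated * RATES["output"][1]
    )
    return total / 1_000_000


for name, (list_price, discounted) in RATES.items():
    print(name, round(savings_pct(list_price, discounted), 1))

# Example: 100K uncached and 400K cached prompt tokens, 20K generated tokens.
print(f"${request_cost(100_000, 400_000, 20_000):.5f}")
```

Under this reading the largest saving is on input tokens, which would be consistent with an "up to" badge.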

Kimi K2.7 Code

Moonshot AI | Released Jun 16, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Kimi K2.7 Code is Moonshot's trillion-parameter agentic coding model with 256K context, always-on reasoning, and text, image, and video inputs.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.95
Output | per 1M generated tokens | $4.00
Web search | per call when invoked | $0.015

Kimi K2.7 Code

Moonshot AI | Germany | Released Jun 16, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 7%

Kimi K2.7 Code is Moonshot's trillion-parameter agentic coding model with 256K context, always-on reasoning, and text, image, and video inputs.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.8939 (list $0.95)
Output | per 1M generated tokens | $3.7131 (list $4.00)
Implicit cache read | per 1M cached input tokens | $0.1788

Fugu Ultra

Sakana AI | Released Jun 21, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Multi-agent conductor that orchestrates frontier expert models for hard reasoning, coding, and research, with 1M context, image input, and web search.

Type | Spec | Rate
Input | per 1M prompt tokens | <=272K: $7.50; >272K: $15.00
Output | per 1M generated tokens | <=272K: $45.00; >272K: $67.50
Implicit cache read | per 1M cached input tokens | <=272K: $1.50; >272K: $3.00
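
Tiered rates like these can be sketched as a threshold lookup. The example below is only an illustration: it assumes the tier is chosen by the total prompt length of the request and then applies to every token in that request, which this page does not confirm (a marginal scheme is equally possible). The names `Tier` and `fugu_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    input_per_m: float       # USD per 1M prompt tokens
    output_per_m: float      # USD per 1M generated tokens
    cache_read_per_m: float  # USD per 1M cached input tokens


# Figures copied from the Fugu Ultra table.
TIER_THRESHOLD = 272_000
LOW = Tier(7.50, 45.00, 1.50)    # prompts of <=272K tokens
HIGH = Tier(15.00, 67.50, 3.00)  # prompts of >272K tokens


def fugu_cost(prompt_tokens: int, generated_tokens: int, cached_tokens: int = 0) -> float:
    """Hypothetical estimate.

    ASSUMPTION: the tier is selected by total prompt length (cached tokens
    included) and applies to the whole request, not marginally.
    """
    tier = LOW if prompt_tokens <= TIER_THRESHOLD else HIGH
    uncached = prompt_tokens - cached_tokens
    total = (
        uncached * tier.input_per_m
        + cached_tokens * tier.cache_read_per_m
        + generated_tokens * tier.output_per_m
    )
    return total / 1_000_000


print(f"${fugu_cost(100_000, 5_000):.4f}")  # lower tier
print(f"${fugu_cost(300_000, 5_000):.4f}")  # higher tier
```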

Qwen3.7 Plus

Alibaba Cloud | Singapore | Released Jun 1, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Cost-effective Qwen3.7 vision-language model for text, image, video, coding, tool use, GUI understanding, and 1M-context workflows.

Type | Spec | Rate
Input | per 1M prompt tokens | <=256K: $0.40; 256K-1M: $1.20
Output | per 1M generated tokens | <=256K: $1.60; 256K-1M: $4.80
Web Search | per call | $0.03
Image Search | per call | $0.03

Qwen3.7 Plus

Alibaba Cloud | China | Released Jun 1, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 31%

Cost-effective Qwen3.7 vision-language model for text, image, video, coding, tool use, GUI understanding, and 1M-context workflows.

Type | Spec | Rate
Input | per 1M prompt tokens | <=256K: $0.276 (list $0.40); 256K-1M: $0.826 (list $1.20)
Output | per 1M generated tokens | <=256K: $1.101 (list $1.60); 256K-1M: $3.301 (list $4.80)
Implicit cache input | per 1M cached prompt tokens | <=256K: $0.056 (list $0.08); 256K-1M: $0.166 (list $0.24)
Web Search | per call | $0.01
Image Search | per call | $0.01

Kimi K2.7 Code Highspeed

Moonshot AI | Released Jun 16, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Kimi K2.7 Code Highspeed is the faster-serving tier of Moonshot's agentic coding model, with 256K context, always-on reasoning, and image and video input.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.90
Output | per 1M generated tokens | $8.00
Web search | per call when invoked | $0.015

MiniMax M3

MiniMax | Singapore | Released Jun 1, 2026 | Ctx 524K | Text Generation
Proprietary Endpoint
Save up to 25%

MiniMax M3 is a multimodal reasoning model for coding, agents, and long-context analysis with text, image, and video input.

Type | Spec | Rate
Input | per 1M prompt tokens | <=512K: $0.225 (list $0.30); >512K: $1.20
Output | per 1M generated tokens | <=512K: $0.90 (list $1.20); >512K: $4.80
Implicit cache read | per 1M cached input tokens | <=512K: $0.045 (list $0.06); >512K: $0.24
Linkup web search | per successful search when enabled | $0.013

Qwen3.7 Max

Alibaba Cloud | Singapore | Released May 21, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Qwen3.7 Max is a flagship text model for coding, productivity, long-running agents, deep thinking, tools, and 1M-token context.

Type | Spec | Rate
Input | per 1M prompt tokens | $2.50
Output | per 1M generated tokens | $7.50
Web search | per call when invoked | $0.02
Web extractor | per call when invoked | $0.02
Code interpreter | per call when invoked | $0.02

Qwen3.7 Max

Alibaba Cloud | China | Released May 21, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 34%

Qwen3.7 Max is a flagship text model for coding, productivity, long-running agents, deep thinking, tools, and 1M-token context.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.65 (list $2.50)
Output | per 1M generated tokens | $4.951 (list $7.50)
Web search | per call when invoked | $0.01
Web extractor | per call when invoked | $0.01
Code interpreter | per call when invoked | $0.01

MiniMax M2.7 Highspeed

MiniMax | Singapore | Released Mar 18, 2026 | Ctx 200K | Text Generation
Proprietary Endpoint
Save up to 50%

High-speed M2.7 variant tuned for fast inference, with strong general-purpose performance and agentic capabilities.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.30 (list $0.60)
Output | per 1M generated tokens | $1.20 (list $2.40)
Implicit cache read | per 1M cached input tokens | $0.03 (list $0.06)
Web Search (Linkup) | per call when invoked | $0.013

TTS 1.5 Mini

Inworld | Released May 5, 2026 | Audio Generation
Proprietary Endpoint
Save up to 30%

Sub-130ms TTFB voice synthesis with 271+ voices across 15 languages, expressive prosody, and real-time SSE streaming for low-latency voice agents.

Type | Spec | Rate
Synthesis | per 1M characters | $17.50 (list $25.00)
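
For character-based billing, a minimal sketch looks like this. It uses the lower of the two figures above (read here as the discounted rate) and assumes billable characters equal Python's `len()` of the input string; the provider's actual character counting (for example, how markup or whitespace is treated) is not stated on this page. The helper name `tts_cost` is hypothetical.

```python
# TTS 1.5 Mini rate in USD per 1M characters (the lower of the two figures shown).
RATE_PER_M_CHARS = 17.50


def tts_cost(text: str) -> float:
    """Hypothetical estimate.

    ASSUMPTION: billable characters == len(text), i.e. Unicode code points.
    """
    return len(text) * RATE_PER_M_CHARS / 1_000_000


script = "Hello, and welcome to the show. " * 1000
print(len(script), f"${tts_cost(script):.4f}")
```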

TTS 1.5 Max

Inworld | Released May 5, 2026 | Audio Generation
Proprietary Endpoint
Save up to 15%

Broadcast-quality voice synthesis with rich expressive prosody, 271+ voices across 15 languages, and real-time SSE streaming with per-word timestamps.

Type | Spec | Rate
Synthesis | per 1M characters | $29.75 (list $35.00)

GLM 5.1

Z.ai | China | Released Apr 7, 2026 | Ctx 202K | Text Generation
Proprietary Endpoint
Save up to 41%

Long-context Zhipu AI reasoning model with 202K context, 128K output, tool calling, structured output, and cache support.

Type | Spec | Rate
Input | per 1M prompt tokens | <=32K: $0.825 (list $1.40); 32K-200K: $1.10 (list $1.40)
Output | per 1M generated tokens | <=32K: $3.301 (list $4.40); 32K-200K: $3.851 (list $4.40)
Implicit cache read | per 1M cached input tokens | <=32K: $0.165 (list $0.26); 32K-200K: $0.22 (list $0.26)
Web Search (Linkup) | per call when invoked | $0.013

Kimi K2.6

Moonshot AI | China | Released Apr 20, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 7%

Kimi K2.6 is a Moonshot multimodal reasoning model with 256K context, strong coding, and text, image, and video inputs.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.8939 (list $0.95)
Output | per 1M generated tokens | $3.7131 (list $4.00)
Implicit cache read | per 1M cached input tokens | $0.1788
Web Search (Linkup) | per call when invoked | $0.013

MiniMax M2.7

MiniMax | Singapore | Released Mar 18, 2026 | Ctx 200K | Text Generation
Proprietary Endpoint
Save up to 50%

MiniMax M2.7 is a general-purpose reasoning chat model with interleaved thinking, function calling, and prompt caching.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.15 (list $0.30)
Output | per 1M generated tokens | $0.60 (list $1.20)
Implicit cache read | per 1M cached input tokens | $0.03 (list $0.06)
Web Search (Linkup) | per call when invoked | $0.013

Qwen3.5 122B-A10B

Alibaba Cloud | China | Released Feb 24, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 71%

Qwen3.5 122B-A10B is a multimodal reasoning model with 256K context, efficient sparse MoE inference, and text, image, and video input.

Type | Spec | Rate
Input | per 1M prompt tokens | <=128K: $0.115 (list $0.40); 128K-256K: $0.287 (list $0.40)
Output | per 1M generated tokens | <=128K: $0.917 (list $3.20); 128K-256K: $2.294 (list $3.20)
Web search | per request when enabled | $0.01

Qwen3.5 397B-A17B

Alibaba Cloud | China | Released Feb 16, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 71%

Qwen3.5 397B-A17B is a flagship multimodal reasoning model for language, code, agents, GUI tasks, and image and video understanding.

Type | Spec | Rate
Input | per 1M prompt tokens | <=128K: $0.172 (list $0.60); 128K-256K: $0.43 (list $0.60)
Output | per 1M generated tokens | <=128K: $1.032 (list $3.60); 128K-256K: $2.58 (list $3.60)
Web search | per request when enabled | $0.01

Qwen3.5 35B-A3B

Alibaba Cloud | China | Released Feb 24, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 77%

Qwen3.5 35B-A3B is an efficient native vision-language model with sparse MoE routing, deep thinking, and text, image, and video input.

Type | Spec | Rate
Input | per 1M prompt tokens | <=128K: $0.057 (list $0.25); 128K-256K: $0.229 (list $0.25)
Output | per 1M generated tokens | <=128K: $0.459 (list $2.00); 128K-256K: $1.835 (list $2.00)
Web search | per request when enabled | $0.01

Qwen3.5 27B

Alibaba Cloud | China | Released Feb 24, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 71%

Qwen3.5 27B is a dense multimodal reasoning model with fast responses, 256K context, and text, image, and video understanding.

Type | Spec | Rate
Input | per 1M prompt tokens | <=128K: $0.086 (list $0.30); 128K-256K: $0.258 (list $0.30)
Output | per 1M generated tokens | <=128K: $0.688 (list $2.40); 128K-256K: $2.064 (list $2.40)
Web search | per request when enabled | $0.01

Qwen3.6 27B

Alibaba Cloud | China | Released Apr 22, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 31%

Qwen3.6 27B improves agentic coding, STEM reasoning, spatial vision, OCR, and text, image, and video understanding on 256K context.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.412564 (list $0.60)
Output | per 1M generated tokens | $2.475384 (list $3.60)
Web search | per request when enabled | $0.01

Qwen3.6 Flash

Alibaba Cloud | Singapore | Released Apr 16, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Fast Qwen3.6 vision-language model for agentic coding, math reasoning, spatial understanding, OCR, and text, image, and video input.

Type | Spec | Rate
Input | per 1M prompt tokens | <=256K: $0.25; 256K-1M: $1.00
Output | per 1M generated tokens | <=256K: $1.50; 256K-1M: $4.00
Web search | per query when enabled | $0.02

Qwen3.6 Flash

Alibaba Cloud | China | Released Apr 16, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 34%

Fast Qwen3.6 vision-language model for agentic coding, math reasoning, spatial understanding, OCR, and text, image, and video input.

Type | Spec | Rate
Input | per 1M prompt tokens | <=256K: $0.165 (list $0.25); 256K-1M: $0.66 (list $1.00)
Output | per 1M generated tokens | <=256K: $0.99 (list $1.50); 256K-1M: $3.961 (list $4.00)
Web search | per query when enabled | $0.01

GLM 4.7 Flash

Z.ai | Singapore | Released Jan 19, 2026 | Ctx 200K | Text Generation
Proprietary Endpoint

Free lightweight GLM-4.7 text model for coding, reasoning, long-context writing, and general chat.

Type | Spec | Rate
Input | per 1M prompt tokens | Free
Output | per 1M generated tokens | Free
Implicit cache read | per 1M cached input tokens | Free
Web Search | per request when enabled | $0.033

GLM 4.5 Flash

Z.ai | Singapore | Released Jul 28, 2025 | Ctx 200K | Text Generation
Proprietary Endpoint

Free lightweight GLM-4.5 text model for reasoning, coding, long-form chat, and general language tasks.

Type | Spec | Rate
Input | per 1M prompt tokens | Free
Output | per 1M generated tokens | Free
Implicit cache read | per 1M cached input tokens | Free
Web Search | per request when enabled | $0.033

GLM 4.6V Flash

Z.ai | Singapore | Released Dec 8, 2025 | Ctx 128K | Text Generation
Proprietary Endpoint

Free multimodal GLM-4.6V model for image, video, file, and text understanding with native function calling.

Type | Spec | Rate
Input | per 1M prompt tokens | Free
Output | per 1M generated tokens | Free
Implicit cache read | per 1M cached input tokens | Free
Web Search | per request when enabled | $0.033

Amazon Nova Canvas

Amazon | Released Dec 3, 2024 | Image Generation
Proprietary Endpoint

Image generation and editing model that creates and modifies images from text or image inputs, with inpainting, virtual try-on, and style controls.

Type | Spec | Rate
Small Standard (≤1024×1024) | per image | $0.12
Small Premium (≤1024×1024) | per image | $0.18
Large Standard (≤2048×2048) | per image | $0.18
Large Premium (≤2048×2048) | per image | $0.24

Amazon Nova Reel 1.1

Amazon | Released Apr 7, 2025 | Video Generation
Proprietary Endpoint

Video generation model producing up to 2-minute multi-shot videos from text and optional image prompts with improved quality and consistency.

Type | Spec | Rate
Per Second | per second | $0.14

Deepgram Nova 3

Deepgram | Released Feb 12, 2025 | Transcription
Proprietary Endpoint

Speech-to-text transcription using the Nova-3 model with multi-language support and advanced customizable settings for production workloads.

Type | Spec | Rate
Transcription | per minute of audio | $0.014

DeepSeek Prover V2

DeepSeek | Released Apr 30, 2025 | Text Generation
Proprietary Endpoint

Open-source LLM specialized in formal theorem proving in Lean 4, built on a recursive theorem-proving pipeline.

Type | Spec | Rate
Per Message | fixed | $0.020

DeepSeek V3.2

DeepSeek | Singapore | Released Dec 1, 2025 | Ctx 128K | Text Generation
Proprietary Endpoint

Open-source Mixture-of-Experts LLM tuned for high-efficiency reasoning, coding, and general language tasks across long-form prompts.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.57
Output | per 1M generated tokens | $1.71
Web Search | per call | $0.015

DeepSeek V4 Flash

DeepSeek | Germany | Released Apr 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Lightweight MoE model with 284B total / 13B active parameters and native 1M context, tuned for low-latency, cost-effective high-concurrency use.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.14
Output | per 1M generated tokens | $0.28
Web Search (Linkup) | per call when invoked | $0.013

DeepSeek V4 Flash

DeepSeek | Singapore | Released Apr 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Lightweight MoE model with 284B total / 13B active parameters and native 1M context, tuned for low-latency, cost-effective high-concurrency use.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.20
Output | per 1M generated tokens | $0.40
Web search | per request when enabled | $0.02

DeepSeek V4 Flash

DeepSeek | China | Released Apr 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 2%

Lightweight MoE model with 284B total / 13B active parameters and native 1M context, tuned for low-latency, cost-effective high-concurrency use.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.138 (list $0.14)
Output | per 1M generated tokens | $0.275 (list $0.28)
Implicit cache read | per 1M cached input tokens | $0.028
Web search | per request when enabled | $0.01

DeepSeek V4 Pro

DeepSeek | Germany | Released Apr 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 5%

Flagship MoE LLM with 1.6T total / 49B active parameters and native 1M context for advanced math, logical inference, and specialized coding.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.65 (list $1.74)
Output | per 1M generated tokens | $3.30 (list $3.48)
Web Search (Linkup) | per call when invoked | $0.013

DeepSeek V4 Pro

DeepSeek | Singapore | Released Apr 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Flagship MoE LLM with 1.6T total / 49B active parameters and native 1M context for advanced math, logical inference, and specialized coding.

Type | Spec | Rate
Input | per 1M prompt tokens | $2.40
Output | per 1M generated tokens | $4.80
Web search | per request when enabled | $0.02

DeepSeek V4 Pro

DeepSeek | China | Released Apr 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 5%

Flagship MoE LLM with 1.6T total / 49B active parameters and native 1M context for advanced math, logical inference, and specialized coding.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.65 (list $1.74)
Output | per 1M generated tokens | $3.301 (list $3.48)
Implicit cache read | per 1M cached input tokens | $0.138
Web search | per request when enabled | $0.01

Exa Answer

Exa | Research & Search
Proprietary Endpoint

Quick LLM-style answer to a natural-language question, grounded in fresh Exa web search results with inline citations and source links.

Type | Spec | Rate
Answer | per request | $0.01

Gemini 2.5 Flash TTS

Google | Released May 20, 2025 | Audio Generation
Proprietary Endpoint

Low-latency text-to-speech with single- and multi-speaker voices and controllable style, accent, and expressive tone for production apps.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.50
Output | per 1M generated tokens | $30.00

Gemini 2.5 Pro TTS

Google | Released May 20, 2025 | Audio Generation
Proprietary Endpoint

High-quality TTS preview for podcasts, audiobooks, and customer support, with expressive multi-speaker voices across 23+ languages.

Type | Spec | Rate
Input | per 1M prompt tokens | $3.00
Output | per 1M generated tokens | $60.00

Gemini 3.1 Flash TTS

Google | Released Apr 13, 2026 | Audio Generation
Proprietary Endpoint

Highly controllable TTS with new Audio Tags for precise style, tone, pace, and delivery across narration, assistants, and voice apps.

Type | Spec | Rate
Input | per 1M prompt tokens | $2.60
Output | per 1M generated tokens | $52.00

Gemma 3 27B

Google | Released Mar 10, 2025 | Ctx 128K | Text Generation
Proprietary Endpoint

Open-source vision-language model with 128K context, 140+ languages, improved math/reasoning, structured outputs, and function calling.

Type | Spec | Rate
Per Message | fixed | $0.0040
Web Search (Linkup) | per call when invoked | $0.013

GPTZero

GPTZero | Tools & Agents
Proprietary Endpoint

Deep-learning detector that flags portions of text likely generated by AI versus human, classifying content as entirely human, AI, or mixed.

Type | Spec | Rate
Text Scan | per 1,000 words | $0.39

HappyHorse 1.0

Alibaba Cloud | Singapore | Released May 6, 2026 | Video Generation
Proprietary Endpoint

Video model offering Text-to-Video, Image-to-Video, Reference-to-Video, and Video Edit modes with high-fidelity, motion-smooth output.

Type | Spec | Rate
All Modes 720P | per second | $0.14
All Modes 1080P | per second | $0.24

Hunyuan Image 3

Tencent | Released Sep 28, 2025 | Image Generation
Proprietary Endpoint

Open-source text-to-image model on a multimodal Mixture-of-Experts architecture with photorealistic detail and strong multilingual text rendering.

Type | Spec | Rate
Standard | per image | $0.13

Janus-Pro DeepSeek

DeepSeek | Released Jan 27, 2025 | Image Generation
Proprietary Endpoint

Autoregressive framework on the Janus Pro 7B model that unifies multimodal understanding and image generation in one architecture.

Type | Spec | Rate
Image Generation | per image | $0.030
Image Analysis | per uploaded image | $0.030

Kling O3

Kling AI | Released Feb 5, 2026 | Video Generation
Proprietary Endpoint

Video model in Standard or Pro modes with Text-to-Video, Image-to-Video, Reference-to-Video, editing, native sound, and multi-scene transitions.

Type | Spec | Rate
Standard T2V/I2V | per second | $0.168
Standard T2V/I2V Sound | per second | $0.224
Standard Video Input | per second | $0.252
Pro T2V/I2V | per second | $0.224
Pro T2V/I2V Sound | per second | $0.280
Pro Video Input | per second | $0.336
4K T2V/I2V/Ref | per second | $0.525

Kling v3 Motion Control

Kling AI | Video Generation
Proprietary Endpoint

Kling 3.0 model that transfers motion from a reference video onto a character from a reference image, with Standard 720p and Pro 1080p tiers.

Type | Spec | Rate
Standard (720p) | per second | $0.14
Pro (1080p) | per second | $0.18

Linkup Standard

Linkup | Ctx 100K | Research & Search
Proprietary Endpoint

AI-powered web search with detailed overviews and answers, faster than Deep Search. Ranks #1 on the OpenAI SimpleQA benchmark.

Type | Spec | Rate
Per Message | fixed | $0.013

Magistral Medium 2509 Thinking

Mistral AI | Released Sep 17, 2025 | Ctx 40K | Text Generation
Proprietary Endpoint

Reasoning model tuned for tasks needing longer thought and higher accuracy: legal research, financial forecasting, software, and storytelling.

Type | Spec | Rate
Input | per 1M prompt tokens | $2.60
Output | per 1M generated tokens | $6.50
Web Search (Linkup) | per call when invoked | $0.013

Mistral Medium 3

Mistral AI | Released May 7, 2025 | Ctx 130K | Text Generation
Proprietary Endpoint

Cost-efficient language model offering strong reasoning and multimodal performance for general production workloads at competitive latency.

Type | Spec | Rate
Per Message | fixed | $0.015
Web Search (Linkup) | per call when invoked | $0.013

Mistral Medium 3.1

Mistral AI | Released Aug 12, 2025 | Ctx 131K | Text Generation
Proprietary Endpoint

Enterprise-grade model with strong reasoning, coding, and STEM performance, supporting hybrid, on-prem, and in-VPC deployments.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.52
Output | per 1M generated tokens | $2.60
Web Search (Linkup) | per call when invoked | $0.013

Mistral Small 3.1

Mistral AI | Released Mar 17, 2025 | Ctx 128K | Text Generation
Proprietary Endpoint

24B-parameter multimodal model with 128K context for image analysis, programming, math, and multilingual tasks, tuned for efficient local inference.

Type | Spec | Rate
Per Message | fixed | $0.0019
Web Search (Linkup) | per call when invoked | $0.013

Mistral Small 4

Mistral AI | Released Mar 16, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Hybrid model unifying Instruct, Reasoning (Magistral), and Devstral families: 40% lower completion time and 3x throughput vs Small 3.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.15
Output | per 1M generated tokens | $0.60
Standard Web Search | per call | $0.084
Premium Web Search | per call | $0.140
Code Interpreter | per call | $0.084
Image Generation | per image | $0.280

Nova Lite 1.0

Amazon | Released Dec 3, 2024 | Ctx 300K | Text Generation
Proprietary Endpoint

Low-cost multimodal foundation model for text, images, and video on a 300K context (up to ~30 min video), tuned for speed and affordability.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.069
Output | per 1M generated tokens | $0.28
Cached input | per 1M tokens | $0.0386
Web Search (Linkup) | per call when invoked | $0.013

Nova Lite 2

Amazon | Released Dec 2, 2025 | Ctx 1M | Text Generation
Proprietary Endpoint

Fast, cost-effective multimodal reasoning model for text, images, documents, and video on a 1M context (long docs and ~90 min clips).

Type | Spec | Rate
Input | per 1M prompt tokens | $0.38
Output | per 1M generated tokens | $3.16
Cached input | per 1M tokens | $0.2128
Web Search (Linkup) | per call when invoked | $0.013

Nova Micro 1.0

Amazon | Released Dec 3, 2024 | Ctx 128K | Text Generation
Proprietary Endpoint

Text-only foundation model tuned for ultra-low latency and cost on 128K context. Strong for summarization, translation, and chat with 44% cache discount.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.040
Output | per 1M generated tokens | $0.16
Cached input | per 1M tokens | $0.0224
Web Search (Linkup) | per call when invoked | $0.013

Nova Premier 1.0

Amazon | Released Apr 30, 2025 | Ctx 1M | Text Generation
Proprietary Endpoint

Most capable model in the family. Multimodal text/image/video on a 1M context with chain-of-thought reasoning across tools and data sources.

Type | Spec | Rate
Input | per 1M prompt tokens | $3.00
Output | per 1M generated tokens | $15.00
Cached input | per 1M tokens | $1.68
Web Search (Linkup) | per call when invoked | $0.013

Nova Pro 1.0

Amazon | Released Dec 3, 2024 | Ctx 300K | Text Generation
Proprietary Endpoint

Multimodal foundation model balancing accuracy, speed, and cost for text, images, and video on 300K context (up to ~30 min video).

Type | Spec | Rate
Input | per 1M prompt tokens | $2.40
Output | per 1M generated tokens | $9.60
Latency Optimized Input | per 1M prompt tokens | $3.00
Latency Optimized Output | per 1M generated tokens | $12.00
Web Search (Linkup) | per call when invoked | $0.013

OpenAI Whisper 1

OpenAI | Released Sep 21, 2022 | Transcription
Proprietary Endpoint

Whisper-1 speech-to-text transcription trained on multilingual supervised audio, with a 25 MB upload limit per file.

Type | Spec | Rate
Per Minute of Audio | per minute | $0.030

Perplexity Advanced Deep Research

Perplexity | Research & Search
Proprietary Endpoint

Institutional-grade research powered by Claude Opus 4.6 reasoning, with maximum depth, enhanced tool access, and extensive source coverage.

Type | Spec | Rate
Input | per 1M prompt tokens | $12.00
Output | per 1M generated tokens | $60.00
Web Search Call | per call | $0.012
URL Fetch Call | per call | $0.0012

Perplexity Deep Research

Perplexity | Ctx 128K | Research & Search
Proprietary Endpoint

Research model for multi-step retrieval, synthesis, and reasoning, autonomously searching, reading, and evaluating sources across complex topics.

Type | Spec | Rate
Input | per 1M prompt tokens | $4.80
Output | per 1M generated tokens | $19.00
Citation Tokens | per 1M tokens | $4.80
Reasoning Tokens | per 1M tokens | $7.20
Search Queries | per query | $0.012

Perplexity Sonar

Perplexity | Ctx 127K | Research & Search
Proprietary Endpoint

Real-time web-connected search with accurate citations and customizable sources for up-to-date AI search integration in production apps.

Type | Spec | Rate
Input | per 1M prompt tokens | $2.40
Output | per 1M generated tokens | $2.40
Base Fee (Low Context) | per request | $0.012
Base Fee (Medium Context) | per request | $0.019
Base Fee (High Context) | per request | $0.029
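
A request billed with both token rates and a per-request base fee could be estimated as below. This is a sketch under the assumption that exactly one base fee, chosen by search context size, is added to the token charges for each request; the page lists the fees but does not describe how they combine. The name `sonar_request_cost` is hypothetical.

```python
# Perplexity Sonar rates copied from the table (USD).
INPUT_PER_M = 2.40
OUTPUT_PER_M = 2.40
BASE_FEE = {"low": 0.012, "medium": 0.019, "high": 0.029}  # per request


def sonar_request_cost(prompt_tokens: int, generated_tokens: int, context: str = "low") -> float:
    """Hypothetical estimate.

    ASSUMPTION: one base fee per request, added on top of token charges.
    """
    if context not in BASE_FEE:
        raise ValueError(f"unknown context size: {context}")
    token_cost = (prompt_tokens * INPUT_PER_M + generated_tokens * OUTPUT_PER_M) / 1_000_000
    return BASE_FEE[context] + token_cost


# Example: 1,000 prompt tokens and 500 generated tokens at medium context.
print(f"${sonar_request_cost(1_000, 500, 'medium'):.4f}")
```

For short requests like this one, the base fee dominates the token charges, so the context setting matters more than prompt length.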

Perplexity Sonar Pro

Perplexity | Ctx 200K | Research & Search
Proprietary Endpoint

Search-grounded model with double the citations and a larger context window, tuned for complex queries needing in-depth, nuanced answers.

Type | Spec | Rate
Input | per 1M prompt tokens | $7.20
Output | per 1M generated tokens | $36.00
Base Fee (Low Context) | per request | $0.014
Base Fee (Medium Context) | per request | $0.024
Base Fee (High Context) | per request | $0.034

Perplexity Sonar Reasoning Pro

Perplexity | Ctx 128K | Research & Search
Proprietary Endpoint

Reasoning model on the uncensored open-source R1-1776 with web search, outperforming leading search engines and LLMs on the SimpleQA benchmark.

Type | Spec | Rate
Input | per 1M prompt tokens | $4.80
Output | per 1M generated tokens | $19.00
Base Fee (Low Context) | per request | $0.014
Base Fee (Medium Context) | per request | $0.024
Base Fee (High Context) | per request | $0.034

Pixverse v5

PixVerse | Released Aug 29, 2025 | Video Generation
Proprietary Endpoint

Cinematic video generation in Text-to-Video, Image-to-Video, and Transition modes with high detail, fluid motion, and lifelike animations.

Type | Spec | Rate
360p/540p 5s | per video | $0.45
360p/540p 8s | per video | $0.90
720p 5s | per video | $0.60
720p 8s | per video | $1.20
1080p 5s | per video | $1.20

Pixverse v5.6

PixVerse | Released Jan 26, 2026 | Video Generation
Proprietary Endpoint

Generates videos from text or 1-2 frame image prompts up to 1080p, multiple aspect ratios, 5-10s durations, with optional synchronized audio.

Type | Spec | Rate
360p/540p 5s no audio | per video | $0.40
360p/540p 5s audio | per video | $0.80
360p/540p 8s no audio | per video | $0.80
360p/540p 8s audio | per video | $1.60
360p/540p 10s no audio | per video | $0.88
360p/540p 10s audio | per video | $1.76
720p 5s no audio | per video | $0.65
720p 5s audio | per video | $1.30
720p 8s no audio | per video | $1.30
720p 8s audio | per video | $2.60
720p 10s no audio | per video | $1.43
720p 10s audio | per video | $2.86
1080p 5s no audio | per video | $0.75
1080p 5s audio | per video | $1.50
1080p 8s no audio | per video | $1.50
1080p 8s audio | per video | $3.00

Qwen Image 2.0

Alibaba Cloud | Singapore | Released Mar 3, 2026 | Image Generation
Proprietary Endpoint
Save up to 8%

Unified image generation and editing model with class-leading complex Chinese/English text rendering, realistic textures, and multi-image fusion.

Type | Spec | Rate
Standard | per image | $0.0322 (list $0.035)
Pro | per image | $0.069 (list $0.075)

Qwen3.5 Flash

Alibaba Cloud | Singapore | Released Feb 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 10%

Vision-language model with hybrid linear-attention plus sparse MoE, 1M context, and fast multimodal text/image/video inference.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.090 (list $0.10)
Output | per 1M generated tokens | $0.368 (list $0.40)
Web Search | per call | $0.015
Image Search | per call | $0.012

Qwen3.5 Flash

Alibaba Cloud | China | Released Feb 24, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 68%

Vision-language model with hybrid linear-attention plus sparse MoE, 1M context, and fast multimodal text/image/video inference.

Type | Spec | Rate
Input | per 1M prompt tokens | <=128K: $0.029 (list $0.090); 128K-256K: $0.115; 256K-1M: $0.172
Output | per 1M generated tokens | <=128K: $0.287 (list $0.368); 128K-256K: $1.147; 256K-1M: $1.72
Web search | per query when enabled | $0.01

Qwen3.5 Omni Flash

Alibaba Cloud | Singapore | Released Mar 30, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Cost-efficient omni-modal model handling text, image, audio, and video, with up to 3 hours of audio and 1 hour of video across 90+ languages.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.40; $3.00
Output | per 1M generated tokens | $2.20; $11.90
Web Search | per request | $0.015

Qwen3.5 Omni Plus

Alibaba Cloud | Singapore | Released Mar 30, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Flagship omni-modal model for text, image, audio, and video. 3h audio, 1h video, 90+ input and 30+ output languages, 55 voice timbres.

Type | Spec | Rate
Input | per 1M prompt tokens | $1.40; $11.00
Output | per 1M generated tokens | $8.30; $44.00
Web Search | per request | $0.015

Qwen3.5 Plus

Alibaba Cloud | Singapore | Released Feb 16, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 10%

Multimodal model with hybrid architecture for efficient deep thinking and visual understanding across text, image, and video on a 1M context.

Type | Spec | Rate
Input | per 1M prompt tokens | <=256K: $0.36 (list $0.40); 256K-1M: $1.08 (list $1.20)
Output | per 1M generated tokens | <=256K: $2.21 (list $2.40); 256K-1M: $6.62 (list $7.20)
Web Search | per call | $0.015
Image Search | per call | $0.012

Qwen3.5 Plus

Alibaba Cloud | China | Released Feb 16, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 69%

Multimodal model with hybrid architecture for efficient deep thinking and visual understanding across text, image, and video on a 1M context.

Type | Spec | Rate
Input | per 1M prompt tokens | <=128K: $0.115 (list $0.36); 128K-256K: $0.287 (list $0.36); 256K-1M: $0.573 (list $1.08)
Output | per 1M generated tokens | <=128K: $0.688 (list $2.21); 128K-256K: $1.72 (list $2.21); 256K-1M: $3.44 (list $6.62)
Web search | per query when enabled | $0.01

Qwen3.6 Max Preview

Alibaba Cloud | Singapore | Released Apr 20, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Largest preview variant in the 3.6 series (text-only): improved coding agent execution, stronger front-end skills, and broader long-tail knowledge.

Type | Spec | Rate
Input | per 1M prompt tokens | <=128K: $1.31; 128K-256K: $1.97
Output | per 1M generated tokens | <=128K: $7.88; 128K-256K: $11.82
Web Search | per call | $0.020

Qwen3.6 Plus

Alibaba Cloud | Singapore | Released Apr 2, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Vision-language model with major upgrades over 3.5: agentic and front-end coding, multimodal recognition, OCR, and object localization.

Type | Spec | Rate
Input | per 1M prompt tokens | <=256K: $0.50; 256K-1M: $2.00
Output | per 1M generated tokens | <=256K: $3.00; 256K-1M: $6.00
Web Search | per call | $0.026
Image Search | per call | $0.0208

Qwen3.6 Plus

Alibaba Cloud | China | Released Apr 2, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 45%

Vision-language model with major upgrades over 3.5: agentic and front-end coding, multimodal recognition, OCR, and object localization.

Type | Spec | Rate
Input | per 1M prompt tokens | <=256K: $0.276 (list $0.50); 256K-1M: $1.101 (list $2.00)
Output | per 1M generated tokens | <=256K: $1.651 (list $3.00); 256K-1M: $6.602
Web search | per query when enabled | $0.01

Qwen3 Max

Alibaba Cloud | Singapore | Released Sep 23, 2025 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 10%

256K-context flagship with major improvements in reasoning, instruction following, and multilingual support, plus higher coding/math accuracy.

Type | Spec | Rate
Input | per 1M prompt tokens | <=32K: $1.08 (list $1.20); 32K-128K: $2.16 (list $2.40); 128K-256K: $2.70 (list $3.00)
Output | per 1M generated tokens | <=32K: $5.52 (list $6.00); 32K-128K: $11.04 (list $12.00); 128K-256K: $13.80 (list $15.00)
Web Search | per request | $0.015

Qwen3 Max Preview

Alibaba Cloud | Singapore | Released Sep 5, 2025 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 20%

Preview release with major gains over the 2.5 series in Chinese-English understanding, complex instructions, multilingual ability, and tool use.

Type
Spec
Rate
Input
per 1M prompt tokens
<=32K $1.08 (list $1.20); 32K-128K $2.16 (list $2.40); 128K-256K $2.70 (list $3.00)
Output
per 1M generated tokens
<=32K $4.80 (list $6.00); 32K-128K $9.60 (list $12.00); 128K-256K $12.00 (list $15.00)

Qwen3 Max Thinking

Alibaba Cloud | Singapore | Released Sep 23, 2025 | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 10%

Reasoning model with adaptive tool use (search, memory, code interpreter) and test-time scaling for higher accuracy on complex tasks.

Type
Spec
Rate
Input
per 1M prompt tokens
<=32K $1.08 (list $1.20); 32K-128K $2.16 (list $2.40); 128K-256K $2.70 (list $3.00)
Output
per 1M generated tokens
<=32K $5.52 (list $6.00); 32K-128K $11.04 (list $12.00); 128K-256K $13.80 (list $15.00)
Web Search
per request
$0.015

Qwen3 Rerank

Alibaba Cloud | Singapore | Released Jun 5, 2025 | Ctx 4000 | Rerankers
Proprietary Endpoint

Semantic document reranker. Sorts up to 500 candidates per query by relevance, supports 100+ languages, and accepts a custom sorting instruction.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.10

Seed 2.0 Code

ByteDance | Malaysia | Released Feb 14, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Coding-tuned 256K-context model with strong front-end results and multilingual programming support for AI coding tools and agents.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K $0.40; 128K-256K $0.80
Output
per 1M generated tokens
<=128K $2.40; 128K-256K $4.80

Seed 2.0 Lite

ByteDance | Malaysia | Released Feb 14, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Balanced general-purpose model for high-frequency enterprise workloads: information processing, content, search, and data analysis.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K $0.31; 128K-256K $0.62
Output
per 1M generated tokens
<=128K $2.50; 128K-256K $5.00

Seed 2.0 Mini

ByteDance | Malaysia | Released Feb 14, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Latency-focused multimodal model with 256K context, four reasoning effort modes, and image/video understanding for high-concurrency use.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K $0.12; 128K-256K $0.24
Output
per 1M generated tokens
<=128K $0.50; 128K-256K $1.00

Seed 2.0 Pro

ByteDance | Malaysia | Released Feb 14, 2026 | Ctx 256K | Text Generation
Proprietary Endpoint

Flagship general model with 256K context for complex reasoning, multimodal understanding, structured generation, and tool-augmented execution.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K $0.63; 128K-256K $1.26
Output
per 1M generated tokens
<=128K $3.79; 128K-256K $7.58

Seedance 2.0 Fast

ByteDance | Malaysia | Released Feb 12, 2026 | Video Generation
Proprietary Endpoint

Speed-optimized 2.0 video variant for cinematic clips with native audio sync, camera control, and stable motion at lower cost per render.

Type
Spec
Rate
T2V/I2V 480P
per second
$0.122
T2V/I2V 720P
per second
$0.260
Video Input 480P
per second
$0.284
Video Input 720P
per second
$0.610

Seedance 2.0 Pro

ByteDance | Malaysia | Released Feb 12, 2026 | Video Generation
Proprietary Endpoint

Multimodal video model for cinematic output from text, image, audio, or video inputs, with stable motion and consistent characters.

Type
Spec
Rate
T2V/I2V 480P
per second
$0.139
T2V/I2V 720P
per second
$0.300
T2V/I2V 1080P
per second
$0.749
T2V/I2V 4K
per second
$1.555
Video Input 480P
per second
$0.342
Video Input 720P
per second
$0.736
Video Input 1080P
per second
$1.841
Video Input 4K
per second
$3.732

Seedream 5.0 Lite

ByteDance | Malaysia | Released Feb 13, 2026 | Image Generation
Proprietary Endpoint

Unified multimodal image model that reasons through prompts before rendering, producing high-resolution and consistent edits and brand visuals.

Type
Spec
Rate
Standard
per image
$0.0350

Stable Audio 2.0

Stability AI | Released Apr 3, 2024 | Audio Generation
Proprietary Endpoint

Generates audio up to 3 minutes from text prompts, supporting text-to-audio and audio-to-audio with adjustable duration, steps, and CFG scale.

Type
Spec
Rate
Base Cost
per generation
$0.58
Per Step Cost
per step
$0.00

Stable Audio 2.5

Stability AI | Released Sep 10, 2025 | Audio Generation
Proprietary Endpoint

Generates audio up to 3 minutes long from text prompts, with text-to-audio, audio-to-audio, and audio inpainting for music production, sound design, and remixing.

Type
Spec
Rate
Generation
per generation
$0.68

Tavily Research

Tavily | Research & Search
Proprietary Endpoint

Multi-search research assistant that explores a topic, analyzes sources, and produces a detailed research report with citations.

Type
Spec
Rate
Mini
average per task
~$1.19
Pro
average per task
~$2.75

Text Embedding v4

Alibaba Cloud | Singapore | Released Jun 4, 2025 | Ctx 8192 | Embeddings
Proprietary Endpoint

Multilingual text embedding with selectable output dimensions (64–2048). Up to 8,192 tokens per input.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.07

Tongyi Embedding Vision Flash

Alibaba Cloud | Singapore | Released Sep 23, 2025 | Ctx 1024 | Embeddings
Proprietary Endpoint

Speed-optimized multimodal embedding with the same output shape as Tongyi Embedding Vision Plus and image/video tokens at one third of the price.

Type
Spec
Rate
Text input
per 1M tokens
$0.09
Image / video input
per 1M tokens
$0.03

Tongyi Embedding Vision Plus

Alibaba Cloud | Singapore | Released Sep 23, 2025 | Ctx 1024 | Embeddings
Proprietary Endpoint

Multimodal embedding producing independent vectors for text, image, and video inputs.

Type
Spec
Rate
Text input
per 1M tokens
$0.09
Image / video input
per 1M tokens
$0.09

Wan 2.6

Alibaba Cloud | Singapore | Released Jan 12, 2026 | Video Generation
Proprietary Endpoint
Save up to 10%

Multimodal video generation model for cinematic, multi-shot stories with native audio-visual sync (lip-sync, dialogue, music, SFX).

Type
Spec
Rate
Standard 720P
per second
$0.09 (list $0.10)
Standard 1080P
per second
$0.138 (list $0.15)
Flash 720P (audio)
per second
$0.045 (list $0.050)
Flash 720P (no audio)
per second
$0.0225 (list $0.0250)
Flash 1080P (audio)
per second
$0.069 (list $0.0750)
Flash 1080P (no audio)
per second
$0.0345 (list $0.03750)
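
To see how the "Save up to 10%" badge relates to the individual rows, the sketch below computes the discount on each Wan 2.6 rate. It reads each pair of prices in the Rate column as a list price followed by a discounted price, which is an interpretation of the table rather than something the catalog states. `savings_percent` is a hypothetical helper.

```python
from decimal import Decimal

# (list price, discounted price) in USD per second, from the Wan 2.6 table.
# ASSUMPTION: the first price in each pair is the list price and the
# second is the discounted price.
WAN_26_RATES = {
    "Standard 720P": (Decimal("0.10"), Decimal("0.09")),
    "Standard 1080P": (Decimal("0.15"), Decimal("0.138")),
    "Flash 720P (audio)": (Decimal("0.050"), Decimal("0.045")),
    "Flash 720P (no audio)": (Decimal("0.0250"), Decimal("0.0225")),
    "Flash 1080P (audio)": (Decimal("0.0750"), Decimal("0.069")),
    "Flash 1080P (no audio)": (Decimal("0.03750"), Decimal("0.0345")),
}


def savings_percent(list_price: Decimal, discounted_price: Decimal) -> Decimal:
    """Return the discount as a percentage of the list price."""
    return (list_price - discounted_price) / list_price * 100


savings = {name: savings_percent(*pair) for name, pair in WAN_26_RATES.items()}
max_savings = max(savings.values())

if __name__ == "__main__":
    for name, percent in savings.items():
        print(f"{name}: {percent:.0f}% off")
    print(f"largest discount: {max_savings:.0f}%")
```

On this reading, the 720P rows are discounted by 10% and the 1080P rows by 8%, which is consistent with a badge that says "up to" 10%.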

Wan 2.7

Alibaba Cloud | Singapore | Released Apr 26, 2026 | Video Generation
Proprietary Endpoint

Multimodal video model supporting T2V, I2V, video editing, and reference-to-video, with high-fidelity output from text, image, or video inputs.

Type
Spec
Rate
All Modes 720P
per second
$0.10
All Modes 1080P
per second
$0.150

Wan2.7 Image

Alibaba Cloud | Singapore | Released Apr 1, 2026 | Image Generation
Proprietary Endpoint

Image generation and editing companion model: text-to-image, bounding-box edits, and cohesive image sets, with up to 4K output on Pro.

Type
Spec
Rate
Standard
per image
$0.030
Pro
per image
$0.075

Manus

Manus | Tools & Agents
Proprietary Endpoint

Autonomous AI agent that turns a high-level prompt into subtasks, calls tools and APIs, and delivers end-to-end results without manual orchestration.

Type
Spec
Rate
Adaptive - Manus 1.6 Lite
per task
$1.44 - $2.63
Adaptive - Manus 1.6
per task
$2.89 - $5.25
Adaptive - Manus 1.6 Max
per task
$5.25 - $9.19

Grok Imagine Video 1.5

xAI | Released Jun 16, 2026 | Video Generation
Proprietary Endpoint

Image-to-video model that animates a source image with prompt-guided motion, up to 15 seconds at 480p or 720p across seven aspect ratios.

Type
Spec
Rate
Image input
per image
$0.05
480p
per second
$0.096
720p
per second
$0.168
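
The Grok Imagine Video 1.5 table mixes a per-image fee with a per-second rate, so a quick estimate may help. This sketch assumes the two line items simply add up for one clip, which the catalog does not spell out. The 15-second limit comes from the model description, and `estimate_clip_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Rates copied from the Grok Imagine Video 1.5 table (USD).
IMAGE_INPUT_RATE = Decimal("0.05")  # per source image
PER_SECOND_RATE = {"480p": Decimal("0.096"), "720p": Decimal("0.168")}
MAX_SECONDS = 15  # "up to 15 seconds" in the model description


def estimate_clip_cost(seconds: int, resolution: str, images: int = 1) -> Decimal:
    """Estimate the USD cost of one image-to-video clip.

    ASSUMPTION: the image-input fee and the per-second fee are summed.
    """
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"seconds must be between 1 and {MAX_SECONDS}")
    if resolution not in PER_SECOND_RATE:
        raise ValueError(f"unsupported resolution: {resolution}")
    return images * IMAGE_INPUT_RATE + seconds * PER_SECOND_RATE[resolution]


if __name__ == "__main__":
    print(estimate_clip_cost(10, "720p"))  # one image, 10 seconds at 720p
    print(estimate_clip_cost(15, "480p"))  # one image, 15 seconds at 480p
```

Under that assumption, a 10-second 720p clip from one source image comes to $1.73, and a 15-second 480p clip to $1.49.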

MiMo V2.5 Pro

Xiaomi | Released Apr 27, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Top-tier model for agentic workflows, complex software engineering, and long-horizon tasks, sustaining work across 1000+ tool calls on 1M context.

Type
Spec
Rate
Input
per 1M prompt tokens
$2.175
Output
per 1M generated tokens
$4.35
Implicit cache read
per 1M cached input tokens
$0.018
Web Search
per call
$0.015
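
The gap between the input rate and the implicit cache read rate matters most for long, repeated prompts. The sketch below compares a request with and without cache hits using the MiMo V2.5 Pro rates above. It assumes cached prompt tokens are billed at the cache read rate instead of the input rate, which is a common convention but is not confirmed by the catalog. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Rates copied from the MiMo V2.5 Pro table (USD per 1M tokens).
INPUT_RATE = Decimal("2.175")
OUTPUT_RATE = Decimal("4.35")
CACHE_READ_RATE = Decimal("0.018")
PER_MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD token cost of one request.

    ASSUMPTION: cached prompt tokens are charged at the cache read rate
    in place of the regular input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHE_READ_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / PER_MILLION


if __name__ == "__main__":
    print(estimate_cost(500_000, 400_000, 10_000))  # 80% of the prompt cached
    print(estimate_cost(500_000, 0, 10_000))        # no cache hits
```

With these numbers, a 500,000-token prompt with 400,000 cached tokens and 10,000 output tokens is estimated at $0.2682, against $1.131 for the same request with no cache hits.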

MiMo V2.5

Xiaomi | Released Apr 22, 2026 | Ctx 1M | Text Generation
Proprietary Endpoint

Multimodal model with native visual and audio understanding on a 1M context, designed to reason and act across modalities in agentic workflows.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.70
Output
per 1M generated tokens
$1.40
Implicit cache read
per 1M cached input tokens
$0.014
Web Search
per call
$0.015

HappyHorse 1.1

Alibaba Cloud | Singapore | Released Jun 22, 2026 | Video Generation
Proprietary Endpoint

Text, image, and reference-to-video in one model. Cinematic motion, character consistency across up to 9 references, and synchronized native audio.

Type
Spec
Rate
720p
per second
$0.14
1080p
per second
$0.18

Seedance 2.0 Mini

ByteDance | Malaysia | Released Jun 15, 2026 | Video Generation
Proprietary Endpoint

The fastest, most affordable Seedance 2.0 tier for short cinematic clips with native audio, camera control, and image or video inputs at 480p and 720p.

Type
Spec
Rate
T2V/I2V 480P
per second
$0.070
T2V/I2V 720P
per second
$0.150
Video Input 480P
per second
$0.167
Video Input 720P
per second
$0.359

98 of 124 models
