chore(pricing): Update vertex-ai pricing by siddharthsambharia-portkey · Pull Request #550 · Portkey-AI/models

siddharthsambharia-portkey · 2026-03-17T12:15:04Z

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

Change Type	Count
➕ Models added	0
🔄 Models updated (merged)	9

🔄 Updated Models

gemini-2.5-pro
gemini-3.1-pro-preview
gemini-3.1-flash-image-preview
gemini-3.1-flash-lite-preview
gemini-3-pro-image-preview
gemini-3-flash-preview
veo-3.1-fast-generate-001
veo-3.0-fast-generate-preview
gemini-embedding-2-preview

Model → Pricing Page Mapping

Model ID	Publisher / Section	Source	Notes
`gemini-2.5-pro`	Google – Gemini 2.5	API	Standard input $1.25/output $10 (≤200K); tiered pricing above 200K
`gemini-2.5-computer-use-preview-10-2025`	Google – Gemini 2.5	API	Matches "Gemini 2.5 Pro Computer Use-Preview" row
`gemini-2.5-flash`	Google – Gemini 2.5	API	Input $0.30, output $2.50
`gemini-2.5-flash-preview-09-2025`	Google – Gemini 2.5	API	Preview alias; uses gemini-2.5-flash pricing
`gemini-2.5-flash-image`	Google – Gemini 2.5	API	Matches "Gemini 2.5 Flash Image"; image_token $30/1M
`gemini-2.5-flash-lite`	Google – Gemini 2.5	API	Input $0.10, output $0.40
`gemini-2.5-flash-lite-preview-09-2025`	Google – Gemini 2.5	API	Preview alias; uses gemini-2.5-flash-lite pricing
`gemini-2.0-flash-001`	Google – Gemini 2.0	API	Input $0.15, output $0.60
`gemini-2.0-flash-lite-001`	Google – Gemini 2.0	API	Input $0.075, output $0.30
`gemini-3.1-pro-preview`	Google – Gemini 3	API	Input $2, output $12 (≤200K)
`gemini-3.1-flash-image-preview`	Google – Gemini 3	API	Input $0.50, output $3; image_token $60/1M
`gemini-3.1-flash-lite-preview`	Google – Gemini 3	API	Input $0.25, output $1.50
`gemini-3-pro-preview`	Google – Gemini 3	API	Input $2, output $12 (≤200K)
`gemini-3-pro-image-preview`	Google – Gemini 3	API	Input $2, output $12; image_token $120/1M
`gemini-3-flash-preview`	Google – Gemini 3	API	Input $0.50, output $3
`imagen-4.0-generate-001`	Google – Imagen	API	$0.04/image; row matched via lookup_variant `imagen-4.0-generate`
`imagen-4.0-fast-generate-001`	Google – Imagen	API	$0.02/image; row matched via `imagen-4.0-fast-generate`
`imagen-4.0-ultra-generate-001`	Google – Imagen	API	$0.06/image; row matched via `imagen-4.0-ultra-generate`
`imagen-3.0-generate-002`	Google – Imagen	API	$0.04/image; row matched via `imagen-3.0-generate`
`imagen-3.0-capability-001`	Google – Imagen	API	$0.04/image; uses imagen-3.0-generate pricing (capability shares generate price)
`imagen-3.0-capability-002`	Google – Imagen	API	$0.04/image; uses imagen-3.0-generate pricing (capability shares generate price)
`veo-3.1-generate-001`	Google – Veo	API	$0.20/sec (720p/1080p video); row matched via `veo-3.1-generate`
`veo-3.1-generate-preview`	Google – Veo	API	$0.20/sec (720p/1080p video); preview alias for veo-3.1
`veo-3.1-fast-generate-001`	Google – Veo	API	$0.10/sec (720p/1080p video); row matched via `veo-3.1-fast-generate`
`veo-3.1-fast-generate-preview`	Google – Veo	API	$0.10/sec; preview alias for veo-3.1-fast
`veo-3.0-generate-001`	Google – Veo	API	$0.20/sec (720p/1080p); row matched via `veo-3.0-generate`
`veo-3.0-generate-preview`	Google – Veo	API	$0.20/sec; preview alias for veo-3.0
`veo-3.0-fast-generate-001`	Google – Veo	API	$0.10/sec; row matched via `veo-3.0-fast-generate`
`veo-3.0-fast-generate-preview`	Google – Veo	API	$0.10/sec; preview alias for veo-3.0-fast
`veo-2.0-generate-001`	Google – Veo	API	$0.50/sec; row matched via `veo-2.0-generate`
`gemini-embedding-001`	Google – Gemini Embedding	API	$0.00015/1K tokens online, $0.00012/1K batch
`gemini-embedding-2-preview`	Google – Gemini Embedding	API	Preview; uses Gemini Embedding pricing
`text-embedding-005`	Google – Text Embedding	API	$0.000025/1K chars (per_thousand_tokens unit)
`text-multilingual-embedding-002`	Google – Text Embedding	API	$0.000025/1K chars
`text-embedding-large-exp-03-07`	Google – Text Embedding	API	Experimental; uses text-embedding pricing $0.000025/1K chars
`textembedding-gecko`	Google – Text Embedding	API	Legacy; uses text-embedding pricing $0.000025/1K chars
`multimodalembedding`	Google – Multimodal Embedding	API	Per-image $0.0001, video-plus $0.0020/sec, standard $0.0010/sec, essential $0.0005/sec
`claude-opus-4-6`	Anthropic – Claude	API	`@default` stripped; input $5, output $25; 5m cache write $6.25, cache hit $0.50
`claude-sonnet-4-6`	Anthropic – Claude	API	`@default` stripped; input $3, output $15; cache write $3.75, cache hit $0.30
`claude-opus-4@20250514`	Anthropic – Claude	API	Input $15, output $75; cache write $18.75, cache hit $1.50
`claude-sonnet-4@20250514`	Anthropic – Claude	API	Input $3, output $15; cache write $3.75, cache hit $0.30
`claude-opus-4-1@20250805`	Anthropic – Claude	API	Input $15, output $75; cache write $18.75, cache hit $1.50
`claude-sonnet-4-5@20250929`	Anthropic – Claude	API	Input $3, output $15; cache write $3.75, cache hit $0.30
`claude-haiku-4-5@20251001`	Anthropic – Claude	API	Input $1, output $5; cache write $1.25, cache hit $0.10
`claude-opus-4-5@20251101`	Anthropic – Claude	API	Input $5, output $25; cache write $6.25, cache hit $0.50
`gpt-oss-120b-maas`	OpenAI	API	Matches "gpt-oss-120b" row; input $0.09, output $0.36
`llama-3.3-70b-instruct-maas`	Meta – Llama	API	Matches "Llama 3.3 70B"; input $0.72, output $0.72
`llama-4-maverick-17b-128e-instruct-maas`	Meta – Llama	API	Matches "Llama 4 Maverick"; input $0.35, output $1.15
`mistral-small-2503`	Mistral AI	API	Matches "Mistral Small 3.1 (25.03)"; input $0.10, output $0.30
`mistral-medium-3`	Mistral AI	API	Input $0.40, output $2.00
`codestral-2`	Mistral AI	API	Input $0.30, output $0.90
`deepseek-r1-0528-maas`	DeepSeek	API	Matches "DeepSeek-R1 (0528)"; input $1.35, output $5.40
`deepseek-v3.1-maas`	DeepSeek	API	Matches "DeepSeek-V3.1"; input $0.60, output $1.70; cache hit $0.06
`deepseek-v3.2-maas`	DeepSeek	API	Matches "DeepSeek-V3.2"; input $0.56, output $1.68; cache hit $0.056
`qwen3-235b-a22b-instruct-2507-maas`	Qwen	API	Matches "Qwen3-235B-A22B-Instruct-2507"; input $0.22, output $0.88
`qwen3-coder-480b-a35b-instruct-maas`	Qwen	API	Matches "Qwen3-Coder-480B-A35B-Instruct"; input $0.22, output $1.80; cache hit $0.022
`qwen3-next-80b-a3b-instruct-maas`	Qwen	API	Matches "Qwen3-Next-80B-Instruct"; input $0.15, output $1.20
`qwen3-next-80b-a3b-thinking-maas`	Qwen	API	Matches "Qwen3-Next-80B-Thinking"; input $0.15, output $1.20
`kimi-k2-thinking-maas`	Kimi / Moonshot	API	Matches "Kimi-K2-Thinking"; input $0.60, output $2.50; cache hit $0.06
`minimax-m2-maas`	MiniMax	API	Matches "MiniMax-M2"; input $0.30, output $1.20; cache hit $0.03
`glm-4.7-maas`	ZAI.org / GLM	API	Matches "GLM-4.7"; input $0.60, output $2.20
`glm-5-maas`	ZAI.org / GLM	API	Matches "GLM-5"; input $1.00, output $3.20; cache hit $0.10
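
Several notes above say a row was matched by dropping a trailing version suffix (`imagen-4.0-generate-001` via `imagen-4.0-generate`) or by stripping `@default`. The sketch below shows one way such candidate keys could be derived. The function name `lookup_variants`, the three-digit suffix pattern, and the ordering of candidates are assumptions for illustration, not the Pricing Agent's actual implementation.

```python
import re

# Assumption: version suffixes look like "-001" / "-002" (exactly three digits).
_VERSION_SUFFIX = re.compile(r"-\d{3}$")
_DEFAULT_TAG = "@default"


def lookup_variants(model_id: str) -> list[str]:
    """Return candidate pricing-row keys for a model ID, most specific first.

    Hypothetical helper: the real matching logic is not shown in this PR.
    """
    candidates = [model_id]
    base = model_id

    # "`@default` stripped" in the Claude rows of the mapping table.
    if base.endswith(_DEFAULT_TAG):
        base = base[: -len(_DEFAULT_TAG)]
        candidates.append(base)

    # "row matched via `imagen-4.0-generate`" style fallbacks.
    stripped = _VERSION_SUFFIX.sub("", base)
    if stripped != base:
        candidates.append(stripped)

    return candidates


if __name__ == "__main__":
    for mid in ("imagen-4.0-generate-001", "claude-opus-4-6@default", "gemini-2.5-pro"):
        print(mid, "->", lookup_variants(mid))
```

Date-pinned IDs such as `claude-opus-4@20250514` and four-digit tags such as `mistral-small-2503` are left untouched by this pattern, which is consistent with how those rows appear in the table.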

Excluded Models

Model ID	Publisher	Reason
`gemini-live-2.5-flash-native-audio`	Google	Gemini Live streaming — excluded per global rules
`lyria-002`	Google	Music generation — excluded per global rules
`imagegeneration`	Google	Legacy, superseded by imagen-3.0+ — excluded per google.md
`virtual-try-on-001`	Google	Product-specific retail model — excluded per google.md
`pretrained-ocr`	Google	OCR — excluded per global rules
`shieldgemma2`	Google	Safety/guard model — excluded per global rules
`t5gemma`, `paligemma`, `codegemma`, `gemma`, `gemma2`, `gemma3`, `gemma3n`, `functiongemma`, `translategemma`, `embeddinggemma`, `mammut`, `txgemma`, `medgemma`, `medsiglip`, `medasr`, `hear`, `path-foundation`, `derm-foundation`, `cxr-foundation`, `bert-base-uncased`, `timesfm`, `weathernext`	Google	Self-deploy only or non-generative ML — excluded per google.md
`chirp-2`, `chirp-3`	Google	Audio transcription, not generative inference — excluded
`translate-llm`, `text-translation`	Google	Translation, not generative inference
`video-text-detection`, `video-speech-transcription`	Google	Non-generative CV/transcription
`earth-ai-imagery-owlvit-eap-10-2025`, `earth-ai-imagery-mammut-eap-10-2025`	Google	Non-generative vision models
`image-segmentation-001`	Google	Image segmentation — non-generative
Various legacy/CV models	Google	Non-generative: imageclassification-, imageobjectdetection-, bert-base, resnet50, etc.
`clip-vit-base-patch32`, `openclip`	OpenAI	Non-generative (vision embedding/classification)
`whisper-large`	OpenAI	Audio transcription — excluded per openai.md
`gpt-oss`	OpenAI	Self-deploy (has_deploy: true, no -maas suffix)
`faster-r-cnn`, `retinanet`, `mask-r-cnn`, `segment-anything`, `sam3`	Meta	Non-generative CV
`roberta-large`, `xlm-roberta-large`, `nllb`, `imagebind`	Meta	Non-generative NLP/embedding
`llama-guard`, `prompt-guard`	Meta	Safety/guard models
`codellama-7b-hf`, `llama2`, `llama-2-quantized`, `llama3`, `llama3_1`, `llama3-2`, `llama3-3`, `llama4`	Meta	Self-deploy only (has_deploy: true, no -maas)
`mistral`, `mixtral`	Mistral (mistral-ai)	Self-deploy only
`codestral-2501-self-deploy`	Mistral (mistralai)	Self-deploy (name contains self-deploy)
`mistral-ocr-2505`	Mistral	OCR — excluded per global rules
`ministral-3`, `mistral-large-3`	Mistral	Self-deploy only
`deepseek-r1`, `deepseek-v3`, `deepseek-v3-1`, `deepseek-v3-2`	DeepSeek	Self-deploy only
`deepseek-ocr`, `deepseek-ocr-2`, `deepseek-ocr-maas`	DeepSeek	OCR — excluded per global rules
`qwq`, `qwen3`, `qwen3-embedding`, `qwen3-5`, `qwen2`, `qwen3-coder`, `qwen3-coder-next`, `qwen3-next`, `qwen3-vl`	Qwen	Self-deploy only
`qwen-image`	Qwen	Excluded by policy (qwen-image)
`kimi-k2`, `kimi-k2-5`	Kimi	Self-deploy only
`minimax-m2`	MiniMax	Self-deploy only
`glm-4.7`, `glm-5`, `glm-4.5`	ZAI.org	Self-deploy only
`glm-ocr`	ZAI.org	OCR — excluded per global rules
`glm-image`	ZAI.org	Excluded by policy (glm-image)
`jamba-large-1.6`	AI21	Self-deploy only (has_deploy: true, no -maas)
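
Three of the exclusion reasons above are mechanical enough to sketch as code: OCR models, names containing `self-deploy`, and entries with `has_deploy: true` but no `-maas` suffix. The `CatalogEntry` type and `exclusion_reason` function are hypothetical names, and the rule order is an assumption. The other reasons in the table (safety models, non-generative CV, policy exclusions) are not covered here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record; `has_deploy` mirrors the flag cited in the table."""
    model_id: str
    has_deploy: bool = False


def exclusion_reason(entry: CatalogEntry) -> Optional[str]:
    """Return a reason string if the entry should be excluded, else None."""
    model_id = entry.model_id.lower()

    # OCR models are excluded even when offered as MaaS (e.g. deepseek-ocr-maas).
    if "ocr" in model_id.split("-"):
        return "OCR"

    # Explicitly named self-deploy variants (e.g. codestral-2501-self-deploy).
    if "self-deploy" in model_id:
        return "self-deploy (name contains self-deploy)"

    # Deployable entries without the managed "-maas" suffix (e.g. gpt-oss, llama4).
    if entry.has_deploy and not model_id.endswith("-maas"):
        return "self-deploy only"

    return None


if __name__ == "__main__":
    samples = [
        CatalogEntry("gpt-oss", has_deploy=True),
        CatalogEntry("gpt-oss-120b-maas"),
        CatalogEntry("deepseek-ocr-maas"),
        CatalogEntry("codestral-2501-self-deploy"),
    ]
    for entry in samples:
        print(entry.model_id, "->", exclusion_reason(entry))
```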

Generated by Pricing Agent on 2026-03-24

siddharthsambharia-portkey added 17 commits March 17, 2026 17:45

a1a3f5f chore(pricing): Update vertex-ai pricing
53b3f5d chore(pricing): Update vertex-ai pricing
52dbf8e chore(pricing): Update vertex-ai pricing
f19c6a3 chore(pricing): Update vertex-ai pricing
a6e1035 chore(pricing): Update vertex-ai pricing
91c6f2a chore(pricing): Update vertex-ai pricing
d32f719 chore(pricing): Update vertex-ai pricing
6a7c7e8 chore(pricing): Update vertex-ai pricing
916ddaf chore(pricing): Update vertex-ai pricing
fa02c68 chore(pricing): Update vertex-ai pricing
7320d33 chore(pricing): Update vertex-ai pricing
3604db1 chore(pricing): Update vertex-ai pricing
d31b801 chore(pricing): Update vertex-ai pricing
a267566 chore(pricing): Update vertex-ai pricing
04933eb chore(pricing): Update vertex-ai pricing
2dd50e4 chore(pricing): Update vertex-ai pricing
21a3a64 chore(pricing): Update vertex-ai pricing

siddharthsambharia-portkey wants to merge 17 commits into main from pricing-update/vertex-ai
