Gemini 3.1 Flash-Lite launched in preview ($0.25/$1.50 per MTok)

Google introduced Gemini 3.1 Flash-Lite in preview for low-latency, high-volume workloads. The official announcement positions it for summarization, translation, and classification, with pricing set at $0.25 per 1M input tokens and $1.50 per 1M output tokens in Google AI Studio and Vertex AI.

Mar 3, 2026 | Effective: Mar 3, 2026

Impact Assessment

Cost: 4
Reliability: 1
Migration: 0
Quality: 4
Compliance: 0

What changed:
- New preview model: gemini-3.1-flash-lite
- Available in Google AI Studio and Vertex AI
- Pricing: $0.25 / MTok input, $1.50 / MTok output
- Positioned for low-latency, high-volume tasks such as summarization, translation, and classification

Recommended actions:
- Benchmark latency and quality against your current low-cost routing target.
- Because this is preview, gate production rollout behind evaluation and fallback rules.

Changes

Field	Before	After
model	—	gemini-3.1-flash-lite
status	—	preview
availability	—	Google AI Studio, Vertex AI
input_per_1m	—	0.25
output_per_1m	—	1.5
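
To put the listed prices in context, here is a minimal sketch of the cost arithmetic. The two prices come from the table above; the function name and the workload figures (requests per day, tokens per request) are hypothetical and only illustrate the calculation.

```python
# Preview prices from the change table, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 0.25
OUTPUT_USD_PER_MTOK = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a given token volume."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical workload: 10,000 summarization requests per day,
# each with about 2,000 input tokens and 200 output tokens.
daily = estimate_cost_usd(10_000 * 2_000, 10_000 * 200)
print(f"Estimated daily cost: ${daily:.2f}")  # prints "Estimated daily cost: $8.00"
```

In this assumed workload, output accounts for $3.00 of the $8.00 despite being a tenth of the token volume, since output tokens cost six times as much as input tokens.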

Recommended Actions

TEST

Benchmark Gemini 3.1 Flash-Lite against your current low-cost model on summarization, translation, classification, and latency-sensitive workloads.
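
One way to structure such a benchmark is sketched below for a classification task. It is an illustration under stated assumptions, not a complete evaluation suite: `benchmark`, `fake_model`, and the test cases are all hypothetical, and `fake_model` stands in for whatever client wrapper you use to call a real model.

```python
import statistics
import time
from typing import Callable, Sequence


def benchmark(
    model_fn: Callable[[str], str],
    cases: Sequence[tuple[str, str]],
) -> dict[str, float]:
    """Run each (prompt, expected_label) case; report latency and accuracy.

    Expects at least one case.
    """
    latencies = []
    correct = 0
    for prompt, expected in cases:
        start = time.perf_counter()
        output = model_fn(prompt)
        latencies.append(time.perf_counter() - start)
        # Case-insensitive exact match on the label.
        if output.strip().lower() == expected.strip().lower():
            correct += 1
    return {
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
        "accuracy": correct / len(cases),
    }


# Stand-in for a real client call; replace with your own wrapper.
def fake_model(prompt: str) -> str:
    return "positive" if "great" in prompt else "negative"


cases = [
    ("This product is great", "positive"),
    ("Terrible experience", "negative"),
    ("great value, would buy again", "positive"),
    ("Not great, not terrible", "negative"),
]
report = benchmark(fake_model, cases)
print(report)
```

Running the same `cases` through a wrapper for your current low-cost model and a wrapper for the preview model gives two reports that can be compared directly. For summarization or translation, the exact-match check would need to be replaced with a task-appropriate quality metric.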

MONITOR

If you route production traffic to this model, keep a fallback target because the launch is preview-only.
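
A fallback rule can be as simple as the sketch below. The primary model name is taken from this entry; `FALLBACK_MODEL`, `generate_with_fallback`, and `flaky_client` are hypothetical names, and `call_model` is an assumed wrapper around whichever SDK you use, not a real library function.

```python
from typing import Callable

PRIMARY_MODEL = "gemini-3.1-flash-lite"  # preview model from this entry
FALLBACK_MODEL = "your-current-low-cost-model"  # hypothetical placeholder


def generate_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try the primary model; on any error, retry once on the fallback.

    Returns (model_used, output).
    """
    try:
        return PRIMARY_MODEL, call_model(PRIMARY_MODEL, prompt)
    except Exception:
        # In production, log the error and count it so you can monitor
        # how often the fallback path is taken.
        return FALLBACK_MODEL, call_model(FALLBACK_MODEL, prompt)


# Stand-in client that simulates the preview model being unavailable.
def flaky_client(model: str, prompt: str) -> str:
    if model == PRIMARY_MODEL:
        raise RuntimeError("simulated preview outage")
    return f"[{model}] summary of: {prompt}"


model_used, output = generate_with_fallback("Quarterly report text", flaky_client)
print(model_used, output)
```

Returning the model name alongside the output makes it straightforward to track the fallback rate, which is the signal to monitor while the model remains in preview.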

Sources

blog

Gemini 3.1 Flash-Lite

Google introduced Gemini 3.1 Flash-Lite in preview for low-latency, high-volume tasks, priced at $0.25 input and $1.50 output per 1M tokens.
