Gemini 3.1 Flash Lite
Built for high-throughput, retry-friendly, cost-sensitive work: run translation backfills, labeling queues, and extraction on Flash Lite in LimaxAI, then escalate edge cases to stronger Gemini models.

Capabilities & limits
Key specs for production planning; exact toggles follow what LimaxAI exposes in chat.
Up to ~1.05M input and 65,536 output tokens—long docs and threads with less manual chunking.
Text, image, video, audio, and PDF in—text out—for extraction and summarization.
Reasoning and schema-following outputs for reliable machine-readable results.
Function calling, code execution, and search grounding (per integration) for light agent steps.
Context caching and batch APIs for repetitive or large workloads (API scenarios; chat per product).
Flash Lite is the economical route in the Gemini family, suited to work where throughput and price matter more than peak quality.
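To make the "less manual chunking" point concrete, here is a minimal sketch of a pre-flight size check against the quoted input limit. The 4-characters-per-token ratio and the `reserve_tokens` margin are assumptions for illustration, not a real tokenizer or a LimaxAI setting; actual counts depend on the model's tokenizer and on gateway rules.

```python
# Input limit quoted in the spec list above.
MAX_INPUT_TOKENS = 1_050_000
# Assumed rule of thumb, NOT a real tokenizer.
ASSUMED_CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length (ceiling division)."""
    return -(-len(text) // ASSUMED_CHARS_PER_TOKEN)


def plan_chunks(text: str, reserve_tokens: int = 50_000) -> list[str]:
    """Return the text whole if it fits the budget, otherwise split it.

    reserve_tokens is a hypothetical safety margin that leaves room for
    the prompt itself and for estimation error.
    """
    budget = MAX_INPUT_TOKENS - reserve_tokens
    if budget <= 0:
        raise ValueError("reserve_tokens must be smaller than the input limit")
    if estimate_tokens(text) <= budget:
        return [text]
    # Naive fixed-width split by characters; a real pipeline would
    # likely split on paragraph or section boundaries instead.
    chunk_chars = budget * ASSUMED_CHARS_PER_TOKEN
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
```

With these assumptions, most long documents come back as a single chunk, and only inputs beyond roughly four million characters are split.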
Use cases
Aligned with public Gemini 3.1 Flash Lite positioning.

Translation backfills, labeling queues, extraction, and first-pass classification as a cheap layer—escalate outliers upstream.

Send text, images, video, audio, or PDFs in one request for long docs and batch content.

Cheap agent substeps, retrieval cleanup, and structured preprocessing in multi-model pipelines (per chat tools).
Why LimaxAI
Same chat workspace as GPT, Claude, and other frontier models—no separate Gemini console.
Route translation, extraction, and classification to Flash Lite; escalate hard cases to Gemini 3.1 Pro or others.
Usage is billed under LimaxAI points rules, which keeps cost comparisons across a team straightforward.
Same streaming pipeline as other chat models for long replies and iteration.
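The route-then-escalate pattern described above can be sketched as follows. This is an illustration under stated assumptions, not LimaxAI's or Google's API: `cheap` and `strong` are hypothetical callables standing in for whatever sends a prompt to Flash Lite or to a stronger model, and the `{"label": ...}` JSON shape is an assumed output contract.

```python
import json
from typing import Callable, Optional

# Hypothetical model caller: takes a prompt, returns the raw text reply.
ModelFn = Callable[[str], str]


def parse_label(raw: str, allowed: set[str]) -> Optional[str]:
    """Return the label if raw is JSON like {"label": "..."} with an allowed value."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    label = data.get("label")
    if isinstance(label, str) and label in allowed:
        return label
    return None


def classify(
    prompt: str,
    cheap: ModelFn,
    strong: ModelFn,
    allowed: set[str],
    cheap_attempts: int = 2,
) -> tuple[str, str]:
    """Try the cheap model first; escalate when its output fails validation."""
    for _ in range(cheap_attempts):
        label = parse_label(cheap(prompt), allowed)
        if label is not None:
            return label, "cheap"
    # Every cheap attempt failed validation, so escalate once.
    label = parse_label(strong(prompt), allowed)
    if label is None:
        raise ValueError("no valid label from either model")
    return label, "strong"
```

Returning which tier produced the answer makes it easy to measure how often escalation happens, which is the number that decides whether the cheap first pass is paying off.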
Gemini family
Flash Lite is the lowest-cost route; upgrade within the family for stronger multimodal or reasoning performance.
Public specs evolve; available entries follow LimaxAI’s model list.
| Dimension | 3.1 Flash Lite | 3 Flash Preview | 3.1 Pro |
|---|---|---|---|
| Positioning | Low cost · high throughput | Stronger multimodal | Frontier reasoning |
| Context | ~1.05M input tokens | Varies by release | Varies by release |
| Max output | 65,536 tokens | Varies by release | Varies by release |
| Typical tasks | Translate · extract · classify | General Flash | Hard reasoning |
| Pick when | Cost & throughput first | Capability bump | Quality first |
Get started
Try Gemini 3.1 Flash Lite in LimaxAI chat.
Open Chat and pick Gemini 3.1 Flash Lite (or the closest-named entry in the model list).
Start with translation, extraction, or short classification prompts; watch latency and quality.
Switch hard cases to Gemini 3.1 Pro and check point costs on the pricing page.
FAQ
Public materials position Flash Lite as the economical Flash route for high-throughput work where price and throughput often matter more than peak quality.
Public docs cite up to ~1,050,000 input tokens and 65,536 output tokens. LimaxAI limits follow the model list and gateway rules.
Public specs support text, image, video, audio, and PDF inputs with text output—subject to chat attachment capabilities.
API flows often use gemini-3.1-flash-lite-preview. In LimaxAI chat, pick the matching list entry—names may change with configuration.
Stay on Flash Lite for retry-friendly, cost-sensitive translation, extraction, classification, labeling, and document processing; upgrade when quality or difficulty demands it.
Public materials list no image/audio generation, Live API, or Google Maps grounding—best for low-cost text output workflows.
Follow LimaxAI points rules for the selected chat model—see the pricing page and your usage history.