Gemma 4

Model Variants

Four purpose-built variants spanning edge devices to workstation-class hardware, all released under the Apache 2.0 license.

Dense

Gemma 4 E2B

2B Parameters · 128K Context

Ultra-lightweight model optimized for on-device and edge deployments. Delivers strong performance in a compact footprint suitable for mobile and IoT applications.

Use case: Mobile apps, edge devices, IoT, real-time on-device inference

Dense

Gemma 4 E4B

4B Parameters · 128K Context

Balanced model offering an excellent quality-to-size ratio. Ideal for laptop and desktop deployments where resources are limited but high-quality output is required.

Use case: Laptop inference, desktop assistants, lightweight server deployments

Mixture of Experts (128 Experts)

Gemma 4 26B A4B

26B Total / 4B Active Parameters · 256K Context

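Sparse Mixture-of-Experts architecture with 128 experts that activates only about 4B of its 26B parameters for each token. It targets large-model quality at roughly small-model compute cost, though all 26B parameters must still fit in memory (see VRAM Requirements).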
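To make the compute saving concrete, the sketch below compares approximate per-token forward-pass cost for the 26B A4B and 31B variants. It assumes the common rule of thumb of about 2 FLOPs per active parameter per token, which ignores attention over the context and is not a figure from the Gemma documentation. The function name is illustrative only.

```python
# Rough per-token compute comparison, assuming ~2 FLOPs per active
# parameter per token (a rule of thumb, not a published Gemma figure).

FLOPS_PER_PARAM = 2  # assumption: one multiply and one add per weight


def forward_gflops_per_token(active_params_billions: float) -> float:
    """Approximate forward-pass GFLOPs per token for the given active size."""
    return active_params_billions * FLOPS_PER_PARAM


moe_total_b, moe_active_b = 26.0, 4.0  # Gemma 4 26B A4B
dense_b = 31.0                         # Gemma 4 31B (every parameter active)

active_fraction = moe_active_b / moe_total_b
moe_cost = forward_gflops_per_token(moe_active_b)
dense_cost = forward_gflops_per_token(dense_b)

print(f"Active fraction of 26B A4B: {active_fraction:.1%}")
print(f"26B A4B: ~{moe_cost:.0f} GFLOPs/token")
print(f"31B:     ~{dense_cost:.0f} GFLOPs/token")
print(f"Dense / MoE compute ratio: {dense_cost / moe_cost:.2f}x")
```

Under these assumptions the MoE variant touches roughly 15% of its weights per token, which is where its throughput advantage for high-volume serving would come from.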

Use case: High-throughput serving, cost-efficient production, multi-tenant APIs

Dense

Gemma 4 31B

31B Parameters · 256K Context

Flagship dense model and the highest-quality variant in the lineup. Best choice when maximum quality and reasoning depth are the priority.

Use case: Research, complex reasoning, professional content generation, agentic workflows

Model Comparison

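| | E2B | E4B | 26B MoE | 31B Dense |
| --- | --- | --- | --- | --- |
| Parameters | 2B | 4B | 26B (A4B) | 31B |
| Architecture | Dense | Dense | MoE (128 experts) | Dense |
| Context Length | 128K | 128K | 256K | 256K |
| Modalities | Text, Image, Audio | Text, Image, Audio | Text, Image, Video | Text, Image, Video, Audio |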
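As a sketch of how the comparison above can drive model selection, the snippet below encodes the rows as plain Python data and picks the smallest variant that satisfies a required context length and set of modalities. The `Variant` class, the `VARIANTS` list, and the `pick_variant` helper are hypothetical names for illustration, not part of any Gemma tooling. Context lengths are kept in units of K tokens, as listed in the comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    total_params_b: float   # total parameters, in billions
    architecture: str
    context_k: int          # context length in K tokens, as listed
    modalities: frozenset


# Values copied from the comparison rows, ordered smallest to largest.
VARIANTS = [
    Variant("E2B", 2, "Dense", 128, frozenset({"text", "image", "audio"})),
    Variant("E4B", 4, "Dense", 128, frozenset({"text", "image", "audio"})),
    Variant("26B A4B", 26, "MoE (128 experts)", 256,
            frozenset({"text", "image", "video"})),
    Variant("31B", 31, "Dense", 256,
            frozenset({"text", "image", "video", "audio"})),
]


def pick_variant(min_context_k: int, needed: set) -> Optional[Variant]:
    """Return the smallest variant meeting both requirements, or None."""
    for v in VARIANTS:
        if v.context_k >= min_context_k and needed <= v.modalities:
            return v
    return None


print(pick_variant(128, {"text", "audio"}).name)   # E2B
print(pick_variant(256, {"text", "video"}).name)   # 26B A4B
print(pick_variant(256, {"audio", "video"}).name)  # 31B
```

Going by the listed modalities, a workload that needs both audio and video is served only by the 31B variant, which the third call reflects.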

Hardware Recommendations

Find the right hardware configuration for your Gemma 4 deployment based on model variant and use case.

Smartphone / Edge Device: Gemma 4 E2B

Laptop / Desktop: Gemma 4 E4B

Desktop GPU: Gemma 4 26B MoE

Workstation / Server: Gemma 4 31B Dense

VRAM Requirements

| Model | BF16 | INT8 | INT4 |
| --- | --- | --- | --- |
| Gemma 4 E2B | 4 GB | 2.5 GB | 1.5 GB |
| Gemma 4 E4B | 8 GB | 5 GB | 3 GB |
| Gemma 4 26B (MoE) | 52 GB | 28 GB | 16 GB |
| Gemma 4 31B (Dense) | 62 GB | 33 GB | 18 GB |
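The BF16 column matches a simple weights-only estimate of 2 bytes per parameter, while the INT8 and INT4 figures sit somewhat above the 1 and 0.5 bytes per parameter that a naive estimate would give. The sketch below reproduces that arithmetic and picks the largest variant whose listed requirement fits a given VRAM budget. The helper names are hypothetical, 1 GB is taken as 10^9 bytes, and the estimate ignores KV cache and activation memory, so it is best read as a lower bound rather than a sizing guarantee.

```python
from typing import Optional

# Assumed storage cost per parameter for each precision.
BYTES_PER_PARAM = {"BF16": 2.0, "INT8": 1.0, "INT4": 0.5}

# VRAM figures (GB) copied from the table above, smallest model first.
VRAM_TABLE_GB = {
    "Gemma 4 E2B": {"params_b": 2, "BF16": 4, "INT8": 2.5, "INT4": 1.5},
    "Gemma 4 E4B": {"params_b": 4, "BF16": 8, "INT8": 5, "INT4": 3},
    "Gemma 4 26B (MoE)": {"params_b": 26, "BF16": 52, "INT8": 28, "INT4": 16},
    "Gemma 4 31B (Dense)": {"params_b": 31, "BF16": 62, "INT8": 33, "INT4": 18},
}


def weights_only_gb(params_billions: float, precision: str) -> float:
    """Weights-only estimate: parameters x bytes per parameter, in GB."""
    return params_billions * BYTES_PER_PARAM[precision]


def largest_model_that_fits(vram_gb: float, precision: str) -> Optional[str]:
    """Largest variant whose listed requirement is within vram_gb.

    Relies on VRAM_TABLE_GB being ordered from smallest to largest model.
    """
    best = None
    for name, row in VRAM_TABLE_GB.items():
        if row[precision] <= vram_gb:
            best = name
    return best


# Compare the listed INT4 figures with the naive weights-only estimate.
for name, row in VRAM_TABLE_GB.items():
    estimate = weights_only_gb(row["params_b"], "INT4")
    print(f"{name}: listed {row['INT4']} GB, weights-only {estimate} GB")

print(largest_model_that_fits(24, "INT4"))  # Gemma 4 31B (Dense)
print(largest_model_that_fits(24, "BF16"))  # Gemma 4 E4B
print(largest_model_that_fits(1, "INT4"))   # None
```

Going by the listed figures, a 24 GB card would hold the 31B model at INT4 but only the E4B model at BF16, which shows how strongly precision affects the choice of variant.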