Gemma 4 12B
Available · Google DeepMind · Open source
A dense 12B member of the Gemma 4 family with a unified, encoder-free multimodal architecture: vision and audio are projected straight into the LLM backbone. First medium-size Gemma to natively ingest audio; runs on a laptop with 16 GB of memory. 256K-token context window, Apache-2.0 license.
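As a rough sanity check on the 16 GB laptop claim, the arithmetic below estimates how much memory the weights alone would need at a few common precisions. This is a back-of-the-envelope sketch, not a vendor figure: the helper name `weight_memory_gb` is hypothetical, "12B" is treated as exactly 12 billion parameters, and the listing does not say which precision the laptop claim assumes. The estimate also ignores the KV cache and activations, which grow with context length.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: "12B" in the listing means exactly 12 billion parameters.
PARAMS = 12e9

# Precisions chosen for illustration; the listing does not name one.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions, 16-bit weights alone (about 24 GB) would not fit in 16 GB, so the laptop claim most plausibly refers to a quantized build at 8 bits or fewer per parameter.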
Specifications
- License
- Open source · Apache-2.0
- Weights
- Downloadable
- Architecture
- Dense
- Parameters
- 12B
- Context window
- 256K tokens
- Max output
- Not listed
- Knowledge cutoff
- Not listed
- Price (in / out, $/M)
- Not listed
- Modalities
- Text, Vision, Audio, Code
Benchmarks
No benchmark scores recorded yet.
Vendor-reported figures are claims until independently verified. See methodology.