Open coding-focused Kimi model with 1T total / 32B active parameters, native image/video input, and always-on thinking mode.
MIT-licensed GLM flagship focused on long-horizon coding, agentic engineering, and IndexShare sparse-attention reuse.
Natively multimodal model with 428B total / 23B active parameters, a one-million-token context, and MiniMax Sparse Attention.
Access suspended three days after launch under an export-control directive citing national security.
OpenAI previewed GPT-5.6, claiming the largest context window of any frontier model.
Available across the Claude API, AWS, and Microsoft Foundry.
Largest Nemotron 3 model appears on NVIDIA NIM with downloadable weights, a 1M-token context, and positioning for agentic reasoning.
Lower-cost multimodal sibling of Qwen3.7-Max with text, image, and video input and a 1M-token context.