LLM model catalog

MoE · 1T · 262K ctx · Jun 18, 2026

Moonshot's open coding-focused agentic model built on K2.6, with native vision/video input, forced thinking mode, and stronger long-horizon software-engineering performance.

GLM-5.2

MoE · 753B · 1M ctx · Jun 17, 2026

Z.ai's latest open flagship for long-horizon coding, agentic engineering, and million-token workflows, adding IndexShare sparse-attention reuse over GLM-5.1.

MiniMax-M3

MiniMax · Frontier · Open weights

Native multimodal MiniMax model with a one-million-token context, sparse attention, and agentic coding/cowork positioning.

MoE · 428B · 1M ctx · Jun 16, 2026

GPT-5.6

OpenAI · Frontier · Proprietary

OpenAI's mid-2026 flagship, headlined by an industry-leading 1.5M-token context window and long-horizon agentic tool use.

MoE · Undisc. · 1.5M ctx · Jun 9, 2026

Claude Fable 5

Withdrawn

The public, guardrailed sibling of Mythos and Anthropic's most capable widely released model, built for long-horizon agentic work. Launched June 9, 2026 across the Claude API, AWS, and Microsoft Foundry, then pulled three days later under a US government export-control directive barring access by foreign nationals.

— · Undisc. · — ctx · Jun 9, 2026

Nemotron 3 Ultra 550B-A55B

NVIDIA · Frontier · Open weights

NVIDIA's largest Nemotron 3 open-weight hybrid Mamba-Transformer MoE, tuned for agentic reasoning, coding, planning, and tool calling.

Hybrid · 550B · 1M ctx · Jun 4, 2026

Claude Opus 4.8

— · Undisc. · 500K ctx · May 28, 2026

Anthropic's most capable model, with strengthened agentic and long-running task performance.

MiniMax-M2.7

MiniMax · Frontier · Open weights

Open-weight agentic model from MiniMax focused on real-world software engineering, office tasks, tool use, and self-improving training workflows.

MoE · 229.9B · — ctx · May 26, 2026

Gemini 3.5 Pro

Google DeepMind · Frontier · Proprietary

Announced at Google I/O 2026; emphasizes deep multimodal reasoning over a 2M-token context.

MoE · Undisc. · 2M ctx · May 19, 2026

Qwen3.6-27B

Dense · 27B · 256K ctx · May 12, 2026

Dense 27B model that punches far above its weight on agentic coding and is easy to self-host on a single GPU node.

Grok 4.3

MoE · Undisc. · 1M ctx · May 6, 2026

xAI · Frontier · Proprietary

xAI's agentic flagship with a 1M-token context and aggressive API pricing.

DeepSeek V4-Flash

MoE · 284B · 1M ctx · Apr 24, 2026

Efficient V4 companion model with 284B total / 13B active parameters and the same one-million-token context window.

DeepSeek V4-Pro

MoE · 1.6T · 1M ctx · Apr 24, 2026

Preview-series sparse MoE flagship with a one-million-token context window and 1.6T total / 49B active parameters.

Hunyuan-A13B-Instruct

Tencent Hunyuan · Open weights

Tencent Hunyuan open-weight fine-grained MoE model with 80B total parameters and 13B active parameters, optimized for agentic tool use.

MoE · 80B · — ctx · Apr 22, 2026

GLM-5.1

Z.ai agentic-engineering follow-up to GLM-5, with stronger coding performance and better long-horizon tool-use behavior.

MoE · 754B · — ctx · Apr 8, 2026

Claude Mythos

A frontier model Anthropic disclosed on April 7, 2026 but declined to release publicly, citing security risk. Shipped only via 'Project Glasswing' to ~50 defensive-security partners, then suspended on June 12, 2026 under a US government directive.

— · Undisc. · — ctx · Apr 7, 2026

Gemma 4 31B

Google DeepMind · Open source

Google DeepMind's Gemma 4 advanced-reasoning open model for personal computers, part of the April 2026 Gemma 4 family.

Dense · 31B · — ctx · Apr 2, 2026

Kimi K2.6

MoE · 1T · 256K ctx · Mar 30, 2026

Moonshot's open native multimodal agentic model for long-horizon coding, visual interface generation, and autonomous tool orchestration.

Mistral Medium 3.5

Dense · 128B · 256K ctx · Mar 18, 2026

Mistral AI · Open weights

Dense 128B open-weight model with a 256k context and strong coding performance for its size.

Nemotron 3 Super 120B-A12B

NVIDIA · Frontier · Open weights

Open-weight hybrid Mamba-Transformer MoE designed for collaborative agents and high-volume enterprise workflows.

Hybrid · 120B · 1M ctx · Mar 16, 2026

Step-3.5-Flash

MoE · 196B · 256K ctx · Mar 14, 2026

StepFun · Open source

StepFun's Apache-licensed sparse MoE model for fast agentic execution, coding, math, browsing, and tool-use workflows.

Sarvam-105B

MoE · 105B · 128K ctx · Mar 6, 2026

Sarvam AI · Open source

Apache-licensed Indian-context MoE from Sarvam AI, optimized for reasoning, coding, agentic tasks, and 22 Indian languages.

GPT-5.4

OpenAI · Frontier · Proprietary

Workhorse GPT-5 release with a dedicated Thinking mode; widely deployed across ChatGPT and the API.

MoE · Undisc. · 400K ctx · Mar 5, 2026

Qwen3.5-397B

Alibaba (Qwen) · Frontier · Open source

Native vision-language MoE supporting 201 languages with a 1M-token context.

MoE · 397B · 1M ctx · Feb 20, 2026

Gemini 3.1 Pro

Google DeepMind · Frontier · Proprietary

Generally available multimodal flagship with native tool use and a 2M-token context.

MoE · Undisc. · 2M ctx · Feb 19, 2026

GLM-5

Z.ai flagship for complex systems engineering and long-horizon agentic tasks, scaling the GLM line to 744B total / 40B active parameters.

MoE · 744B · — ctx · Feb 11, 2026

Claude Opus 4.6

— · Undisc. · 200K ctx · Feb 5, 2026

Introduced genuinely autonomous multi-file coding and stronger computer use.

Qwen3-Coder-Next

Hybrid · 80B · 262K ctx · Feb 3, 2026

Apache-licensed Qwen3-Next coding-agent model with 80B total / 3B active parameters, 256K context, and long-horizon tool-use training.

Kimi K2.5

MoE · 1T · 256K ctx · Jan 27, 2026

Open multimodal Kimi model that adds native visual agentic intelligence, instant and thinking modes, and agent-swarm workflows on top of the K2 base.

GLM-4.7

Coding-focused GLM release with improved multilingual agentic coding, terminal tasks, tool use, and interface generation.

MoE · 358B · — ctx · Jan 8, 2026

OLMo 3 Think 32B

Dense · 32B · — ctx · Dec 15, 2025

Ai2's fully open thinking model with public weights, code, data, checkpoints, and training details across the OLMo 3 pipeline.

Nemotron 3 Nano 30B-A3B

Hybrid · 30B · 1M ctx · Dec 15, 2025

Efficient Nemotron 3 MoE checkpoint for agentic reasoning and coding, activating about 3B parameters while supporting 1M-token contexts.

GLM-4.6V

Z.ai (Zhipu AI) · Open source

Open 106B-class vision-language model with native multimodal function calling for visual agents.

MoE · 106B · 128K ctx · Dec 8, 2025

Mistral Large 3

Mistral AI · Frontier · Open weights

Mistral's largest open-weight MoE, aimed at frontier reasoning while remaining self-hostable.

MoE · 675B · 256K ctx · Dec 2, 2025

DeepSeek-V3.2

MoE · 685B · 128K ctx · Dec 1, 2025

Reasoning-first agent model that adds DeepSeek Sparse Attention and thinking directly inside tool-use workflows.

DeepSeek-V3.2-Speciale

MoE · 685B · 128K ctx · Dec 1, 2025

High-compute reasoning variant of V3.2, positioned for olympiad-level math, programming, and other deep reasoning tasks.

LFM2 1.2B

Hybrid · 1.17B · 33K ctx · Nov 28, 2025

Liquid AI · Open weights

Liquid AI hybrid model for efficient CPU/GPU/NPU local deployment, using short convolutions plus attention blocks.

Kimi K2 Thinking

Open K2 reasoning-agent variant that interleaves step-by-step thinking with tool calls and supports stable 200-300 step tool-use trajectories.

MoE · 1T · 256K ctx · Nov 6, 2025

Kimi-Linear-48B-A3B-Instruct

Hybrid · 48B · 1.0M ctx · Oct 31, 2025

MIT-licensed hybrid linear-attention model using Kimi Delta Attention, built for million-token contexts with much lower KV-cache usage.

GLM-4.6

MoE · 357B · 200K ctx · Sep 30, 2025

Agentic reasoning and coding upgrade over GLM-4.5, expanding the text context window from 128K to 200K tokens.

DeepSeek-V3.2-Exp

MoE · 685B · 128K ctx · Sep 29, 2025

Experimental checkpoint that introduced DeepSeek Sparse Attention as an efficiency bridge between V3.1-Terminus and V3.2.

DeepSeek-V3.1-Terminus

MoE · 685B · 128K ctx · Sep 22, 2025

Stability update to V3.1 focused on language consistency, code-agent reliability, and search-agent behavior.

Kimi K2 Instruct 0905

September 2025 K2 update with stronger agentic coding, better frontend generation, and a doubled 256K context window.

MoE · 1T · 256K ctx · Sep 5, 2025

Gemma 3 27B

Dense · 27B · 128K ctx · Sep 4, 2025

Google's open multimodal model: 128k context, 140+ languages, runs on a single GPU.

DeepSeek-V3.1

MoE · 671B · 128K ctx · Aug 21, 2025

Hybrid thinking/non-thinking release that upgraded tool calling, long-context training, and agent task performance.

Seed-OSS-36B-Instruct

ByteDance Seed · Open source

ByteDance Seed's Apache-licensed long-context reasoning and agent model, with controllable thinking budgets and a native 512K context.

Dense · 36B · 512K ctx · Aug 20, 2025

GLM-4.5V

Z.ai (Zhipu AI) · Open source

Vision-language GLM based on GLM-4.5-Air, covering image, video, document, grounding, and GUI-agent tasks.

MoE · 106B · — ctx · Aug 11, 2025

gpt-oss-20b

MoE · 21B · 128K ctx · Aug 5, 2025

OpenAI · Open source

Smaller gpt-oss reasoning model optimized for local inference on systems with about 16GB of memory.

gpt-oss-120b

MoE · 117B · 128K ctx · Aug 5, 2025

OpenAI · Open source

OpenAI's larger open-weight reasoning model, a 117B-total / 5.1B-active MoE with 128K context for local and self-hosted deployment.

Falcon-H1 34B

Hybrid · 34B · 256K ctx · Jul 31, 2025

A hybrid attention + state-space-model (SSM) design that matches 70B-class models with fewer parameters.

GLM-4.5

MoE · 355B · 128K ctx · Jul 28, 2025

Open agentic, reasoning, and coding foundation model that marked Z.ai's international rebrand and its MIT-licensed GLM push.

GLM-4.5-Air

Z.ai (Zhipu AI) · Open source

Compact GLM-4.5 companion with 106B total / 12B active parameters for efficient agentic reasoning and coding.

MoE · 106B · 128K ctx · Jul 28, 2025

EXAONE 4.0 32B

LG AI Research · Open weights

LG AI Research's unified model with non-reasoning and reasoning modes, agentic tool use, and English, Korean, and Spanish support.

Dense · 32B · — ctx · Jul 15, 2025

Kimi K2 Instruct

MoE · 1T · 128K ctx · Jul 11, 2025

Original open K2 post-trained model: a 1T-parameter MoE optimized for coding, reasoning, and tool-using agentic workflows.

Grok 4

xAI's fourth-generation Grok line, preceding the later 4.x API updates already tracked in the catalog.

— · Undisc. · — ctx · Jul 9, 2025

SmolLM3 3B

Dense · 3B · 128K ctx · Jul 8, 2025

Hugging Face · Open source

Hugging Face's fully open 3B multilingual long-context model with optional reasoning mode and 128K context.

ERNIE-4.5-300B-A47B

MoE · 300B · 128K ctx · Jun 30, 2025

Baidu · Open source

Baidu's open ERNIE 4.5 language MoE, part of a 10-variant Apache-licensed model family built with heterogeneous multimodal MoE training.

ERNIE-4.5-VL-424B-A47B

MoE · 424B · 128K ctx · Jun 30, 2025

Baidu · Open source

Baidu's largest ERNIE 4.5 vision-language MoE, supporting text, image, and video inputs with thinking and non-thinking modes.

Kimi-VL-A3B-Thinking-2506

MoE · 16B · 128K ctx · Jun 21, 2025

Updated MIT-licensed Kimi-VL reasoning model with better multimodal reasoning, video understanding, high-resolution perception, and lower thinking-token use.

Kimi-Dev-72B

Dense · 73B · — ctx · Jun 17, 2025

MIT-licensed coding LLM trained with repository-level reinforcement learning for software issue resolution.

MiniMax-M1-80k

MiniMax · Frontier · Open source

Open Apache-licensed hybrid-attention reasoning model with 456B total / 45.9B active parameters and a native 1M-token context.

Hybrid · 456B · 1M ctx · Jun 16, 2025

Magistral Medium

— · Undisc. · — ctx · Jun 10, 2025

Mistral AI · Proprietary

Mistral's first dedicated reasoning model family, released in Small open-weight and Medium enterprise/API tiers.

Magistral Small

Dense · 24B · 40K ctx · Jun 10, 2025

Mistral AI · Open weights

Open-weight 24B reasoning model from Mistral's Magistral family, popular for local reasoning experiments.

DeepSeek-R1-0528

MoE · 671B · 128K ctx · May 28, 2025

Major R1 reasoning update with stronger math, programming, general logic, function calling, and reduced hallucinations.

Claude Opus 4

— · Undisc. · 200K ctx · May 22, 2025

First Claude 4 Opus model, positioned for long-running agentic and coding work before the 4.x point releases.

Seed Thinking v1.5

ByteDance Seed · Proprietary

ByteDance Seed reasoning model focused on long-horizon thinking and problem solving.

— · Undisc. · — ctx · May 22, 2025

Sarvam-M

Dense · Undisc. · — ctx · May 21, 2025

Sarvam AI · Open weights

Sarvam's medium-scale open model for multilingual Indian-language chat, reasoning, and translation tasks.

Phi-4 Reasoning

Dense · 14B · — ctx · Apr 30, 2025

Phi-4 reasoning-specialized model family for math, science, and chain-of-thought style tasks.

Granite 3.3 8B

Dense · 8B · 128K ctx · Apr 30, 2025

Granite 3.3 text update for enterprise chat, RAG, and instruction-following workflows.

Qwen3-235B-A22B

MoE · 235B · 128K ctx · Apr 28, 2025

Largest open Qwen3 MoE, introducing hybrid thinking/non-thinking modes and 119-language coverage.

Kimi-Audio-7B-Instruct

Hybrid · 10B · — ctx · Apr 25, 2025

Open audio foundation model for audio understanding, generation, speech recognition, audio QA, captioning, and speech conversation.

Kimi-VL-A3B-Instruct

MoE · 16B · 128K ctx · Apr 17, 2025

Efficient MIT-licensed vision-language MoE for OCR, image/video understanding, long documents, and OS-style agent tasks.

OpenAI o3

— · Undisc. · — ctx · Apr 16, 2025

Reasoning model released alongside o4-mini with tool use, image reasoning, and stronger agentic problem solving.

GPT-4.1

— · Undisc. · 1M ctx · Apr 14, 2025

API model family focused on coding, instruction following, and one-million-token long-context work.

Llama 4 Maverick

Meta AI · Frontier · Open weights

Meta's flagship open-weight MoE; highest MMLU among open models at release.

MoE · 400B · 1M ctx · Apr 5, 2025

Llama 4 Scout

MoE · 109B · 10M ctx · Apr 5, 2025

Efficient open-weight MoE designed for very long context on modest hardware.

Llama-3.3-Nemotron-Super-49B

Dense · 49B · 128K ctx · Apr 2, 2025

Open Llama Nemotron reasoning model from NVIDIA's 2025 Nemotron family.

Qwen2.5-Omni-7B

Local omni-modal Qwen model that supports text, image, audio, video, and speech generation in a 7B package.

Dense · 7B · — ctx · Mar 26, 2025

DeepSeek-V3-0324

MoE · 671B · 128K ctx · Mar 25, 2025

Post-R1 V3 update with improved reasoning, front-end coding, Chinese writing, search, and function calling.

Gemini 2.5 Pro

— · Undisc. · 1M ctx · Mar 25, 2025

Reasoning-focused Gemini 2.5 model that made thinking a core part of Google's flagship model line.

Mistral Small 3.1

Dense · 24B · 128K ctx · Mar 17, 2025

Apache-licensed Small update adding vision and a 128K context window to the efficient 24B line.

ERNIE X1

— · Undisc. · — ctx · Mar 16, 2025

Baidu's reasoning model released alongside ERNIE 4.5 before the open ERNIE 4.5 weights.

OLMo 2 32B

Dense · 32B · 4K ctx · Mar 13, 2025

A fully open model (weights, data, and training code all public) and the first such model to outperform GPT-3.5 and GPT-4o mini.

Command A

Dense · 111B · 256K ctx · Mar 13, 2025

Cohere · Open weights

Enterprise-grade model tuned for RAG, tool use, and multilingual business workloads.

Granite 3.2 8B

Dense · 8B · 128K ctx · Feb 26, 2025

Granite 3.2 update with reasoning controls and multimodal/document-oriented Granite variants.

Claude 3.7 Sonnet

— · Undisc. · 200K ctx · Feb 24, 2025

Anthropic's first hybrid-reasoning Sonnet. Shut down May 11, 2026 as the 4.x line matured.

Moonlight-16B-A3B-Instruct

MIT-licensed 16B/3B-active MoE trained with Moonshot's scalable Muon optimizer experiments.

MoE · 16B · 8K ctx · Feb 24, 2025

DeepHermes 3 Llama 3 8B

Nous Research · Open weights

Nous reasoning-oriented Hermes model trained to combine concise answers with optional deep reasoning traces.

Dense · 8B · 8K ctx · Feb 18, 2025

Grok 3

— · Undisc. · — ctx · Feb 17, 2025

xAI's third-generation model family, introduced with stronger reasoning, search, and coding modes.

Dolphin 3.0 Llama 3.1 8B

Cognitive Computations · Open weights

Popular local assistant model tuned for coding, math, function calling, and agentic workflows.

Dense · 8B · 128K ctx · Feb 2, 2025

Mistral Small 3

Dense · 24B · 32K ctx · Jan 30, 2025

A latency-optimized 24B dense model under Apache-2.0 — a popular local-deployment workhorse.

Qwen2.5-Max

Alibaba (Qwen) · Proprietary

Proprietary MoE flagship for the Qwen2.5 generation, released through Qwen Chat and Alibaba Cloud APIs.

MoE · Undisc. · — ctx · Jan 29, 2025

Qwen2.5-VL-72B

Dense · 72B · 128K ctx · Jan 26, 2025

Vision-language Qwen2.5 model for image, document, video, and agentic visual grounding tasks.

Doubao-1.5-pro

ByteDance Seed · Proprietary

Doubao 1.5 Pro update positioned for stronger multimodal, reasoning, and agentic work in Volcano Engine.

— · Undisc. · — ctx · Jan 22, 2025

DeepSeek-R1

MoE · 671B · 128K ctx · Jan 20, 2025

Breakout open reasoning model trained with large-scale reinforcement learning and released with weights under MIT.

Kimi k1.5

— · Undisc. · — ctx · Jan 20, 2025

Moonshot AI · Proprietary

Moonshot's multimodal reinforcement-learning reasoning model, reported as matching OpenAI o1 on math, coding, and multimodal reasoning.

MiniMax-01

Hybrid · 456B · 4M ctx · Jan 15, 2025

MiniMax · Open weights

Open MiniMax generation with MiniMax-Text-01 and MiniMax-VL-01 long-context models.

DeepSeek-V3

MoE · 671B · 128K ctx · Dec 26, 2024

The 671B/37B-active MoE release that made DeepSeek a central open-model lab before the R1 breakthrough.

Step-2

— · Undisc. · — ctx · Dec 23, 2024

StepFun · Proprietary

Second-generation StepFun foundation model line with larger-scale multimodal and reasoning ambitions.

Granite 3.1 8B

Dense · 8B · 128K ctx · Dec 18, 2024

IBM's enterprise-focused open model with a 128k context, Apache-2.0 licensed.

Falcon 3 10B

Dense · 10B · 32K ctx · Dec 17, 2024

Open model from the UAE's TII, designed to run on light infrastructure, including laptops.

Command R7B

Dense · 8B · 128K ctx · Dec 13, 2024

Cohere · Open weights

Cohere's smallest, fastest R-series model, tuned for RAG and tool use on modest hardware.

Phi-4

Dense · 14B · 16K ctx · Dec 12, 2024

Microsoft · Open source

A 14B dense model that rivals far larger ones on math and reasoning, under a permissive MIT license.

Gemini 2.0 Flash

— · Undisc. · 1M ctx · Dec 11, 2024

First Gemini 2.0 release, built for native multimodal input/output, tool use, and agentic product integrations.

EXAONE 3.5 32B

LG AI Research · Open weights

EXAONE 3.5 32B open-weight model for bilingual reasoning, coding, and long-context tasks.

Dense · 32B · 32K ctx · Dec 9, 2024

Llama 3.3 70B

Dense · 70B · 128K ctx · Dec 6, 2024

Late-2024 70B Llama update delivering much of the 405B instruction-following quality at lower serving cost.

OpenAI o1

General release of OpenAI's o1 reasoning model with stronger deliberative reasoning and multimodal ChatGPT integration.

— · Undisc. · — ctx · Dec 5, 2024

Amazon Nova Pro

— · Undisc. · 300K ctx · Dec 3, 2024

AWS-native multimodal model with a 300k context; size and architecture undisclosed.

Amazon Nova Lite

— · Undisc. · 300K ctx · Dec 3, 2024

Lower-cost multimodal Nova understanding model for text, image, and video inputs.

QwQ-32B-Preview

Dense · 32B · 32K ctx · Nov 28, 2024

Qwen's first public reasoning-preview model, aimed at math, coding, and deliberate problem solving.

Tulu 3 405B

Allen Institute for AI (Ai2) · Open weights

Ai2's post-trained open instruction model line, scaling the Tulu recipe to Llama 3.1 405B.

Dense · 405B · 128K ctx · Nov 21, 2024

DeepSeek-R1-Lite-Preview

— · Undisc. · — ctx · Nov 20, 2024

DeepSeek · Proprietary

Reasoning-preview model exposed in DeepSeek Chat ahead of the open DeepSeek-R1 release.

Qwen2.5-Coder-32B

Dense · 32B · 128K ctx · Nov 12, 2024

Code-specialized Qwen2.5 model family, with the 32B checkpoint as the flagship open coding model.

Hunyuan-Large

Tencent Hunyuan · Open weights

Tencent's 389B total / 52B active open-weight Transformer MoE, released with a 256K pretraining context and 128K instruct context.

MoE · 389B · 128K ctx · Nov 4, 2024

SmolLM2 1.7B

Dense · 1.7B · — ctx · Nov 4, 2024

Hugging Face · Open source

Compact on-device model family trained on 11T tokens, popular for lightweight local chat and experimentation.

Claude 3.5 Haiku

— · Undisc. · 200K ctx · Oct 22, 2024

Fast, lower-cost Claude 3.5 model for latency-sensitive coding, tool-use, and customer-facing workloads.

Sarvam-1

Sarvam AI · Open weights

Sarvam's 2B open model trained for ten major Indian languages.

Dense · 2B · — ctx · Oct 22, 2024

Granite 3.0 8B

Dense · 8B · 4K ctx · Oct 21, 2024

Apache-licensed Granite 3.0 text model, part of IBM's push toward enterprise-friendly open models.

Yi-Lightning

MoE · Undisc. · — ctx · Oct 16, 2024

01.AI · Proprietary

01.AI's MoE API model that reached the global top-10 on Chatbot Arena, strong in Chinese, math, and coding.

Ministral 8B

Dense · 8B · 128K ctx · Oct 16, 2024

Mistral AI · Proprietary

Small Mistral model line optimized for edge and low-latency workloads.

Llama-3.1-Nemotron-70B

Dense · 70B · 128K ctx · Oct 15, 2024

NVIDIA-tuned Llama 3.1 70B instruction model optimized with Nemotron reward and alignment recipes.

Llama 3.2 90B Vision

Dense · 90B · 128K ctx · Sep 25, 2024

First Llama family release with native vision models, alongside smaller edge-oriented 1B and 3B text models.

Molmo 72B

Allen Institute for AI (Ai2) · Open weights

Open multimodal model family trained for strong image understanding, pointing, and visual grounding.

Dense · 72B · — ctx · Sep 25, 2024

Qwen2.5-72B

Dense · 72B · 128K ctx · Sep 19, 2024

Broad Qwen2.5 foundation-model update spanning general, coding, math, and multimodal descendants.

Pixtral 12B

Dense · 12B · 128K ctx · Sep 17, 2024

Mistral's first open multimodal model, adding image understanding to a Mistral text backbone.

OpenAI o1-preview

— · Undisc. · — ctx · Sep 12, 2024

OpenAI's first public reasoning-model preview, optimized to spend more inference time on hard math, coding, and science tasks.

Yi-Coder-9B

Dense · 9B · 128K ctx · Sep 5, 2024

01.AI · Open weights

01.AI's compact code model trained for repository-scale programming and code completion tasks.

DeepSeek-V2.5

MoE · 236B · 128K ctx · Sep 5, 2024

Unified DeepSeek V2 generation combining general-chat and coding strengths before the V3 series.

Hunyuan Turbo

Tencent Hunyuan · Proprietary

Tencent's faster, lower-cost Hunyuan update before the open Hunyuan-Large model card.

— · Undisc. · — ctx · Sep 5, 2024

OLMoE 1B-7B

Fully open sparse MoE model with 7B total and about 1B active parameters.

MoE · 7B · — ctx · Sep 3, 2024

Jamba 1.5 Large

Hybrid · 398B · 256K ctx · Aug 22, 2024

AI21 Labs · Open weights

Hybrid Mamba-Transformer MoE from Israel's AI21 Labs, with a 256K context and strong long-document throughput.

Phi-3.5 MoE

MoE · 42B · 128K ctx · Aug 20, 2024

Phi-3.5 mixture-of-experts model, scaling Microsoft's small-model line while preserving efficient active parameters.

Hermes 3 Llama 3.1 405B

Nous Research · Open weights

Large Hermes 3 instruction-tuned model built on Meta's Llama 3.1 405B.

Dense · 405B · 128K ctx · Aug 15, 2024

Grok-2

— · Undisc. · — ctx · Aug 13, 2024

Second-generation Grok release with Grok-2 and Grok-2 mini for chat, coding, reasoning, and image-enabled product experiences.

EXAONE 3.0 7.8B

LG AI Research · Open weights

LG's first open-weight EXAONE model, a compact bilingual instruction model for Korean and English.

Dense · 7.8B · — ctx · Aug 7, 2024

MiniCPM-V 2.6

OpenBMB · Open weights

8B vision-language model for local image, multi-image, OCR, and video understanding, with llama.cpp and Ollama support.

Dense · 8B · — ctx · Aug 2, 2024

Llama 3.1 405B

Dense · 405B · 128K ctx · Jul 23, 2024

Meta's first frontier-scale open Llama model, with 405B parameters, 128K context, multilingual support, and tool-use improvements.

Mistral NeMo

Dense · 12B · 128K ctx · Jul 18, 2024

Apache-licensed 12B model co-developed with NVIDIA, including a 128K context window and strong multilingual tokenization.

Gemma 2 27B

Dense · 27B · 8K ctx · Jun 27, 2024

Second-generation Gemma model, improving open-weight quality and efficiency at 9B and 27B sizes.

Claude 3.5 Sonnet

— · Undisc. · 200K ctx · Jun 20, 2024

Major Sonnet upgrade that became Anthropic's default high-intelligence workhorse for coding, writing, and visual reasoning.

DeepSeek-Coder-V2

MoE · 236B · 128K ctx · Jun 17, 2024

Open code-focused MoE built from DeepSeek-V2, expanding programming-language coverage and coding benchmark performance.

Nemotron-4 340B

Dense · 340B · 4K ctx · Jun 14, 2024

NVIDIA's large open model family for synthetic data generation and reward modeling.

Qwen2-72B

Dense · 72B · 128K ctx · Jun 7, 2024

Qwen2's largest dense model, introducing stronger multilingual support, coding/math gains, and long-context variants.

GLM-4-9B

Z.ai (Zhipu AI) · Open weights

Open GLM-4 9B model family, covering chat, long-context, and code-oriented variants.

Dense · 9B · 128K ctx · Jun 5, 2024

Codestral 22B

Dense · 22B · 32K ctx · May 29, 2024

Mistral AI · Open weights

Mistral's first code-specialized model, trained for code generation, fill-in-the-middle, and multi-language programming tasks.

Aya 23 35B

Dense · 35B · — ctx · May 23, 2024

Cohere · Open weights

Open multilingual research model covering 23 languages, released by Cohere For AI.

Doubao-pro

ByteDance Seed · Proprietary

ByteDance's commercial Doubao foundation model line for text, code, and assistant workloads.

— · Undisc. · — ctx · May 15, 2024

GPT-4o

— · Undisc. · 128K ctx · May 13, 2024

The 2024 omni-modal model that defined a generation of assistants. Deprecated in Feb 2026 and fully retired across ChatGPT on April 3, 2026.

Yi-1.5-34B

Dense · 34B · 4K ctx · May 13, 2024

01.AI · Open weights

Yi 1.5 update with stronger instruction following, coding, math, and multilingual performance.

Falcon 2 11B

Dense · 11B · 8K ctx · May 13, 2024

Falcon 2 generation, including text and vision-language 11B models under a permissive TII license.

DeepSeek-V2

MoE · 236B · 128K ctx · May 7, 2024

DeepSeek's first major MoE general model with Multi-head Latent Attention and low-cost API positioning.

Granite Code 34B

Dense · 34B · 8K ctx · May 6, 2024

Apache-2.0 code model from IBM's Granite Code family, used for local code generation and enterprise coding assistants.

Amazon Titan Text Premier

— · Undisc. · — ctx · Apr 30, 2024

Larger Titan text model for enterprise RAG, summarization, and agent workflows in Amazon Bedrock.

Snowflake Arctic

Snowflake AI Research · Open source

Apache-2.0 enterprise LLM with 480B total / 17B active parameters, optimized for SQL, code, and instruction following.

MoE · 480B · — ctx · Apr 24, 2024

Phi-3 Mini

Dense · 3.8B · 128K ctx · Apr 23, 2024

3.8B-parameter Phi-3 model released as a phone-capable small model with 4K and 128K variants.

Llama 3 70B

Dense · 70B · 8K ctx · Apr 18, 2024

First Llama 3 release, with 8B and 70B open models and a stronger tokenizer, data mix, and post-training stack.

Mixtral 8x22B

MoE · 141B · 64K ctx · Apr 17, 2024

Larger open Mixtral sparse MoE with 141B total and 39B active parameters, released under Apache-2.0.

abab6.5

— · Undisc. · 1M ctx · Apr 17, 2024

MiniMax · Proprietary

MiniMax's commercial long-context abab model generation before the open MiniMax-01 and M series.

WizardLM-2 8x22B

MoE · 141B · 66K ctx · Apr 15, 2024

Microsoft's WizardLM-2 MoE chat model, widely mirrored and run locally after its model-card release.

Step-1V

— · Undisc. · — ctx · Apr 12, 2024

StepFun · Proprietary

StepFun's first major vision-language model, released after the Step-1 language model.

CodeGemma 7B

Open code-specialized Gemma model for local code completion, generation, and instruction-following.

Dense · 7B · 8K ctx · Apr 9, 2024

Command R+

— · Undisc. · 128K ctx · Apr 4, 2024

Cohere · Proprietary

Higher-capability RAG and tool-use model in Cohere's Command R family.

Grok-1.5

— · Undisc. · 128K ctx · Mar 28, 2024

Grok update with stronger reasoning and a 128K context window.

Jamba

Hybrid · 52B · 256K ctx · Mar 28, 2024

AI21 Labs · Open weights

First Jamba hybrid Transformer-Mamba MoE model with open weights and a 256K context length.

DBRX Instruct

Databricks / MosaicML · Open weights

Databricks' 132B-total / 36B-active open MoE model for code, math, RAG, and enterprise self-hosted workloads.

MoE · 132B · 32K ctx · Mar 27, 2024

Step-1

— · Undisc. · — ctx · Mar 23, 2024

StepFun · Proprietary

StepFun's first public foundation model generation, introduced as a trillion-parameter Chinese model line.

Kimi 1M

— · Undisc. · — ctx · Mar 18, 2024

Moonshot AI · Proprietary

Long-context Kimi upgrade advertised with support for million-character document and conversation contexts.

Command R

— · Undisc. · 128K ctx · Mar 11, 2024

Cohere · Proprietary

Enterprise RAG-focused model with tool use, citations, multilingual retrieval, and long-context support.

Claude 3 Opus

— · Undisc. · 200K ctx · Mar 4, 2024

Highest-capability Claude 3 model, launched alongside Sonnet and Haiku as part of Anthropic's first major vision-capable Claude family.

StarCoder2 15B

Dense · 16B · 16K ctx · Feb 28, 2024

BigCode · Open weights

Next-generation BigCode code model trained on 4T+ tokens and 600+ programming languages, with 16K context.

Mistral Large

— · Undisc. · 32K ctx · Feb 26, 2024

Mistral AI · Proprietary

Mistral's first proprietary flagship API model, introduced alongside Le Chat, with stronger multilingual and coding performance.

Gemma 7B

Dense · 7B · 8K ctx · Feb 21, 2024

First Gemma open-weight text model family, derived from the same research lineage as Gemini.

Gemini 1.5 Pro

MoE · Undisc. · 2M ctx · Feb 15, 2024

Gemini generation that introduced production-scale long context, eventually expanding to a two-million-token window.

Qwen1.5-110B

Dense · 110B · 32K ctx · Feb 5, 2024

Largest Qwen1.5 model, released as the bridge from the original Qwen line to Qwen2.

OLMo 7B

Ai2's first fully open language model release, including weights, training data, code, logs, and intermediate checkpoints.

Dense · 7B · 4K ctx · Feb 1, 2024

Stable LM 2 1.6B

Dense · 1.6B · — ctx · Jan 19, 2024

Stability AI · Open weights

Small multilingual Stable LM release built for low hardware barriers and local experimentation.

GLM-4

Z.ai (Zhipu AI) · Proprietary

Zhipu's GLM-4 flagship generation, launched as the successor to ChatGLM3 with stronger tool use and multimodal variants.

— · Undisc. · 128K ctx · Jan 16, 2024

DeepSeekMoE 16B

Early DeepSeek sparse MoE research model that foreshadowed the later V2/V3 architecture direction.

MoE · 16B · 4K ctx · Jan 11, 2024

Nous Hermes 2 Mixtral

MoE · 47B · 32K ctx · Jan 11, 2024

Nous Research · Open source

Nous instruction-tuned Mixtral model with strong open-chat and tool-use adoption.

OpenChat 3.5

OpenChat · Open source

Compact Mistral-based local chat model trained with C-RLFT, popular in early 2024 local leaderboards.

Dense · 7B · — ctx · Jan 6, 2024

TinyLlama 1.1B Chat

Dense · 1.1B · — ctx · Jan 1, 2024

TinyLlama · Open source

Compact Llama-style 1.1B chat model trained for local experimentation and low-memory deployments.

Phi-2

Dense · 2.7B · — ctx · Dec 12, 2023

2.7B-parameter Phi model showing strong reasoning and language understanding at small scale.

OpenHathi-7B

Sarvam AI · Open weights

Sarvam AI's first open Indic language model, adapted from Llama 2 for Hindi and Indian-language work.

Dense · 7B · — ctx · Dec 12, 2023

Mixtral 8x7B

MoE · 47B · 32K ctx · Dec 11, 2023

The open sparse Mixture-of-Experts that brought MoE efficiency to the open ecosystem.

Gemini 1.0 Ultra

— · Undisc. · 32K ctx · Dec 6, 2023

Google's first natively multimodal Gemini flagship, since superseded by the 1.5/2/3 lines.

Qwen-72B

Dense · 72B · 32K ctx · Nov 30, 2023

Alibaba's first major open Qwen model and the start of a prolific open-weight line.

DeepSeek LLM 67B

Dense · 67B · 4K ctx · Nov 29, 2023

First general DeepSeek language model family, with 7B and 67B base/chat checkpoints.

Claude 2.1

— · Undisc. · 200K ctx · Nov 21, 2023

Claude update with a 200K context window, lower hallucination rates, and improved tool-use beta support.

Yi-34B

Dense · 34B · 200K ctx · Nov 6, 2023

01.AI · Open weights

01.AI's strong bilingual open model, with a 200k-context variant.

GPT-4 Turbo

— · Undisc. · 128K ctx · Nov 6, 2023

Lower-cost GPT-4 generation with a 128K context window, introduced at OpenAI DevDay.

Grok-1

xAI · Open source

xAI's first Grok model, later released as open weights with a 314B-parameter MoE checkpoint.

MoE · 314B · — ctx · Nov 4, 2023

DeepSeek Coder 33B

Dense · 33B · 16K ctx · Nov 2, 2023

DeepSeek's first public code-model family, released before the general DeepSeek LLM line.

ERNIE 4.0

— · Undisc. · — ctx · Oct 17, 2023

Baidu's fourth-generation ERNIE flagship, announced with stronger understanding, generation, reasoning, and memory.

Kimi Chat

Moonshot AI · Proprietary

Moonshot's first Kimi assistant release, establishing the long-context product line before the open Kimi model cards.

— · Undisc. · — ctx · Oct 9, 2023

LLaVA 1.5 13B

Hybrid · 13B · — ctx · Sep 30, 2023

LLaVA · Open weights

Open vision-language assistant and one of the most widely run early local multimodal models.

Amazon Titan Text Express

— · Undisc. · — ctx · Sep 28, 2023

Amazon's first-party Titan text generation model exposed through Bedrock, initially alongside embeddings and image models.

Mistral 7B

Dense · 7B · 8K ctx · Sep 27, 2023

The 7B that punched far above its weight and put Mistral on the map.

Qwen-14B

Dense · 14B · 8K ctx · Sep 25, 2023

Second open Qwen size, expanding the first-generation Qwen language-model lineup.

Granite 13B

IBM · Open weights

IBM's early Granite foundation model family for enterprise language and code tasks.

Dense · 13B · — ctx · Sep 7, 2023

Hunyuan

Tencent Hunyuan · Proprietary

Tencent's first Hunyuan foundation model release, introduced as a general-purpose Chinese enterprise model.

— · Undisc. · — ctx · Sep 7, 2023

Falcon 180B

Dense · 180B · 2K ctx · Sep 6, 2023

At launch the largest openly available model, from the UAE's TII.

Code Llama 34B

Dense · 34B · 16K ctx · Aug 24, 2023

Meta's first code-specialized Llama model family, released in base, Python, and instruction-tuned variants.

Qwen-7B

Dense · 7B · 32K ctx · Aug 3, 2023

Alibaba's first open Qwen checkpoint and the start of the Qwen open-model line.

Nous-Hermes-Llama2-13B

Nous Research · Open weights

Early Nous Hermes instruction model on Llama 2, widely used in the open-model fine-tuning ecosystem.

Dense · 13B · 4K ctx · Jul 24, 2023

EXAONE 2.0

LG AI Research · Proprietary

Second EXAONE generation, improving bilingual Korean-English performance and enterprise deployment options.

— · Undisc. · — ctx · Jul 19, 2023

Llama 2 70B

Dense · 70B · 4K ctx · Jul 18, 2023

The release whose commercial-use license made capable open-weight models usable in production.

Claude 2

— · Undisc. · 100K ctx · Jul 11, 2023

Anthropic's first widely available Claude, notable for an early 100K-token context window.

ChatGLM2-6B

Z.ai (Zhipu AI) · Open weights

Second open ChatGLM generation, improving long context, inference efficiency, and bilingual chat quality.

Dense · 6B · 32K ctx · Jun 25, 2023

Phi-1

Dense · 1.3B · — ctx · Jun 21, 2023

Microsoft's first Phi small-language-model release, demonstrating strong code performance from textbook-quality filtered and synthetic data.

Falcon 40B

Dense · 40B · 2K ctx · May 25, 2023

TII's breakout open Falcon model, released before Falcon 180B and trained on the RefinedWeb corpus.

PaLM 2

Dense · Undisc. · — ctx · May 10, 2023

Google's improved multilingual, reasoning, and coding foundation model family introduced at I/O 2023.

MPT-7B

Databricks / MosaicML · Open source

MosaicML's permissively licensed 7B model, an early favorite for commercial local fine-tuning and long-context variants.

Dense · 7B · 2K ctx · May 5, 2023

Vicuna 13B

LMSYS / SkyLab · Open weights

LMSYS instruction-tuned LLaMA model that became a landmark early local ChatGPT-style assistant.

Dense · 13B · — ctx · Mar 30, 2023

ERNIE Bot

— · Undisc. · — ctx · Mar 16, 2023

Baidu's public chat assistant launch, built on the ERNIE foundation-model line.

GPT-4

— · Undisc. · 8K ctx · Mar 14, 2023

The model that brought reliable multi-step reasoning to the mainstream; size never disclosed.

ChatGLM-6B

Z.ai (Zhipu AI) · Open weights

Zhipu AI and Tsinghua KEG's first widely used open bilingual ChatGLM checkpoint.

Dense · 6B · 2K ctx · Mar 14, 2023

Claude 1

— · Undisc. · — ctx · Mar 14, 2023

Anthropic's first broadly announced Claude assistant model, launched through an API and select product partners.

Jurassic-2 Ultra

AI21 Labs · Proprietary

Second-generation Jurassic model with better multilingual support, lower latency, and instruction following.

— · Undisc. · — ctx · Mar 9, 2023

GPT-3.5 Turbo

— · Undisc. · 4K ctx · Mar 1, 2023

OpenAI's first ChatGPT API model, bringing the GPT-3.5 chat-tuned line to developers at much lower cost than text-davinci-003.

LLaMA

Dense · 65B · 2K ctx · Feb 24, 2023

Meta's first LLaMA, released to researchers; its leak catalyzed the open-weight movement.

Galactica

Withdrawn

Dense · 120B · 2K ctx · Nov 15, 2022

A science-focused model whose public demo was withdrawn after just three days over confidently wrong outputs — an early, instructive retraction.

BLOOM

Dense · 176B · 2K ctx · Jul 12, 2022

BigScience · Open weights

An open, multilingual 176B model (46 languages) from a global research collaboration.

PaLM

Dense · 540B · — ctx · Apr 4, 2022

Google's 540B Pathways model; the API was later deprecated in favor of Gemini.

EXAONE 1.0

LG AI Research · Proprietary

LG AI Research's first EXAONE foundation model generation, introduced as a large multimodal expert AI.

— · Undisc. · — ctx · Dec 14, 2021

ERNIE 3.0 Titan

Dense · 260B · — ctx · Dec 8, 2021

Baidu's 260B-parameter ERNIE 3.0 Titan model, an early Chinese frontier-scale language model.

Jurassic-1 Jumbo

Dense · 178B · — ctx · Aug 11, 2021

AI21 Labs · Proprietary

AI21's first major API language model, launched through AI21 Studio.

GPT-3

Dense · 175B · 2K ctx · Jun 11, 2020

The 175B model that proved in-context learning at scale; its base API models were retired in 2024.

GPT-2

Dense · 1.5B · 1K ctx · Nov 5, 2019

OpenAI · Open source

Initially withheld over misuse fears, then fully released in Nov 2019 at the end of a staged release that sparked an early debate over limited model releases.

BERT