Last updated Jun 18, 2026
LLM model catalog
A searchable catalog of large language model releases, with source links, lifecycle dates, access details, model size, context window, and modality filters.
229 models
Kimi K2.7 Code
AvailableMoonshot's open coding-focused agentic model built on K2.6, with native vision/video input, forced thinking mode, and stronger long-horizon software-engineering performance.
GLM-5.2
AvailableZ.ai's latest open flagship for long-horizon coding, agentic engineering, and million-token workflows, adding IndexShare sparse-attention reuse over GLM-5.1.
MiniMax-M3
AvailableNative multimodal MiniMax model with a one-million-token context, sparse attention, and agentic coding/cowork positioning.
GPT-5.6
PreviewOpenAI's mid-2026 flagship, headlined by an industry-leading 1.5M-token context window and long-horizon agentic tool use.
Claude Fable 5
WithdrawnThe public, guardrailed sibling of Mythos and Anthropic's most capable widely released model, built for long-horizon agentic work. Launched June 9, 2026 across the Claude API, AWS, and Microsoft Foundry, then pulled three days later under a US government export-control directive barring access by foreign nationals.
Nemotron 3 Ultra 550B-A55B
AvailableNVIDIA's largest Nemotron 3 open-weight hybrid Mamba-Transformer MoE, tuned for agentic reasoning, coding, planning, and tool calling.
Claude Opus 4.8
AvailableAnthropic's most capable model, with strengthened agentic and long-running task performance.
MiniMax-M2.7
AvailableOpen-weight agentic model from MiniMax focused on real-world software engineering, office tasks, tool use, and self-improving training workflows.
Gemini 3.5 Pro
PreviewAnnounced at Google I/O 2026; emphasizes deep multimodal reasoning over a 2M-token context.
Qwen3.6-27B
AvailableDense 27B that punches far above its weight on agentic coding — easy to self-host on a single GPU node.
Grok 4.3
AvailablexAI's agentic flagship with a 1M-token context and aggressive API pricing.
DeepSeek V4-Flash
PreviewEfficient V4 companion model with 284B total / 13B active parameters and the same one-million-token context window.
DeepSeek V4-Pro
PreviewPreview-series sparse MoE flagship with a one-million-token context window and 1.6T total / 49B active parameters.
Hunyuan-A13B-Instruct
AvailableTencent Hunyuan open-weight fine-grained MoE model with 80B total parameters and 13B active parameters, optimized for agentic tool use.
GLM-5.1
AvailableZ.ai agentic-engineering follow-up to GLM-5, with stronger coding performance and better long-horizon tool-use behavior.
Claude Mythos
PreviewA frontier model Anthropic disclosed on April 7, 2026 but declined to release publicly, citing security risk. Shipped only via 'Project Glasswing' to ~50 defensive-security partners, then suspended on June 12, 2026 under a US government directive.
Gemma 4 31B
AvailableGoogle DeepMind's Gemma 4 advanced-reasoning open model for personal computers, part of the April 2026 Gemma 4 family.
Kimi K2.6
AvailableMoonshot's open native multimodal agentic model for long-horizon coding, visual interface generation, and autonomous tool orchestration.
Mistral Medium 3.5
AvailableDense 128B open-weight model with a 256k context and strong coding performance for its size.
Nemotron 3 Super 120B-A12B
AvailableOpen-weight hybrid Mamba-Transformer MoE designed for collaborative agents and high-volume enterprise workflows.
Step-3.5-Flash
AvailableStepFun's Apache-licensed sparse MoE model for fast agentic execution, coding, math, browsing, and tool-use workflows.
Sarvam-105B
AvailableApache-licensed Indian-context MoE from Sarvam AI, optimized for reasoning, coding, agentic tasks, and 22 Indian languages.
GPT-5.4
AvailableWorkhorse GPT-5 release with a dedicated Thinking mode; widely deployed across ChatGPT and the API.
Qwen3.5-397B
AvailableNative vision-language MoE supporting 201 languages with a 1M-token context.
Gemini 3.1 Pro
AvailableGenerally available multimodal flagship with native tool use and a 2M-token context.
GLM-5
AvailableZ.ai flagship for complex systems engineering and long-horizon agentic tasks, scaling the GLM line to 744B total / 40B active parameters.
Claude Opus 4.6
AvailableIntroduced genuinely autonomous multi-file coding and stronger computer use.
Qwen3-Coder-Next
AvailableApache-licensed Qwen3-Next coding-agent model with 80B total / 3B active parameters, 256K context, and long-horizon tool-use training.
Kimi K2.5
AvailableOpen multimodal Kimi model that adds native visual agentic intelligence, instant and thinking modes, and agent-swarm workflows on top of the K2 base.
GLM-4.7
AvailableCoding-focused GLM release with improved multilingual agentic coding, terminal tasks, tool use, and interface generation.
OLMo 3 Think 32B
AvailableAi2's fully open thinking model with public weights, code, data, checkpoints, and training details across the OLMo 3 pipeline.
Nemotron 3 Nano 30B-A3B
AvailableEfficient Nemotron 3 MoE checkpoint for agentic reasoning and coding, activating about 3B parameters while supporting 1M-token contexts.
GLM-4.6V
AvailableOpen 106B-class vision-language model with native multimodal function calling for visual agents.
Mistral Large 3
AvailableMistral's largest open-weight MoE, aimed at frontier reasoning while remaining self-hostable.
DeepSeek-V3.2
AvailableReasoning-first agent model that adds DeepSeek Sparse Attention and thinking directly inside tool-use workflows.
DeepSeek-V3.2-Speciale
AvailableHigh-compute reasoning variant of V3.2, positioned for olympiad-level math, programming, and other deep reasoning tasks.
LFM2 1.2B
AvailableLiquid AI hybrid model for efficient CPU/GPU/NPU local deployment, using short convolutions plus attention blocks.
Kimi K2 Thinking
AvailableOpen K2 reasoning-agent variant that interleaves step-by-step thinking with tool calls and supports stable 200-300 step tool-use trajectories.
Kimi-Linear-48B-A3B-Instruct
AvailableMIT-licensed hybrid linear-attention model using Kimi Delta Attention, built for million-token contexts with much lower KV-cache usage.
GLM-4.6
AvailableAgentic reasoning and coding upgrade over GLM-4.5, expanding the text context window from 128K to 200K tokens.
DeepSeek-V3.2-Exp
PreviewExperimental checkpoint that introduced DeepSeek Sparse Attention as an efficiency bridge between V3.1-Terminus and V3.2.
DeepSeek-V3.1-Terminus
AvailableStability update to V3.1 focused on language consistency, code-agent reliability, and search-agent behavior.
Kimi K2 Instruct 0905
AvailableSeptember 2025 K2 update with stronger agentic coding, better frontend generation, and a doubled 256K context window.
Gemma 3 27B
AvailableGoogle's open multimodal model: 128k context, 140+ languages, runs on a single GPU.
DeepSeek-V3.1
AvailableHybrid thinking/non-thinking release that upgraded tool calling, long-context training, and agent task performance.
Seed-OSS-36B-Instruct
AvailableByteDance Seed's Apache-licensed long-context reasoning and agent model, with controllable thinking budgets and a native 512K context.
GLM-4.5V
AvailableVision-language GLM based on GLM-4.5-Air, covering image, video, document, grounding, and GUI-agent tasks.
gpt-oss-20b
AvailableSmaller gpt-oss reasoning model optimized for local inference on systems with about 16GB of memory.
gpt-oss-120b
AvailableOpenAI's larger open-weight reasoning model, a 117B-total / 5.1B-active MoE with 128K context for local and self-hosted deployment.
Falcon-H1 34B
AvailableA hybrid attention + state-space-model (SSM) design that matches 70B-class models with fewer parameters.
GLM-4.5
AvailableOpen agentic, reasoning, and coding foundation model that marked Z.ai's international rebrand and its MIT-licensed GLM push.
GLM-4.5-Air
AvailableCompact GLM-4.5 companion with 106B total / 12B active parameters for efficient agentic reasoning and coding.
EXAONE 4.0 32B
AvailableLG AI Research's unified model with non-reasoning and reasoning modes, agentic tool use, and English, Korean, and Spanish support.
Kimi K2 Instruct
AvailableOriginal open K2 post-trained model: a 1T-parameter MoE optimized for coding, reasoning, and tool-using agentic workflows.
Grok 4
DeprecatedxAI's fourth-generation Grok line, preceding the later 4.x API updates already tracked in the catalog.
SmolLM3 3B
AvailableHugging Face's fully open 3B multilingual long-context model with optional reasoning mode and 128K context.
ERNIE-4.5-300B-A47B
AvailableBaidu's open ERNIE 4.5 language MoE, part of a 10-variant Apache-licensed model family built with heterogeneous multimodal MoE training.
ERNIE-4.5-VL-424B-A47B
AvailableBaidu's largest ERNIE 4.5 vision-language MoE, supporting text, image, and video inputs with thinking and non-thinking modes.
Kimi-VL-A3B-Thinking-2506
AvailableUpdated MIT-licensed Kimi-VL reasoning model with better multimodal reasoning, video understanding, high-resolution perception, and lower thinking-token use.
Kimi-Dev-72B
AvailableMIT-licensed coding LLM trained with repository-level reinforcement learning for software issue resolution.
MiniMax-M1-80k
AvailableOpen Apache-licensed hybrid-attention reasoning model with 456B total / 45.9B active parameters and a native 1M-token context.
Magistral Medium
AvailableMistral's first dedicated reasoning model family, released in Small open-weight and Medium enterprise/API tiers.
Magistral Small
AvailableOpen-weight 24B reasoning model from Mistral's Magistral family, popular for local reasoning experiments.
DeepSeek-R1-0528
AvailableMajor R1 reasoning update with stronger math, programming, general logic, function calling, and reduced hallucinations.
Claude Opus 4
DeprecatedFirst Claude 4 Opus model, positioned for long-running agentic and coding work before the 4.x point releases.
Seed Thinking v1.5
AvailableByteDance Seed reasoning model focused on long-horizon thinking and problem solving.
Sarvam-M
AvailableSarvam's medium-scale open model for multilingual Indian-language chat, reasoning, and translation tasks.
Phi-4 Reasoning
AvailablePhi-4 reasoning-specialized model family for math, science, and chain-of-thought style tasks.
Granite 3.3 8B
AvailableGranite 3.3 text update for enterprise chat, RAG, and instruction-following workflows.
Qwen3-235B-A22B
AvailableLargest open Qwen3 MoE, introducing hybrid thinking/non-thinking modes and 119-language coverage.
Kimi-Audio-7B-Instruct
AvailableOpen audio foundation model for audio understanding, generation, speech recognition, audio QA, captioning, and speech conversation.
Kimi-VL-A3B-Instruct
AvailableEfficient MIT-licensed vision-language MoE for OCR, image/video understanding, long documents, and OS-style agent tasks.
OpenAI o3
AvailableReasoning model released alongside o4-mini with tool use, image reasoning, and stronger agentic problem solving.
GPT-4.1
DeprecatedAPI model family focused on coding, instruction following, and one-million-token long-context work.
Llama 4 Maverick
AvailableMeta's flagship open-weight MoE; highest MMLU among open models at release.
Llama 4 Scout
AvailableEfficient open-weight MoE designed for very long context on modest hardware.
Llama-3.3-Nemotron-Super-49B
AvailableOpen Llama Nemotron reasoning model from NVIDIA's 2025 Nemotron family.
Qwen2.5-Omni-7B
AvailableLocal omni-modal Qwen model that supports text, image, audio, video, and speech generation in a 7B package.
DeepSeek-V3-0324
AvailablePost-R1 V3 update with improved reasoning, front-end coding, Chinese writing, search, and function calling.
Gemini 2.5 Pro
DeprecatedReasoning-focused Gemini 2.5 model that made thinking a core part of Google's flagship model line.
Mistral Small 3.1
AvailableApache-licensed Small update adding vision and a 128K context window to the efficient 24B line.
ERNIE X1
AvailableBaidu's reasoning model released alongside ERNIE 4.5 before the open ERNIE 4.5 weights.
OLMo 2 32B
AvailableA fully open model, with weights, data, and training code all public, and the first such model to beat both GPT-3.5 and GPT-4o mini.
Command A
AvailableEnterprise-grade model tuned for RAG, tool use, and multilingual business workloads.
Granite 3.2 8B
AvailableGranite 3.2 update with reasoning controls and multimodal/document-oriented Granite variants.
Claude 3.7 Sonnet
RetiredAnthropic's first hybrid-reasoning Sonnet. Shut down May 11, 2026 as the 4.x line matured.
Moonlight-16B-A3B-Instruct
AvailableMIT-licensed 16B/3B-active MoE trained with Moonshot's scalable Muon optimizer experiments.
DeepHermes 3 Llama 3 8B
AvailableNous reasoning-oriented Hermes model trained to combine concise answers with optional deep reasoning traces.
Grok 3
DeprecatedxAI's third-generation model family, introduced with stronger reasoning, search, and coding modes.
Dolphin 3.0 Llama 3.1 8B
AvailablePopular local assistant model tuned for coding, math, function calling, and agentic workflows.
Mistral Small 3
AvailableA latency-optimized 24B dense model under Apache-2.0 — a popular local-deployment workhorse.
Qwen2.5-Max
AvailableProprietary MoE flagship for the Qwen2.5 generation, released through Qwen Chat and Alibaba Cloud APIs.
Qwen2.5-VL-72B
AvailableVision-language Qwen2.5 model for image, document, video, and agentic visual grounding tasks.
Doubao-1.5-pro
AvailableDoubao 1.5 Pro update positioned for stronger multimodal, reasoning, and agentic work in Volcano Engine.
DeepSeek-R1
AvailableBreakout open reasoning model trained with large-scale reinforcement learning and released with weights under MIT.
Kimi k1.5
AvailableMoonshot's multimodal reinforcement-learning reasoning model, reported as matching OpenAI o1 on math, coding, and multimodal reasoning.
MiniMax-01
AvailableOpen MiniMax generation with MiniMax-Text-01 and MiniMax-VL-01 long-context models.
DeepSeek-V3
AvailableThe 671B/37B-active MoE release that made DeepSeek a central open-model lab before the R1 breakthrough.
Step-2
AvailableSecond-generation StepFun foundation model line with larger-scale multimodal and reasoning ambitions.
Granite 3.1 8B
AvailableIBM's enterprise-focused open model with a 128k context, Apache-2.0 licensed.
Falcon 3 10B
AvailableOpen model from the UAE's TII, designed to run on light infrastructure, including laptops.
Command R7B
AvailableCohere's smallest, fastest R-series model, tuned for RAG and tool use on modest hardware.
Phi-4
AvailableA 14B dense model that rivals far larger ones on math and reasoning, under a permissive MIT license.
Gemini 2.0 Flash
DeprecatedFirst Gemini 2.0 release, built for native multimodal input/output, tool use, and agentic product integrations.
EXAONE 3.5 32B
AvailableEXAONE 3.5 32B open-weight model for bilingual reasoning, coding, and long-context tasks.
Llama 3.3 70B
AvailableLate-2024 70B Llama update delivering much of the 405B instruction-following quality at lower serving cost.
OpenAI o1
DeprecatedGeneral release of OpenAI's o1 reasoning model with stronger deliberative reasoning and multimodal ChatGPT integration.
Amazon Nova Pro
AvailableAWS-native multimodal model with a 300k context; size and architecture undisclosed.
Amazon Nova Lite
AvailableLower-cost multimodal Nova understanding model for text, image, and video inputs.
QwQ-32B-Preview
AvailableQwen's first public reasoning-preview model, aimed at math, coding, and deliberate problem solving.
Tulu 3 405B
AvailableAi2's post-trained open instruction model line, scaling the Tulu recipe to Llama 3.1 405B.
DeepSeek-R1-Lite-Preview
RetiredReasoning-preview model exposed in DeepSeek Chat ahead of the open DeepSeek-R1 release.
Qwen2.5-Coder-32B
AvailableCode-specialized Qwen2.5 model family, with the 32B checkpoint as the flagship open coding model.
Hunyuan-Large
AvailableTencent's 389B total / 52B active open-weight Transformer MoE, released with a 256K pretraining context and 128K instruct context.
SmolLM2 1.7B
AvailableCompact on-device model family trained on 11T tokens, popular for lightweight local chat and experimentation.
Claude 3.5 Haiku
DeprecatedFast, lower-cost Claude 3.5 model for latency-sensitive coding, tool-use, and customer-facing workloads.
Sarvam-1
AvailableSarvam's 2B open model trained for ten major Indian languages.
Granite 3.0 8B
AvailableApache-licensed Granite 3.0 text model, part of IBM's push toward enterprise-friendly open models.
Yi-Lightning
Available01.AI's MoE API model that reached the global top-10 on Chatbot Arena, strong in Chinese, math, and coding.
Ministral 8B
AvailableSmall Mistral model line optimized for edge and low-latency workloads.
Llama-3.1-Nemotron-70B
AvailableNVIDIA-tuned Llama 3.1 70B instruction model optimized with Nemotron reward and alignment recipes.
Llama 3.2 90B Vision
AvailableFirst Llama family release with native vision models, alongside smaller edge-oriented 1B and 3B text models.
Molmo 72B
AvailableOpen multimodal model family trained for strong image understanding, pointing, and visual grounding.
Qwen2.5-72B
AvailableBroad Qwen2.5 foundation-model update spanning general, coding, math, and multimodal descendants.
Pixtral 12B
AvailableMistral's first open multimodal model, adding image understanding to a Mistral text backbone.
OpenAI o1-preview
RetiredOpenAI's first public reasoning-model preview, optimized to spend more inference time on hard math, coding, and science tasks.
Yi-Coder-9B
Available01.AI's compact code model trained for repository-scale programming and code completion tasks.
DeepSeek-V2.5
AvailableUnified DeepSeek V2 generation combining general-chat and coding strengths before the V3 series.
Hunyuan Turbo
AvailableTencent's faster, lower-cost Hunyuan update before the open Hunyuan-Large model card.
OLMoE 1B-7B
AvailableFully open sparse MoE model with 7B total and about 1B active parameters.
Jamba 1.5 Large
AvailableHybrid Mamba-Transformer MoE from Israel's AI21, with a 256k context and strong long-document throughput.
Phi-3.5 MoE
AvailablePhi-3.5 mixture-of-experts model, scaling Microsoft's small-model line while preserving efficient active parameters.
Hermes 3 Llama 3.1 405B
AvailableLarge Hermes 3 instruction-tuned model built on Meta's Llama 3.1 405B.
Grok-2
RetiredSecond-generation Grok release with Grok-2 and Grok-2 mini for chat, coding, reasoning, and image-enabled product experiences.
EXAONE 3.0 7.8B
AvailableLG's first open-weight EXAONE model, a compact bilingual instruction model for Korean and English.
MiniCPM-V 2.6
Available8B vision-language model for local image, multi-image, OCR, and video understanding, with llama.cpp and Ollama support.
Llama 3.1 405B
AvailableMeta's first frontier-scale open Llama model, with 405B parameters, 128K context, multilingual support, and tool-use improvements.
Mistral NeMo
AvailableApache-licensed 12B model co-developed with NVIDIA, including a 128K context window and strong multilingual tokenization.
Gemma 2 27B
AvailableSecond-generation Gemma model, improving open-weight quality and efficiency at 9B and 27B sizes.
Claude 3.5 Sonnet
RetiredMajor Sonnet upgrade that became Anthropic's default high-intelligence workhorse for coding, writing, and visual reasoning.
DeepSeek-Coder-V2
AvailableOpen code-focused MoE built from DeepSeek-V2, expanding programming-language coverage and coding benchmark performance.
Nemotron-4 340B
AvailableNVIDIA's large open model family for synthetic data generation and reward modeling.
Qwen2-72B
AvailableQwen2's largest dense model, introducing stronger multilingual support, coding/math gains, and long-context variants.
GLM-4-9B
AvailableOpen GLM-4 9B model family, covering chat, long-context, and code-oriented variants.
Codestral 22B
AvailableMistral's first code-specialized model, trained for code generation, fill-in-the-middle, and multi-language programming tasks.
Aya 23 35B
AvailableOpen multilingual research model covering 23 languages, released by Cohere For AI.
Doubao-pro
AvailableByteDance's commercial Doubao foundation model line for text, code, and assistant workloads.
GPT-4o
RetiredThe 2024 omni-modal model that defined a generation of assistants. Deprecated in Feb 2026 and fully retired across ChatGPT on April 3, 2026.
Yi-1.5-34B
AvailableYi 1.5 update with stronger instruction following, coding, math, and multilingual performance.
Falcon 2 11B
AvailableFalcon 2 generation, including text and vision-language 11B models under a permissive TII license.
DeepSeek-V2
AvailableDeepSeek's first major MoE general model with Multi-head Latent Attention and low-cost API positioning.
Granite Code 34B
AvailableApache-2.0 code model from IBM's Granite Code family, used for local code generation and enterprise coding assistants.
Amazon Titan Text Premier
AvailableLarger Titan text model for enterprise RAG, summarization, and agent workflows in Amazon Bedrock.
Snowflake Arctic
AvailableApache-2.0 enterprise LLM with 480B total / 17B active parameters, optimized for SQL, code, and instruction following.
Phi-3 Mini
Available3.8B-parameter Phi-3 model released as a phone-capable small model with 4K and 128K variants.
Llama 3 70B
AvailableFirst Llama 3 release, with 8B and 70B open models and a stronger tokenizer, data mix, and post-training stack.
Mixtral 8x22B
AvailableLarger open Mixtral sparse MoE with 141B total and 39B active parameters, released under Apache-2.0.
abab6.5
AvailableMiniMax's commercial long-context abab model generation before the open MiniMax-01 and M series.
WizardLM-2 8x22B
AvailableMicrosoft's WizardLM-2 MoE chat model, widely mirrored and run locally after its model-card release.
Step-1V
AvailableStepFun's first major vision-language model, released after the Step-1 language model.
CodeGemma 7B
AvailableOpen code-specialized Gemma model for local code completion, generation, and instruction-following.
Command R+
DeprecatedHigher-capability RAG and tool-use model in Cohere's Command R family.
Grok-1.5
RetiredGrok update with stronger reasoning and a 128K context window.
Jamba
AvailableFirst Jamba hybrid Transformer-Mamba MoE model with open weights and a 256K context length.
DBRX Instruct
AvailableDatabricks' 132B-total / 36B-active open MoE model for code, math, RAG, and enterprise self-hosted workloads.
Step-1
AvailableStepFun's first public foundation model generation, introduced as a trillion-parameter Chinese model line.
Kimi 1M
AvailableLong-context Kimi upgrade advertised with support for million-character document and conversation contexts.
Command R
DeprecatedEnterprise RAG-focused model with tool use, citations, multilingual retrieval, and long-context support.
Claude 3 Opus
DeprecatedHighest-capability Claude 3 model, launched with Sonnet and Haiku as part of Anthropic's first major vision-capable Claude family.
StarCoder2 15B
AvailableNext-generation BigCode code model trained on 4T+ tokens and 600+ programming languages, with 16K context.
Mistral Large
DeprecatedMistral's first proprietary flagship API model, introduced alongside Le Chat with stronger multilingual and coding performance.
Gemma 7B
AvailableFirst Gemma open-weight text model family, derived from the same research lineage as Gemini.
Gemini 1.5 Pro
DeprecatedGemini generation that introduced production-scale long context, eventually expanding to a two-million-token window.
Qwen1.5-110B
AvailableLargest Qwen1.5 model, released as the bridge from the original Qwen line to Qwen2.
OLMo 7B
AvailableAi2's first fully open language model release, including weights, training data, code, logs, and intermediate checkpoints.
Stable LM 2 1.6B
AvailableSmall multilingual Stable LM release built for low hardware barriers and local experimentation.
GLM-4
AvailableZhipu's GLM-4 flagship generation, launched as the successor to ChatGLM3 with stronger tool use and multimodal variants.
DeepSeekMoE 16B
AvailableEarly DeepSeek sparse MoE research model that foreshadowed the later V2/V3 architecture direction.
Nous Hermes 2 Mixtral
AvailableNous instruction-tuned Mixtral model with strong open-chat and tool-use adoption.
OpenChat 3.5
AvailableCompact Mistral-based local chat model trained with C-RLFT, popular in early 2024 local leaderboards.
TinyLlama 1.1B Chat
AvailableCompact Llama-style 1.1B chat model trained for local experimentation and low-memory deployments.
Phi-2
Available2.7B-parameter Phi model showing strong reasoning and language understanding at small scale.
OpenHathi-7B
AvailableSarvam AI's first open Indic language model, adapted from Llama 2 for Hindi and Indian-language work.
Mixtral 8x7B
AvailableThe open sparse Mixture-of-Experts that brought MoE efficiency to the open ecosystem.
Gemini 1.0 Ultra
DeprecatedGoogle's first natively multimodal Gemini flagship, since superseded by the 1.5/2/3 lines.
Qwen-72B
AvailableAlibaba's first major open Qwen model and the start of a prolific open-weight line.
DeepSeek LLM 67B
AvailableFirst general DeepSeek language model family, with 7B and 67B base/chat checkpoints.
Claude 2.1
RetiredClaude update with a 200K context window, lower hallucination rates, and improved tool-use beta support.
Yi-34B
Available01.AI's strong bilingual open model, with a 200k-context variant.
GPT-4 Turbo
DeprecatedLower-cost GPT-4 generation with a 128K context window, introduced at OpenAI DevDay.
Grok-1
AvailablexAI's first Grok model, later released as open weights with a 314B-parameter MoE checkpoint.
DeepSeek Coder 33B
AvailableDeepSeek's first public code-model family, released before the general DeepSeek LLM line.
ERNIE 4.0
AvailableBaidu's fourth-generation ERNIE flagship, announced with stronger understanding, generation, reasoning, and memory.
Kimi Chat
AvailableMoonshot's first Kimi assistant release, establishing the long-context product line before the open Kimi model cards.
LLaVA 1.5 13B
AvailableOpen vision-language assistant and one of the most widely run early local multimodal models.
Amazon Titan Text Express
AvailableAmazon's first-party Titan text generation model exposed through Bedrock, initially alongside embeddings and image models.
Mistral 7B
AvailableThe 7B that punched far above its weight and put Mistral on the map.
Qwen-14B
AvailableSecond open Qwen size, expanding the first-generation Qwen language-model lineup.
Granite 13B
AvailableIBM's early Granite foundation model family for enterprise language and code tasks.
Hunyuan
AvailableTencent's first Hunyuan foundation model release, introduced as a general-purpose Chinese enterprise model.
Falcon 180B
AvailableAt launch the largest openly available model, from the UAE's TII.
Code Llama 34B
AvailableMeta's first code-specialized Llama model family, released in base, Python, and instruction-tuned variants.
Qwen-7B
AvailableAlibaba's first open Qwen checkpoint and the start of the Qwen open-model line.
Nous-Hermes-Llama2-13B
AvailableEarly Nous Hermes instruction model on Llama 2, widely used in the open-model fine-tuning ecosystem.
EXAONE 2.0
RetiredSecond EXAONE generation, improving bilingual Korean-English performance and enterprise deployment options.
Llama 2 70B
AvailableThe release that made capable open-weight models genuinely usable for production.
Claude 2
RetiredAnthropic's first widely available Claude, notable for an early 100k-token context window.
ChatGLM2-6B
AvailableSecond open ChatGLM generation, improving long context, inference efficiency, and bilingual chat quality.
Phi-1
AvailableMicrosoft's first Phi small-language-model release, demonstrating strong code performance from textbook-quality synthetic data.
Falcon 40B
AvailableTII's breakout open Falcon model, released before Falcon 180B and trained on the RefinedWeb corpus.
PaLM 2
RetiredGoogle's improved multilingual, reasoning, and coding foundation model family introduced at I/O 2023.
MPT-7B
AvailableMosaicML's permissively licensed 7B model, an early favorite for commercial local fine-tuning and long-context variants.
Vicuna 13B
AvailableLMSYS instruction-tuned LLaMA model that became a landmark early local ChatGPT-style assistant.
ERNIE Bot
AvailableBaidu's public chat assistant launch, built on the ERNIE foundation-model line.
GPT-4
DeprecatedThe model that brought reliable multi-step reasoning to the mainstream; size never disclosed.
ChatGLM-6B
AvailableZhipu AI and Tsinghua KEG's first widely used open bilingual ChatGLM checkpoint.
Claude 1
RetiredAnthropic's first broadly announced Claude assistant model, launched through an API and select product partners.
Jurassic-2 Ultra
DeprecatedSecond-generation Jurassic model with better multilingual support, lower latency, and instruction following.
GPT-3.5 Turbo
RetiredOpenAI's first ChatGPT API model, bringing the GPT-3.5 chat-tuned line to developers at much lower cost than text-davinci-003.
LLaMA
AvailableMeta's first LLaMA, released to researchers; its leak catalyzed the open-weight movement.
Galactica
WithdrawnA science-focused model whose public demo was withdrawn after just three days over confidently wrong outputs — an early, instructive retraction.
BLOOM
AvailableAn open, multilingual 176B model (46 languages) from a global research collaboration.
PaLM
RetiredGoogle's 540B Pathways model; the API was later deprecated in favor of Gemini.
EXAONE 1.0
RetiredLG AI Research's first EXAONE foundation model generation, introduced as a large multimodal expert AI.
ERNIE 3.0 Titan
RetiredBaidu's 260B-parameter ERNIE 3.0 Titan model, an early Chinese frontier-scale language model.
Jurassic-1 Jumbo
RetiredAI21's first major API language model, launched through AI21 Studio.
GPT-3
RetiredThe 175B model that proved in-context learning at scale; its base API models were retired in 2024.
GPT-2
AvailableInitially withheld over misuse fears, then fully released in Nov 2019 — an early 'limited release' debate.
BERT
AvailableThe bidirectional encoder that reshaped NLP and seeded the transformer era.