Thinking, math, and agents
Last updated Jun 18, 2026
Reasoning model releases
Models positioned for reasoning, long-horizon tool use, math, coding agents, and other workloads where deliberate problem-solving is the headline feature.
83 models
Kimi K2.7 Code
Available: Moonshot's open coding-focused agentic model built on K2.6, with native vision/video input, forced thinking mode, and stronger long-horizon software-engineering performance.
GLM-5.2
Available: Z.ai's latest open flagship for long-horizon coding, agentic engineering, and million-token workflows, adding IndexShare sparse-attention reuse over GLM-5.1.
MiniMax-M3
Available: Native multimodal MiniMax model with a one-million-token context, sparse attention, and agentic coding/cowork positioning.
GPT-5.6
Preview: OpenAI's mid-2026 flagship, headlined by an industry-leading 1.5M-token context window and long-horizon agentic tool use.
Claude Fable 5
Withdrawn: The public, guardrailed sibling of Mythos and Anthropic's most capable widely-released model, built for long-horizon agentic work. Launched June 9, 2026 across the Claude API, AWS, and Microsoft Foundry, then pulled three days later under a US government export-control directive barring access by foreign nationals.
Nemotron 3 Ultra 550B-A55B
Available: NVIDIA's largest Nemotron 3 open-weight hybrid Mamba-Transformer MoE, tuned for agentic reasoning, coding, planning, and tool calling.
Claude Opus 4.8
Available: Anthropic's most capable model, with strengthened agentic and long-running task performance.
MiniMax-M2.7
Available: Open-weight agentic model from MiniMax focused on real-world software engineering, office tasks, tool use, and self-improving training workflows.
Gemini 3.5 Pro
Preview: Announced at Google I/O 2026; emphasizes deep multimodal reasoning over a 2M-token context.
Qwen3.6-27B
Available: Dense 27B that punches far above its weight on agentic coding and is easy to self-host on a single GPU node.
Grok 4.3
Available: xAI's agentic flagship with a 1M-token context and aggressive API pricing.
Hunyuan-A13B-Instruct
Available: Tencent Hunyuan open-weight fine-grained MoE model with 80B total parameters and 13B active parameters, optimized for agentic tool use.
GLM-5.1
Available: Z.ai agentic-engineering follow-up to GLM-5, with stronger coding performance and better long-horizon tool-use behavior.
Gemma 4 31B
Available: Google DeepMind's Gemma 4 advanced-reasoning open model for personal computers, part of the April 2026 Gemma 4 family.
Kimi K2.6
Available: Moonshot's open native multimodal agentic model for long-horizon coding, visual interface generation, and autonomous tool orchestration.
Step-3.5-Flash
Available: StepFun's Apache-licensed sparse MoE model for fast agentic execution, coding, math, browsing, and tool-use workflows.
Sarvam-105B
Available: Apache-licensed Indian-context MoE from Sarvam AI, optimized for reasoning, coding, agentic tasks, and 22 Indian languages.
GPT-5.4
Available: Workhorse GPT-5 release with a dedicated Thinking mode; widely deployed across ChatGPT and the API.
GLM-5
Available: Z.ai flagship for complex systems engineering and long-horizon agentic tasks, scaling the GLM line to 744B total / 40B active parameters.
Kimi K2.5
Available: Open multimodal Kimi model that adds native visual agentic intelligence, instant and thinking modes, and agent-swarm workflows on top of the K2 base.
GLM-4.7
Available: Coding-focused GLM release with improved multilingual agentic coding, terminal tasks, tool use, and interface generation.
OLMo 3 Think 32B
Available: Ai2's fully open thinking model with public weights, code, data, checkpoints, and training details across the OLMo 3 pipeline.
Nemotron 3 Nano 30B-A3B
Available: Efficient Nemotron 3 MoE checkpoint for agentic reasoning and coding, activating about 3B parameters while supporting 1M-token contexts.
Mistral Large 3
Available: Mistral's largest open-weight MoE, aimed at frontier reasoning while remaining self-hostable.
DeepSeek-V3.2
Available: Reasoning-first agent model that adds DeepSeek Sparse Attention and thinking directly inside tool-use workflows.
DeepSeek-V3.2-Speciale
Available: High-compute reasoning variant of V3.2, positioned for olympiad-level math, programming, and other deep reasoning tasks.
Kimi K2 Thinking
Available: Open K2 reasoning-agent variant that interleaves step-by-step thinking with tool calls and supports stable tool-use trajectories of 200 to 300 steps.
GLM-4.6
Available: Agentic reasoning and coding upgrade over GLM-4.5, expanding the text context window from 128K to 200K tokens.
Kimi K2 Instruct 0905
Available: September 2025 K2 update with stronger agentic coding, better frontend generation, and a doubled 256K context window.
DeepSeek-V3.1
Available: Hybrid thinking/non-thinking release that upgraded tool calling, long-context training, and agent task performance.
Seed-OSS-36B-Instruct
Available: ByteDance Seed's Apache-licensed long-context reasoning and agent model, with controllable thinking budgets and a native 512K context.
gpt-oss-20b
Available: Smaller gpt-oss reasoning model optimized for local inference on systems with about 16GB of memory.
gpt-oss-120b
Available: OpenAI's larger open-weight reasoning model, a 117B-total / 5.1B-active MoE with 128K context for local and self-hosted deployment.
GLM-4.5
Available: Open agentic, reasoning, and coding foundation model that marked Z.ai's international rebrand and its MIT-licensed GLM push.
GLM-4.5-Air
Available: Compact GLM-4.5 companion with 106B total / 12B active parameters for efficient agentic reasoning and coding.
EXAONE 4.0 32B
Available: LG AI Research's unified model with non-reasoning and reasoning modes, agentic tool use, and English, Korean, and Spanish support.
Kimi K2 Instruct
Available: Original open K2 post-trained model: a 1T-parameter MoE optimized for coding, reasoning, and tool-using agentic workflows.
SmolLM3 3B
Available: Hugging Face's fully open 3B multilingual long-context model with optional reasoning mode and 128K context.
ERNIE-4.5-VL-424B-A47B
Available: Baidu's largest ERNIE 4.5 vision-language MoE, supporting text, image, and video inputs with thinking and non-thinking modes.
Kimi-VL-A3B-Thinking-2506
Available: Updated MIT-licensed Kimi-VL reasoning model with better multimodal reasoning, video understanding, high-resolution perception, and lower thinking-token use.
MiniMax-M1-80k
Available: Open Apache-licensed hybrid-attention reasoning model with 456B total / 45.9B active parameters and a native 1M-token context.
Magistral Medium
Available: The enterprise/API tier of Magistral, Mistral's first dedicated reasoning model family, which was released in Small open-weight and Medium enterprise/API tiers.
Magistral Small
Available: Open-weight 24B reasoning model from Mistral's Magistral family, popular for local reasoning experiments.
DeepSeek-R1-0528
Available: Major R1 reasoning update with stronger math, programming, general logic, function calling, and reduced hallucinations.
Claude Opus 4
Deprecated: First Claude 4 Opus model, positioned for long-running agentic and coding work before the 4.x point releases.
Seed Thinking v1.5
Available: ByteDance Seed reasoning model focused on long-horizon thinking and problem solving.
Sarvam-M
Available: Sarvam's medium-scale open model for multilingual Indian-language chat, reasoning, and translation tasks.
Phi-4 Reasoning
Available: Reasoning-specialized Phi-4 model family for math, science, and chain-of-thought style tasks.
Qwen3-235B-A22B
Available: Largest open Qwen3 MoE, introducing hybrid thinking/non-thinking modes and 119-language coverage.
OpenAI o3
Available: Reasoning model released alongside o4-mini with tool use, image reasoning, and stronger agentic problem solving.
Llama-3.3-Nemotron-Super-49B
Available: Open Llama Nemotron reasoning model from NVIDIA's 2025 Nemotron family.
DeepSeek-V3-0324
Available: Post-R1 V3 update with improved reasoning, front-end coding, Chinese writing, search, and function calling.
Gemini 2.5 Pro
Deprecated: Reasoning-focused Gemini 2.5 model that made thinking a core part of Google's flagship model line.
ERNIE X1
Available: Baidu's reasoning model released alongside ERNIE 4.5 before the open ERNIE 4.5 weights.
Granite 3.2 8B
Available: Granite 3.2 update with reasoning controls and multimodal/document-oriented Granite variants.
Claude 3.7 Sonnet
Retired: Anthropic's first hybrid-reasoning Sonnet. Shut down May 11, 2026 as the 4.x line matured.
DeepHermes 3 Llama 3 8B
Available: Nous reasoning-oriented Hermes model trained to combine concise answers with optional deep reasoning traces.
Grok 3
Deprecated: xAI's third-generation model family, introduced with stronger reasoning, search, and coding modes.
Dolphin 3.0 Llama 3.1 8B
Available: Popular local assistant model tuned for coding, math, function calling, and agentic workflows.
Qwen2.5-VL-72B
Available: Vision-language Qwen2.5 model for image, document, video, and agentic visual grounding tasks.
Doubao-1.5-pro
Available: Doubao 1.5 Pro update positioned for stronger multimodal, reasoning, and agentic work in Volcano Engine.
DeepSeek-R1
Available: Breakout open reasoning model trained with large-scale reinforcement learning and released with weights under MIT.
Kimi k1.5
Available: Moonshot's multimodal reinforcement-learning reasoning model, reported as matching OpenAI o1 on math, coding, and multimodal reasoning.
Step-2
Available: Second-generation StepFun foundation model line with larger-scale multimodal and reasoning ambitions.
Phi-4
Available: A 14B dense model that rivals far larger ones on math and reasoning, under a permissive MIT license.
Gemini 2.0 Flash
Deprecated: First Gemini 2.0 release, built for native multimodal input/output, tool use, and agentic product integrations.
EXAONE 3.5 32B
Available: Open-weight 32B model in the EXAONE 3.5 line for bilingual reasoning, coding, and long-context tasks.
OpenAI o1
Deprecated: General release of OpenAI's o1 reasoning model with stronger deliberative reasoning and multimodal ChatGPT integration.
QwQ-32B-Preview
Available: Qwen's first public reasoning-preview model, aimed at math, coding, and deliberate problem solving.
DeepSeek-R1-Lite-Preview
Retired: Reasoning-preview model exposed in DeepSeek Chat ahead of the open DeepSeek-R1 release.
Yi-Lightning
Available: 01.AI's MoE API model that reached the global top 10 on Chatbot Arena, strong in Chinese, math, and coding.
Qwen2.5-72B
Available: Broad Qwen2.5 foundation-model update spanning general, coding, math, and multimodal descendants.
OpenAI o1-preview
Retired: OpenAI's first public reasoning-model preview, optimized to spend more inference time on hard math, coding, and science tasks.
Grok-2
Retired: Second-generation Grok release, comprising Grok-2 and Grok-2 mini, for chat, coding, reasoning, and image-enabled product experiences.
Claude 3.5 Sonnet
Retired: Major Sonnet upgrade that became Anthropic's default high-intelligence workhorse for coding, writing, and visual reasoning.
Qwen2-72B
Available: Qwen2's largest dense model, introducing stronger multilingual support, coding/math gains, and long-context variants.
Yi-1.5-34B
Available: Yi 1.5 update with stronger instruction following, coding, math, and multilingual performance.
Grok-1.5
Retired: Grok update with stronger reasoning and a 128K context window.
DBRX Instruct
Available: Databricks' 132B-total / 36B-active open MoE model for code, math, RAG, and enterprise self-hosted workloads.
Phi-2
Available: 2.7B-parameter Phi model showing strong reasoning and language understanding at small scale.
ERNIE 4.0
Available: Baidu's fourth-generation ERNIE flagship, announced with stronger understanding, generation, reasoning, and memory.
PaLM 2
Retired: Google's improved multilingual, reasoning, and coding foundation model family introduced at I/O 2023.
GPT-4
Deprecated: The model that brought reliable multi-step reasoning to the mainstream; size never disclosed.