Claude Opus 4.6 is now on Overchat AI — Anthropic's Best Model Sets New Records
Last Updated: Feb 6, 2026

Anthropic released Claude Opus 4.6 on February 5, 2026, and it’s a major upgrade to its flagship model.

Among the updates are a 1 million token context window (in beta, up from 200,000), adaptive reasoning that kicks in only when a task needs it, and general coding improvements. Pricing is unchanged from Opus 4.5.

The model is already live on Overchat AI, and you can start chatting with Claude Opus 4.6 now.

Is this just an incremental update, or does it meaningfully improve real workflows? Let's find out.

Introduction

Opus 4.6 sets a new bar for coding models. It holds the top spot on Terminal-Bench 2.0 with 65.4% and scores 53.1% on Humanity's Last Exam (with tools).

Anthropic's coding models are widely regarded as among the best available, and this one is no exception. It raises the bar even without the 1M context window, which is currently in beta and available only to API users.

What is Claude Opus 4.6?

Claude Opus 4.6 is Anthropic's most advanced AI model, released on February 5, 2026 as an upgrade to Opus 4.5 (which launched in November 2025).

The headline feature is the one-million-token context window (in beta). This is a fivefold increase over the 200K limit of Opus 4.5 and puts the model in the same long-context class as Google's Gemini 3 Pro. Many users have noted that Claude models excel at design and front-end coding, but that the limited context window caused them to lose track of earlier material in long sessions. The 1M window should make that much less of a problem.

Claude Opus 4.6 Main Features

The larger context window isn't the only new feature. Most of the other improvements matter mainly for enterprise use cases, but some change how the model works at a basic level, and you will notice them when chatting with it. Let's break them down.

Adaptive Thinking: This replaces the old on/off toggle. Similar to ChatGPT, Claude now answers easy tasks right away and activates reasoning only for complex ones. In my testing this was hit-or-miss: it sometimes switched on reasoning for tasks I'd consider very simple.

Agent Teams: A feature aimed at enterprise and power users. Multiple Claude instances can work in parallel on different parts of a project. Currently in preview.

Context Compaction: A server-side summarization feature. When the context is about to fill up, Claude condenses the information it holds so that running tasks don't stop.

128K Max Output: The amount of text the model can output has been doubled.
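
To make the Context Compaction idea above more concrete, here is a toy client-side sketch. It is not Anthropic's implementation (the real feature runs server-side, and its details are not described here). The function name `compact_history`, the whitespace token counter, and the stand-in summarizer are all hypothetical, chosen only to illustrate the pattern of replacing older messages with a summary once a budget is exceeded.

```python
from typing import Callable, List


def compact_history(
    messages: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int],
    summarize: Callable[[List[str]], str],
    keep_recent: int = 2,
) -> List[str]:
    """Replace older messages with one summary when the history exceeds max_tokens."""
    total = sum(count_tokens(m) for m in messages)
    if total <= max_tokens or len(messages) <= keep_recent:
        return messages  # still within budget, or nothing old enough to compact
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(older)] + recent


# Hypothetical stand-ins: a whitespace token count and a crude "summary"
# that keeps only the first word of each older message.
def naive_count(text: str) -> int:
    return len(text.split())


def naive_summary(chunk: List[str]) -> str:
    return "SUMMARY: " + " / ".join(m.split()[0] for m in chunk)


history = ["alpha one two three", "beta four five six", "gamma seven", "delta eight"]
compacted = compact_history(history, 8, naive_count, naive_summary)
print(compacted)
```

In a real system the summarizer would be a model call and the token count would come from the provider's tokenizer; the sketch only shows where the trade-off sits, namely that recent turns stay verbatim while older ones are condensed.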

On Reddit, early reviews say Opus 4.6 is impressive at working with and drafting legal documents. Its writing has improved as well: it uses fewer predictable phrases and a broader vocabulary.

In short, it's a win in nearly every category.

Claude Opus 4.6 Benchmarks

Benchmark performance gives a rough sense of where Opus 4.6 sits, but the real-world improvements — better planning, self-correction, sustained focus — are what developers are reporting.

Coding:

| Benchmark | Opus 4.6 Score |
| --- | --- |
| Terminal-Bench 2.0 | 65.4% |
| SWE-Bench Verified | 80.8% |
| OSWorld (Computer Use) | 72.7% |
| τ2-Bench Retail | 91.9% |
| MCP Atlas | 59.5% |

Reasoning and knowledge:

| Benchmark | Opus 4.6 Score |
| --- | --- |
| HLE (with tools) | 53.1% |
| HLE (without tools) | 40.0% |
| GDPval-AA | 1606 Elo |
| BrowseComp | 84.0% |
| ARC AGI 2 | 68.8% |
| BigLaw Bench | 90.2% |
| Finance Agent | 60.7% |

Long context retention (higher is better):

| Benchmark | Opus 4.6 Score | Sonnet 4.5 Score |
| --- | --- | --- |
| MRCR v2 (1M, 8-needle) | 76% | 18.5% |
| MRCR v2 (256K, 8-needle) | 93% | 10.8% |

Claude Opus 4.6 vs Other AI Models

Let’s see how the new model compares against other top models — both from Anthropic and competitors.

Claude Opus 4.6 vs Opus 4.5

Opus 4.6 improves over Opus 4.5 on every benchmark listed here except SWE-Bench Verified, where the two are essentially tied (80.8% vs 80.9%).

| Benchmark | Opus 4.6 | Opus 4.5 | Improvement |
| --- | --- | --- | --- |
| Terminal-Bench 2.0 | 65.4% | 59.8% | +5.6pp |
| OSWorld | 72.7% | 66.3% | +6.4pp |
| ARC AGI 2 | 68.8% | 37.6% | +31.2pp |
| GDPval-AA | 1606 Elo | ~1416 Elo | +190 Elo |
| Context Window | 1M (beta) | 200K | 5x increase |
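
The Improvement column is given in percentage points (pp), which is the plain difference between two scores. A short sketch that reproduces those figures from the table and also shows the relative gain, which tells a somewhat different story for benchmarks with a low starting score. The helper names `pp_gain` and `relative_gain` are illustrative only.

```python
# (Opus 4.6 score, Opus 4.5 score) in percent, copied from the table above.
scores = {
    "Terminal-Bench 2.0": (65.4, 59.8),
    "OSWorld": (72.7, 66.3),
    "ARC AGI 2": (68.8, 37.6),
}


def pp_gain(new: float, old: float) -> float:
    """Absolute gain in percentage points."""
    return round(new - old, 1)


def relative_gain(new: float, old: float) -> float:
    """Gain relative to the old score, in percent."""
    return round((new - old) / old * 100, 1)


for name, (new, old) in scores.items():
    print(f"{name}: +{pp_gain(new, old)}pp, +{relative_gain(new, old)}% relative")
```

For ARC AGI 2, for instance, a gain of about 31 percentage points corresponds to a relative improvement of more than 80% over the Opus 4.5 score.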

Claude Opus 4.6 vs GPT-5.2

Compared to GPT-5.2, Opus 4.6 wins on most of the benchmarks below, with MCP Atlas the one exception. It's worth mentioning that OpenAI prices output tokens lower: $15/M vs $25/M.

| Benchmark | Opus 4.6 | GPT-5.2 | Winner |
| --- | --- | --- | --- |
| Terminal-Bench 2.0 | 65.4% | 64.7% | Opus 4.6 |
| GDPval-AA | 1606 Elo | ~1462 Elo | Opus 4.6 |
| HLE (with tools) | 53.1% | ~42% | Opus 4.6 |
| BrowseComp | 84.0% | Lower | Opus 4.6 |
| SWE-Bench Verified | 80.8% | 80.0% | Opus 4.6 |
| MCP Atlas | 59.5% | 60.6% | GPT-5.2 |

Claude Opus 4.6 vs Gemini 3 Pro

Gemini 3 Pro is the first model in this comparison that beats Opus 4.6 in meaningful ways, specifically in reasoning (GPQA Diamond) and context window size. It is, however, the weaker coding model.

| Benchmark | Opus 4.6 | Gemini 3 Pro | Winner |
| --- | --- | --- | --- |
| Terminal-Bench 2.0 | 65.4% | 56.2% | Opus 4.6 |
| OSWorld | 72.7% | Lower | Opus 4.6 |
| GPQA Diamond | ~85% | 91.9% | Gemini 3 Pro |
| Context Window | 1M (beta) | 2M | Gemini 3 Pro |

Claude Opus 4.6 Pricing

Anthropic kept pricing identical to Opus 4.5, which is welcome given the performance gains, and a bit surprising. Prices have recently tended to climb when more powerful models were introduced, but that's not the case here.

Here's what Opus 4.6 costs to use, starting with standard API pricing:

| Token Type | Price per 1M Tokens |
| --- | --- |
| Input (standard) | $5.00 |
| Input (cache read) | $0.50 |
| Output | $25.00 |

Long-context pricing is higher. These rates kick in above 200K tokens:

| Token Type | Price per 1M Tokens |
| --- | --- |
| Input | $10.00 |
| Output | $37.50 |
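
To see what the two pricing tiers mean for a single request, here is a rough cost estimator built from the rates listed above. One assumption is baked in and may not match actual billing: once the input exceeds 200K tokens, the long-context rates are applied to the whole request. The function name `estimate_cost` is illustrative, and cache-read pricing is left out for simplicity.

```python
# Rates in USD per 1M tokens, taken from the two pricing tables above.
STANDARD = {"input": 5.00, "output": 25.00}
LONG_CONTEXT = {"input": 10.00, "output": 37.50}
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumption: if the input exceeds the threshold, the long-context
    rates apply to every token in the request.
    """
    rates = LONG_CONTEXT if input_tokens > LONG_CONTEXT_THRESHOLD else STANDARD
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


short_request = estimate_cost(100_000, 10_000)  # standard tier
long_request = estimate_cost(500_000, 10_000)   # long-context tier
print(f"${short_request:.2f} vs ${long_request:.2f}")
```

Under that assumption, a 100K-token prompt with a 10K-token reply costs about $0.75, while a 500K-token prompt with the same reply costs about $5.38, so crossing the threshold costs more than proportionally.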

For a bit of context, Opus 4.6 is the most expensive model on this list. Here’s how it compares in terms of price against competitors:

| Model | Input / 1M | Output / 1M |
| --- | --- | --- |
| Claude Opus 4.6 | $5.00 | $25.00 |
| GPT-5.2 | ~$5.00 | $15.00 |
| Gemini 3 Pro | $2.00 | $12.00 |
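
Per-million prices are easier to compare when applied to a concrete workload. The sketch below uses the list prices from the table above and a made-up workload of 2M input tokens and 500K output tokens. The workload size is purely illustrative, the GPT-5.2 input price is the approximate figure from the table, and long-context surcharges and caching are ignored.

```python
# (input, output) list prices in USD per 1M tokens, from the table above.
PRICES = {
    "Claude Opus 4.6": (5.00, 25.00),
    "GPT-5.2": (5.00, 15.00),  # input price is approximate in the table
    "Gemini 3 Pro": (2.00, 12.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a workload at a model's standard list prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens, 500K output tokens.
costs = {model: workload_cost(model, 2_000_000, 500_000) for model in PRICES}
cheapest = min(costs, key=costs.get)

for model, cost in costs.items():
    print(f"{model}: ${cost:.2f}")
print("Cheapest:", cheapest)
```

For this workload the totals come to $22.50, $17.50, and $10.00 respectively. The gap widens as the share of output tokens grows, since that is where the prices differ most.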

If you want to chat with Claude Opus 4.6 without worrying about these API prices, you can head to Overchat AI and use the model as part of a single subscription, which also includes GPT-5.2, Kimi K2, the latest Gemini models, and more.

Bottom Line

Claude Opus 4.6 is Anthropic's strongest model yet. The 1M context window, automatic context compaction, and adaptive reasoning may not be game-changing in isolation, but they compound. The result is a model that feels better to work with, needs fewer attempts per task, and performs even more consistently than its already consistent predecessor.

If you want to test it for yourself, start chatting with Claude Opus 4.6 on Overchat AI today.