Claude Opus 4.8

Anthropic's flagship Claude model: now four times less likely to miss code bugs, with parallel subagents in Claude Code, five effort levels, and Fast mode at one-third the previous Fast-tier price.

What is Claude Opus 4.8?

Claude Opus 4.8 is Anthropic's flagship model, released on May 28, 2026 — six weeks after Opus 4.7. It pairs a 1M-token context window with parallel subagent orchestration in Claude Code, five effort levels you control, and a Fast mode that runs roughly 2.5× faster at one-third the previous Fast-tier price.

What's new in Opus 4.8

Anthropic's headline claim is that 4.8 is the most honest Claude yet — four times less likely than Opus 4.7 to let a code flaw slip past review. The numbers back it up where it matters most: SWE-bench Pro climbs to 69.2% (up from 64.3% on 4.7, and 10+ points ahead of GPT-5.5 at 58.6%), OSWorld-Verified computer-use sits at 83.4%, and the GDPval-AA economic-work Elo jumps to 1890. GPT-5.5 still wins on Terminal-Bench 2.1 (78.2% vs 74.6%), so on pure terminal coding it remains a real fight — just not the one that decides most agent workflows.

🦾

Catches bugs others miss

Four times less likely than Opus 4.7 to wave a flawed line of code through review — so you can ship agent-written PRs with fewer surprise reverts.

⛓️

Parallel subagents in Claude Code

Dynamic workflows let one Opus 4.8 agent spawn hundreds of subagents in parallel — refactor a monorepo or fan out research without serial waiting.

🤖

Five effort levels, one model

Pick Low for a quick reply or Max for a long planning pass. Same model, same context window — you decide how hard it thinks before it answers.

How to Use Claude Opus 4.8 on Overchat AI

1.

Open Overchat AI

Visit the Overchat AI web app, or install the mobile app, and select Opus 4.8 from the model selector.

2.

Ask Opus 4.8 anything

Write your question and attach PDF, DOCX, or PPT documents, as well as images and videos, and let Opus help you work faster and smarter.

3.

Keep chatting

Keep asking Opus 4.8 until you complete your task, whether it's coding, writing, research, legal work, or anything else.

Get Started
Use Claude Opus 4.8 on Overchat AI

Claude Opus 4.8: Anthropic's Flagship Model

Anthropic shipped Opus 4.8 on May 28, 2026, six weeks after Opus 4.7 — the fastest cadence between flagship Opus releases yet. The headline isn't a single benchmark; it's that 4.8 is, by Anthropic's own measure, four times less likely than 4.7 to let a flawed line of code slip through review. For teams letting Claude write production PRs, that's the number that matters most.

4.8 sits one tier below Claude Mythos, the larger model Anthropic has been previewing inside cybersecurity organizations and plans to roll out broadly in the coming weeks. What's notable is that 4.8's alignment scores already match Mythos Preview — meaning the safety upgrade landed before the raw capability one did.

What changed under the hood

Five effort levels, not one toggle. Opus 4.7 had a single extended-thinking switch. 4.8 exposes Low, Medium, High (default), xHigh, and Max — a real dial. Low buys you a snappy chat reply; Max pushes the model into a multi-minute reasoning pass that reaches further than xHigh on 4.7 by a measurable margin on the hardest problems. You spend tokens where you actually need them, not on every reply.

Dynamic workflows in Claude Code. A single Opus 4.8 agent can now spawn hundreds of subagents in parallel, each with its own context and its own task. For a monorepo refactor, a security audit across a hundred files, or a research fan-out, the wall-clock difference vs the serial 4.7 flow is often the difference between "useful" and "shippable today." The orchestrator stays one model; the subagents do the work.

Fast mode at one-third the previous price. 4.8's Fast mode runs roughly 2.5× faster than the standard endpoint at $10 input / $50 output per million tokens, one-third the price of the prior generation's Fast tier. Standard pricing stays at $5 / $25, unchanged from 4.7. For chat-style use where a user is waiting on the reply, Fast mode is now the default to reach for.
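To see what the two price points mean for a given workload, the per-million-token rates quoted above can be turned into a quick estimate. The sketch below is illustrative only: the rates come from this page, while the function name, the tier labels, and the example token counts are hypothetical.

```python
# Per-million-token prices in USD as quoted on this page: (input, output).
PRICES = {
    "standard": (5.00, 25.00),
    "fast": (10.00, 50.00),
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one workload on the given tier."""
    input_rate, output_rate = PRICES[tier]
    cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return round(cost, 4)


# Hypothetical workload: 2M input tokens and 400k output tokens.
standard = estimate_cost("standard", 2_000_000, 400_000)
fast = estimate_cost("fast", 2_000_000, 400_000)
print(f"standard: ${standard:.2f}, fast: ${fast:.2f}")
# prints "standard: $20.00, fast: $40.00"
```

At these rates the same tokens cost twice as much on Fast mode as on the standard endpoint, so an estimate like this is mostly useful for deciding which traffic is worth the lower latency.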

Where 4.8 pulls ahead. On SWE-bench Pro, 4.8 lands at 69.2% — up from 64.3% on 4.7, and 10+ points clear of GPT-5.5 (58.6%) and Gemini 3.1 Pro (54.2%). Computer use on OSWorld-Verified ticks up to 83.4%. The GDPval-AA Elo for economic knowledge work jumps from 1753 to 1890, a bigger generational delta than 4.7 had over 4.6. On Humanity's Last Exam with tools it scores 57.9%, beating GPT-5.5 (52.2%) and Gemini 3.1 Pro (51.4%) by a comfortable margin.

Where it doesn't. The clearest loss is Terminal-Bench 2.1, where 4.8 scores 74.6% versus GPT-5.5's 78.2%. If your workflow is mostly raw terminal coding with no planning and few tools, GPT-5.5 still has the edge there. On most other agentic benchmarks 4.8 is back in front, but the gap is worth stating plainly rather than glossing over.

What to actually use it for. For agentic engineering, Claude Code dynamic workflows, computer-use automations, financial analysis (Finance Agent v2: 53.9%), and any task where you'd rather the model say "I'm not sure" than confidently bluff, 4.8 is currently the strongest choice on the market. For high-volume chat where latency matters more than the last few percentage points, Fast mode at one-third the prior cost makes it viable for production workloads it wasn't quite right for before. On Overchat AI, you can start chatting with Opus 4.8 immediately after creating a free account — no API key required.

How Opus 4.8 stacks up against the field

Against Opus 4.7. The numbers favor 4.8 across nearly every workload Anthropic publishes. SWE-bench Pro moves from 64.3% to 69.2%, OSWorld-Verified from 82.8% to 83.4%, Humanity's Last Exam (no tools) from 46.9% to 49.8%, Finance Agent v2 from 51.5% to 53.9%, and GDPval-AA from 1753 to 1890. The real qualitative shift, though, is the honesty work: where 4.7 would sometimes write code that looked right and let a subtle flaw through, 4.8 is four times less likely to let that flaw pass review. Effort control and Fast mode are net-new; 4.7 didn't have them. If you're already on 4.7, the upgrade is more or less free (same standard pricing) and pays back fastest on agent-driven code review.

Against GPT-5.5 and Gemini 3.1 Pro. On the agentic benchmarks that decide most Claude Code and computer-use workloads, 4.8 has a real lead: SWE-bench Pro is 10.6 points clear of GPT-5.5 and 15 ahead of Gemini 3.1 Pro; OSWorld-Verified sits 4.7 points above GPT-5.5; Humanity's Last Exam with tools leads by 5.7 over GPT-5.5. GDPval-AA economic Elo for Opus 4.8 is 1890 against GPT-5.5's 1769 and Gemini 3.1 Pro's 1314. The honest exception is Terminal-Bench 2.1, where GPT-5.5 (78.2%) still beats 4.8 (74.6%) by 3.6 points — if your stack is mostly bash-driven CI work without planning, that gap is worth knowing. On pricing, 4.8 standard is $5 / $25 per million input/output tokens; GPT-5.5 remains cheaper at the entry tier, but 4.8's new Fast mode at $10 / $50 closes most of that gap for latency-sensitive workloads.
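The margins quoted above follow directly from the published scores, and they are easy to recompute. The snippet below is a small sketch that uses only the scores stated on this page; the `SCORES` table layout and the `margins` helper are hypothetical names chosen for illustration.

```python
# Scores in percent, as quoted on this page. Only models with a stated score are listed.
SCORES = {
    "SWE-bench Pro": {"Opus 4.8": 69.2, "GPT-5.5": 58.6, "Gemini 3.1 Pro": 54.2},
    "Humanity's Last Exam (with tools)": {
        "Opus 4.8": 57.9,
        "GPT-5.5": 52.2,
        "Gemini 3.1 Pro": 51.4,
    },
    "Terminal-Bench 2.1": {"Opus 4.8": 74.6, "GPT-5.5": 78.2},
}


def margins(benchmark: str, baseline: str = "Opus 4.8") -> dict:
    """Return the baseline's lead in points over each other model (negative = behind)."""
    scores = SCORES[benchmark]
    return {
        model: round(scores[baseline] - score, 1)
        for model, score in scores.items()
        if model != baseline
    }


for name in SCORES:
    print(name, margins(name))
```

A negative value, as on Terminal-Bench 2.1, means the other model is ahead by that many points.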

Where Opus 4.8 fits in the lineup. Opus is Anthropic's flagship tier, sitting above Claude Sonnet (the balanced general-purpose model) and Claude Haiku (fast, high-volume). Above Opus 4.8, the larger Claude Mythos model is currently in a controlled preview inside cybersecurity organizations and is expected to land broadly in the coming weeks. All three Claude tiers are available on Overchat AI, so you can match the model to the task without juggling subscriptions.

FAQ

When should I switch to Opus 4.8 Fast mode vs the standard endpoint?

Fast mode runs about 2.5× faster and costs $10 input / $50 output per million tokens, one-third the price of the previous-generation Fast tier. Use it for interactive chat, customer-facing assistants, real-time code completion, or any workload where the user is staring at a spinner. Keep the standard endpoint at $5 / $25 for offline batch work, long-running agents in Claude Code, and any task where you'd rather pay less and wait a bit longer. Both use the same model weights, so output quality is the same; you're just trading latency for cost.

What does Anthropic mean when they call Opus 4.8 their 'most honest' model?

Two concrete things. First, in Anthropic's own evals, 4.8 is roughly four times less likely than Opus 4.7 to let a code bug pass review without flagging it — it says "this looks wrong, here's why" instead of confidently shipping the line. Second, the model is much less prone to overclaiming on questions it can't verify; instead of inventing a plausible answer, it tells you what it doesn't know. The alignment scores behind those behaviors are now comparable to the Claude Mythos Preview model, so the safety upgrade actually landed ahead of the raw capability one.

How do effort levels work in Opus 4.8?

4.8 exposes five settings: Low, Medium, High (the default), xHigh, and Max. Each one buys the model a different reasoning budget before it answers. Low is for snappy chat where you want a reply in under a second. Medium and High cover most general use. xHigh is for hard reasoning — architecture decisions, multi-step refactors, dense legal or financial analysis. Max pushes the model into a multi-minute thinking pass that reaches further than xHigh on 4.7 by a measurable margin on the toughest problems. You pay only for the tokens consumed, so spending more on Max for one critical question is cheap relative to running High on everything.
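One way to apply this guidance in your own tooling is a small lookup from task type to effort level. The sketch below is purely local and calls no API: the five level names come from this page, but the task categories, the `pick_effort` helper, and the idea of routing by category are assumptions made for illustration. How a level is actually passed to the model is not described here.

```python
from enum import Enum


class Effort(Enum):
    """The five effort levels named on this page."""
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"    # described as the default
    XHIGH = "xHigh"
    MAX = "Max"


# Hypothetical task categories, mapped following the guidance in the text.
TASK_EFFORT = {
    "quick_chat": Effort.LOW,
    "general": Effort.HIGH,
    "architecture_decision": Effort.XHIGH,
    "multi_step_refactor": Effort.XHIGH,
    "critical_hard_problem": Effort.MAX,
}


def pick_effort(task_kind: str) -> Effort:
    """Return the effort level for a task kind, falling back to the default (High)."""
    return TASK_EFFORT.get(task_kind, Effort.HIGH)


print(pick_effort("quick_chat").value)        # prints "Low"
print(pick_effort("something_else").value)    # prints "High"
```

Falling back to High for unknown task kinds mirrors the default described above, so Max is only spent where it is explicitly requested.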

What are dynamic workflows in Claude Code, and do I need them?

Dynamic workflows let one Opus 4.8 agent spawn hundreds of subagents in parallel, each with its own context and its own task, then merge the results back. The orchestrator stays a single model; the subagents do the work concurrently. You need this any time the job is naturally parallel: refactoring across a large monorepo, running a security audit over many files, fanning out research across a hundred sources, or generating tests for every function in a package. For linear single-file work, the standard Claude Code flow is still the right tool — dynamic workflows pay off when serial waiting was your bottleneck.
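The fan-out-and-merge pattern described above can be sketched with Python's standard `asyncio` library. This is not Claude Code's implementation or API: `run_subagent` is a hypothetical stand-in that only simulates work, and `orchestrate`, `max_parallel`, and the file names are assumptions chosen to show the shape of the pattern.

```python
import asyncio


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for one subagent working on its own task."""
    await asyncio.sleep(0.01)  # simulate independent work
    return f"done: {task}"


async def orchestrate(tasks: list[str], max_parallel: int = 8) -> list[str]:
    """Fan tasks out concurrently, capped at max_parallel, and merge results in order."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def bounded(task: str) -> str:
        async with semaphore:
            return await run_subagent(task)

    # gather() returns results in the same order as the submitted tasks.
    return await asyncio.gather(*(bounded(t) for t in tasks))


# Hypothetical job: one task per file in a package.
files = [f"src/module_{i}.py" for i in range(20)]
results = asyncio.run(orchestrate(files))
print(len(results), results[0])
# prints "20 done: src/module_0.py"
```

The semaphore caps how many tasks run at once, which is the usual way to keep a wide fan-out from overwhelming rate limits or local resources. For a single-file job the list has one entry and the pattern adds nothing, which matches the advice above.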

About Overchat AI

Overchat AI brings you the power of the world's top AI models: ChatGPT, Claude, Gemini, Mistral, and more.

Explore More AI Models

GPT-5.4

OpenAI's most advanced model with exceptional reasoning, creativity, and multimodal capabilities.

Ask GPT-5.4 ↗

DeepSeek V3.2

Advanced reasoning model designed for complex problem solving, mathematical reasoning, and programming.

Ask DeepSeek ↗

Claude Opus 4.6

Anthropic's flagship model excelling at reasoning, knowledge, math, and coding tasks.

Ask Claude ↗

Gemini 3 Pro

Google's most capable model with advanced multimodal understanding and generation.

Ask Gemini ↗

Grok 4.2

xAI's powerful model with real-time knowledge and witty, direct responses.

Ask Grok ↗

Qwen 3.5

Alibaba's advanced model with strong multilingual capabilities and reasoning skills.

Ask Qwen ↗

Overchat AI For All Platforms

Available on Web, iOS, and Android. Access your AI assistant anywhere, anytime.
