When should I switch to Opus 4.8 Fast mode vs the standard endpoint?

Fast mode runs about 2.5× faster and costs $10 input / $50 output per million tokens, one-third the price of the previous-generation Fast tier. Use it for interactive chat, customer-facing assistants, real-time code completion, or any workload where the user is staring at a spinner. Keep the standard endpoint at $5 / $25 for offline batch work, long-running agents in Claude Code, and any task where you'd rather pay less and wait a bit longer. Both use the same model weights, so output quality is the same; with Fast mode you're simply paying more for lower latency.
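To make the trade-off concrete, here is a minimal sketch that prices one request on each endpoint using the per-million-token rates quoted above. The workload size (2,000 input tokens, 500 output tokens) and the names `PRICES` and `request_cost` are illustrative assumptions, not part of any official SDK.

```python
# Per-million-token prices in USD, as quoted in the text above.
PRICES = {
    "standard": {"input": 5.00, "output": 25.00},
    "fast": {"input": 10.00, "output": 50.00},
}


def request_cost(mode: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request on the given endpoint."""
    price = PRICES[mode]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical chat reply: 2,000 input tokens and 500 output tokens.
standard = request_cost("standard", 2_000, 500)
fast = request_cost("fast", 2_000, 500)
print(f"standard: ${standard:.4f}  fast: ${fast:.4f}  ratio: {fast / standard:.1f}x")
# prints: standard: $0.0225  fast: $0.0450  ratio: 2.0x
```

Because both rates double, Fast mode costs twice as much per request regardless of the input/output mix; whether that is worth it depends on how much the roughly 2.5× speed-up matters to the person waiting.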

What does Anthropic mean when they call Opus 4.8 their 'most honest' model?

Two concrete things. First, in Anthropic's own evals, 4.8 is roughly four times less likely than Opus 4.7 to let a code bug pass review without flagging it — it says "this looks wrong, here's why" instead of confidently shipping the line. Second, the model is much less prone to overclaiming on questions it can't verify; instead of inventing a plausible answer, it tells you what it doesn't know. The alignment scores behind those behaviors are now comparable to the Claude Mythos Preview model, so the safety upgrade actually landed ahead of the raw capability one.

How do effort levels work in Opus 4.8?

4.8 exposes five settings: Low, Medium, High (the default), xHigh, and Max. Each one buys the model a different reasoning budget before it answers. Low is for snappy chat where you want a reply in under a second. Medium and High cover most general use. xHigh is for hard reasoning — architecture decisions, multi-step refactors, dense legal or financial analysis. Max pushes the model into a multi-minute thinking pass that reaches further than xHigh on 4.7 by a measurable margin on the toughest problems. You pay only for the tokens consumed, so spending more on Max for one critical question is cheap relative to running High on everything.
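One way to apply this in practice is to pick the effort level per task instead of per application. The sketch below is only a routing table: the task categories, `TASK_EFFORT`, and `pick_effort` are hypothetical names, and how the chosen level is actually passed to the model is not covered here.

```python
from enum import Enum


class Effort(Enum):
    """The five effort levels named in the text."""
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"      # the default, per the text
    XHIGH = "xHigh"
    MAX = "Max"


# Hypothetical task categories mapped to the levels the text recommends.
TASK_EFFORT = {
    "quick_chat": Effort.LOW,
    "general": Effort.HIGH,
    "architecture_decision": Effort.XHIGH,
    "multi_step_refactor": Effort.XHIGH,
    "critical_question": Effort.MAX,
}


def pick_effort(task_kind: str) -> Effort:
    """Return the effort level for a task, falling back to the default."""
    return TASK_EFFORT.get(task_kind, Effort.HIGH)


print(pick_effort("quick_chat").value)         # prints: Low
print(pick_effort("critical_question").value)  # prints: Max
print(pick_effort("something_else").value)     # prints: High
```

Routing this way keeps Max reserved for the few questions that justify a multi-minute pass, which is the cost argument made above.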

What are dynamic workflows in Claude Code, and do I need them?

Dynamic workflows let one Opus 4.8 agent spawn hundreds of subagents in parallel, each with its own context and its own task, then merge the results back. The orchestrator stays a single model; the subagents do the work concurrently. You need this any time the job is naturally parallel: refactoring across a large monorepo, running a security audit over many files, fanning out research across a hundred sources, or generating tests for every function in a package. For linear single-file work, the standard Claude Code flow is still the right tool — dynamic workflows pay off when serial waiting was your bottleneck.
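The fan-out-and-merge shape described above can be sketched with plain asyncio. This illustrates the general pattern only, not Claude Code's actual interface: `run_subagent` is a stub standing in for a real subagent call, and `orchestrate` and `max_parallel` are assumed names.

```python
import asyncio


async def run_subagent(task: str) -> str:
    """Stub for one subagent; a real one would call the model with its own context."""
    await asyncio.sleep(0)  # yield control so other subagents can run
    return f"done: {task}"


async def orchestrate(tasks: list[str], max_parallel: int = 100) -> list[str]:
    """Fan tasks out to concurrent subagents and merge results in input order."""
    limit = asyncio.Semaphore(max_parallel)  # cap the number running at once

    async def bounded(task: str) -> str:
        async with limit:
            return await run_subagent(task)

    # gather() returns results in the same order as the tasks were given.
    return await asyncio.gather(*(bounded(t) for t in tasks))


results = asyncio.run(orchestrate([f"file_{i}.py" for i in range(5)]))
print(results)
```

The orchestrator is the single `orchestrate` call; each `bounded` coroutine plays the role of a subagent with its own task. With serial work the total time is the sum of the tasks, while with fan-out it approaches the time of the slowest one.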

Claude Opus 4.8 - Access The World's Most Advanced AI

What is Claude Opus 4.8?

Claude Opus 4.8 is Anthropic's flagship model, released on May 28, 2026 — six weeks after Opus 4.7. It pairs a 1M-token context window with parallel subagent orchestration in Claude Code, five effort levels you control, and a Fast mode that runs roughly 2.5× faster at one-third the previous Fast-tier price.

How to Use Claude Opus 4.8 on Overchat AI

Open Overchat AI

Visit the Overchat AI web app, or install the mobile app, and select Opus 4.8 from the model selector.

Ask Opus 4.8 anything

Write your question and attach PDF, DOCX, or PPT documents, as well as images and videos, and let Opus help you work faster and smarter.

Keep chatting

Keep asking Opus 4.8 until you complete your task, whether it's coding, writing, research, legal work, or anything else.

Claude Opus 4.8: Anthropic's Flagship Model

Anthropic shipped Opus 4.8 on May 28, 2026, six weeks after Opus 4.7 — the fastest cadence between flagship Opus releases yet. The headline isn't a single benchmark; it's that 4.8 is, by Anthropic's own measure, four times less likely than 4.7 to let a flawed line of code slip through review. For teams letting Claude write production PRs, that's the number that matters most.

4.8 sits one tier below Claude Mythos, the larger model Anthropic has been previewing inside cybersecurity organizations and plans to roll out broadly in the coming weeks. What's notable is that 4.8's alignment scores already match Mythos Preview — meaning the safety upgrade landed before the raw capability one did.

What changed under the hood

Five effort levels, not one toggle. Opus 4.7 had a single extended-thinking switch. 4.8 exposes Low, Medium, High (default), xHigh, and Max — a real dial. Low buys you a snappy chat reply; Max pushes the model into a multi-minute reasoning pass that reaches further than xHigh on 4.7 by a measurable margin on the hardest problems. You spend tokens where you actually need them, not on every reply.

Dynamic workflows in Claude Code. A single Opus 4.8 agent can now spawn hundreds of subagents in parallel, each with its own context and its own task. For a monorepo refactor, a security audit across a hundred files, or a research fan-out, the wall-clock difference vs the serial 4.7 flow is often the difference between "useful" and "shippable today." The orchestrator stays one model; the subagents do the work.

Fast mode at one-third the previous price. 4.8 introduces a Fast mode that runs roughly 2.5× faster than the standard endpoint at $10 input / $50 output per million tokens — three times cheaper than the prior generation's Fast tier. Standard pricing stays at $5 / $25, unchanged from 4.7. For chat-style use where a user is waiting on the reply, Fast mode is now the default to reach for.

Where 4.8 pulls ahead. On SWE-bench Pro, 4.8 lands at 69.2% — up from 64.3% on 4.7, and 10+ points clear of GPT-5.5 (58.6%) and Gemini 3.1 Pro (54.2%). Computer use on OSWorld-Verified ticks up to 83.4%. The GDPval-AA Elo for economic knowledge work jumps from 1753 to 1890, a bigger generational delta than 4.7 had over 4.6. On Humanity's Last Exam with tools it scores 57.9%, beating GPT-5.5 (52.2%) and Gemini 3.1 Pro (51.4%) by a comfortable margin.

Where it doesn't. The clearest loss is Terminal-Bench 2.1, where 4.8 scores 74.6% versus GPT-5.5's 78.2%. If your workflow is mostly raw terminal coding with no planning and few tools, GPT-5.5 still has the edge there. On most other agentic benchmarks 4.8 is back in front, but it's worth stating the gap plainly rather than glossing over it.

What to actually use it for. For agentic engineering, Claude Code dynamic workflows, computer-use automations, financial analysis (Finance Agent v2: 53.9%), and any task where you'd rather the model say "I'm not sure" than confidently bluff, 4.8 is currently the strongest choice on the market. For high-volume chat where latency matters more than the last few percentage points, Fast mode at one-third the prior cost makes it viable for production workloads it wasn't quite right for before. On Overchat AI, you can start chatting with Opus 4.8 immediately after creating an account — no API key required.

How Opus 4.8 stacks up against the field

Against Opus 4.7. The numbers favour 4.8 across nearly every workload Anthropic publishes. SWE-bench Pro moves from 64.3% to 69.2%, OSWorld-Verified from 82.8% to 83.4%, Humanity's Last Exam (no tools) from 46.9% to 49.8%, Finance Agent v2 from 51.5% to 53.9%, and GDPval-AA from 1753 to 1890. The real qualitative shift, though, is the honesty work: where 4.7 would sometimes write code that looked right and let a subtle flaw through, 4.8 is roughly four times less likely to let such a flaw pass review without flagging it. The five-level effort dial and Fast mode at one-third the prior Fast-tier price are also new in 4.8. If you're already on 4.7, the upgrade is more or less free (same standard pricing) and pays back fastest on agent-driven code review.

Against GPT-5.5 and Gemini 3.1 Pro. On the agentic benchmarks that decide most Claude Code and computer-use workloads, 4.8 has a real lead: SWE-bench Pro is 10.6 points clear of GPT-5.5 and 15 ahead of Gemini 3.1 Pro; OSWorld-Verified sits 4.7 points above GPT-5.5; Humanity's Last Exam with tools leads by 5.7 over GPT-5.5. GDPval-AA economic Elo for Opus 4.8 is 1890 against GPT-5.5's 1769 and Gemini 3.1 Pro's 1314. The honest exception is Terminal-Bench 2.1, where GPT-5.5 (78.2%) still beats 4.8 (74.6%) by 3.6 points — if your stack is mostly bash-driven CI work without planning, that gap is worth knowing. On pricing, 4.8 standard is $5 / $25 per million input/output tokens; GPT-5.5 remains cheaper at the entry tier, but 4.8's new Fast mode at $10 / $50 closes most of that gap for latency-sensitive workloads.
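The point margins quoted above follow directly from the published scores. As a quick check, this sketch recomputes them from the figures given in this article; `SCORES` and `margin` are illustrative names, and only benchmarks with scores stated for each model are included.

```python
# Benchmark scores (percent) as stated in this article.
SCORES = {
    "SWE-bench Pro": {"Opus 4.8": 69.2, "GPT-5.5": 58.6, "Gemini 3.1 Pro": 54.2},
    "Humanity's Last Exam (tools)": {"Opus 4.8": 57.9, "GPT-5.5": 52.2, "Gemini 3.1 Pro": 51.4},
    "Terminal-Bench 2.1": {"Opus 4.8": 74.6, "GPT-5.5": 78.2},
}


def margin(benchmark: str, rival: str, baseline: str = "Opus 4.8") -> float:
    """Return baseline minus rival in percentage points, rounded to one decimal."""
    row = SCORES[benchmark]
    return round(row[baseline] - row[rival], 1)


print(margin("SWE-bench Pro", "GPT-5.5"))                 # prints: 10.6
print(margin("SWE-bench Pro", "Gemini 3.1 Pro"))          # prints: 15.0
print(margin("Humanity's Last Exam (tools)", "GPT-5.5"))  # prints: 5.7
print(margin("Terminal-Bench 2.1", "GPT-5.5"))            # prints: -3.6
```

A negative margin means the rival leads, which is the Terminal-Bench 2.1 case.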

Where Opus 4.8 fits in the lineup. Opus is Anthropic's flagship tier, sitting above Claude Sonnet (the balanced general-purpose model) and Claude Haiku (fast, high-volume). Above Opus 4.8, the larger Claude Mythos model is currently in a controlled preview inside cybersecurity organizations and is expected to land broadly in the coming weeks. All three Claude tiers are available on Overchat AI, so you can match the model to the task without juggling subscriptions.

What's new in Opus 4.8

Catches bugs others miss

Parallel subagents in Claude Code

Five effort levels, one model