Zac Zuo

Claude Opus 4.5 - The best model for coding, agents, and computer use

Claude Opus 4.5 is intelligent, efficient, and the best model in the world for coding, agents, and computer use. It’s also meaningfully better at everyday tasks like deep research and working with slides and spreadsheets. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done.

Replies

Zac Zuo

Hi everyone!

It feels like new models are popping up constantly, and now we have Claude Opus 4.5.

It’s still focused on coding and "system-level" tasks, but handles vague instructions much better now.

In an agent benchmark, instead of refusing a request to change a basic economy flight, which is restricted, it figured out it could upgrade the cabin first and then modify the flight. It's a legitimate, creative path that required genuine insight.

For developers, there’s a new effort parameter to balance cost vs. power, and Claude Code is finally on desktop.

Paul Tseluyko

HUGE competition is coming 🔥 Gemini 3 vs Opus 4.5 🤝

Invince

@pasha_tseluyko I think Opus does better at coding

LucasZhou

@paultseluyko Better and better dev tools coming out is definitely a good thing for us developers.

Abdul Rehman

Claude has honestly been the most trustworthy AI assistant I’ve seen. Glad to see this mission front and center.

Yesith Thomas

Benchmarks looking sweet! Finally a price change too! 🥹

Musharof Chowdhury

Awesome, congrats!

Looks like we have to try out Opus 4.5 with Meku to see how much better it is than Sonnet 4.5 at frontend generation 😅

Rick Wise

My project just got roasted by hatable.art, which I found really fun!!

Here is the roast

"4/420 🔥 this digital warzone commits more heinous crimes than my browser history - giving landfill, spaghetti code energy, absolutely foul accessibility violations that make designers weep and Tim Berners-Lee contemplate deleting the entire internet. Straight up criminal UI that's buggier than a rainforest, more broken than my will to live, triggering fight or flight faster than my ex's read receipts. Uninstalled. Deleted. Dispatched. 💻🚮"

I submitted it to Claude Opus 4.5 and got this amazing evaluation, which I totally agree with.

The Reality Check 🤔

The hatable.art roast is intentionally over-the-top and humorous, but let me break down what might have a valid kernel of truth:

What the Roast Claims vs. Reality:

Roast Claim | Actual Status
"Spaghetti code energy" | Not really - it's reasonably organized React/Next.js with proper components
"Accessibility violations" | Potentially valid - needs audit
"Buggier than a rainforest" | Unlikely from the code I see - it's fairly straightforward
"Criminal UI" | Subjective, but the design is standard B2B SaaS

Actual Issues I Can Identify:

  1. Accessibility Concerns (valid criticism):

    • Some interactive elements may lack proper ARIA labels

    • Color contrast in gradients might not meet WCAG AA

    • The suppressHydrationWarning props on icons are a code smell

    • Mobile menu keyboard navigation unclear

  2. Design Density:

    • The page is LONG (~700 lines) - might feel overwhelming

    • Too many sections: Hero → Stats → Features → How It Works → Testimonials → AI Comparison → Pricing → FAQ → CTA → Footer

    • Competing visual elements (gradients, cards, badges everywhere)

  3. Performance:

    • Dynamic imports for client component could cause flash

    • Multiple gradient backgrounds and animations

What I'd Do To Improve:

  1. Simplify - Cut sections by 30-40%. Do you need both "AI Comparison" AND detailed pricing cards?

  2. A11y Audit - Run Lighthouse/axe and fix:

    • Add proper focus states

    • Ensure all interactive elements are keyboard accessible

    • Check color contrast ratios

  3. Visual Hierarchy - Reduce competing gradients and badges ("Most Popular", "Best Value", "🔥", "🚀", "💎" all on one screen)

  4. Faster Loading - Consider static rendering for the landing page instead of client-side hydration

  5. Less "Try-Hard" Marketing - Reduce emoji density and trust badges

Bottom Line: The 4/420 score is theatrical, but there's probably ~20% valid feedback buried in the humor. The landing page isn't broken, but it could be tighter and more accessible. Would you like me to run a real accessibility check?

Claude Opus 4.5 was spot on!
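
For anyone who wants to act on the "check color contrast ratios" point before running a full Lighthouse/axe audit, here is a minimal sketch of the WCAG 2.x contrast-ratio formula in Python. The function names and hex colors are made up for illustration and are not taken from the project above. Sampling only the gradient stops is a simplifying assumption: it gives a rough first pass, not a substitute for a real audit.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color, from 0.0 (black) to 1.0 (white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_wcag_aa(fg: str, bg: str, large_text: bool = False) -> bool:
    """AA requires at least 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


def worst_contrast_on_gradient(fg: str, gradient_stops: list[str]) -> float:
    """Lowest contrast of the text color against each gradient stop.

    Assumption: only the stops are sampled, not the interpolated colors
    between them, so treat the result as an approximation.
    """
    return min(contrast_ratio(fg, stop) for stop in gradient_stops)


if __name__ == "__main__":
    # Hypothetical values: white text on a purple-to-pink gradient.
    stops = ["#7c3aed", "#ec4899"]
    worst = worst_contrast_on_gradient("#ffffff", stops)
    print(f"worst contrast across stops: {worst:.2f}:1")
    print("passes AA for normal text:", worst >= 4.5)
```

If the worst stop falls below 4.5:1, the usual options are to darken that stop or to place the text on a solid overlay.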

Maryam Warraich

Opus 4.5 looks like a strong leap, especially in handling vague or complex instructions.

Val

Glad to see the progress here

Moisés Caicedo

Finally! I've been waiting for the Opus tier to catch up with the speed/intelligence balance. That jump in Software Engineering performance is massive. Does anyone know if the context window has been increased for this version? This could be the ultimate agent for my repo.

Lilou Lane

Huge congrats! The creative reasoning is getting scary-good — we’re really entering the era of models that problem-solve, not just answer.
