Zac Zuo

GLM-5 - Open-weights model for long-horizon agentic engineering

A 744B MoE model (40B active) built for complex systems & agentic tasks. #1 open-source on Vending Bench 2, narrowing the gap with Claude Opus 4.5. Features DeepSeek Sparse Attention and "slime" RL infra.

Zac Zuo

Hi everyone!

To put it simply: this is the model that's been appearing as "Pony Alpha" on @OpenRouter.

GLM-5 is a monster. It scales to 744B total params, with 40B active per token, and integrates @DeepSeek’s Sparse Attention (DSA) to keep costs down while handling long contexts.
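(Rough sizing, my own back-of-envelope numbers rather than anything from the launch post: serving memory tracks the 744B total, per-token compute tracks the 40B active.)

```python
# Back-of-envelope sizing for a 744B-total / 40B-active MoE.
# Assumptions (mine): FP8 weights, ~2 FLOPs per active parameter per token.
TOTAL_PARAMS = 744e9    # every expert has to live in memory
ACTIVE_PARAMS = 40e9    # params actually used per generated token
BYTES_PER_PARAM = 1     # FP8

weight_mem_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9
flops_per_token = 2 * ACTIVE_PARAMS

print(f"weights: ~{weight_mem_gb:,.0f} GB  (serving cost tracks total params)")
print(f"compute: ~{flops_per_token / 1e9:,.0f} GFLOPs/token  (latency tracks active params)")
```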

But the real story is agentic capability.

On Vending Bench 2, which simulates running a business over a year, it ranks #1 among open-source models with a final balance of $4,432, comparable to Claude Opus 4.5 (in the ~$5k range).

They built a new async RL infra called "slime" to fix post-training inefficiency, and it shows.

Also, Z.ai itself has evolved: you can now toggle Agent mode (instead of just Chat) and let it actually execute tasks. Give it a spin!
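If you'd rather hit it over the API, here's a minimal sketch against OpenRouter's OpenAI-compatible endpoint. The "z-ai/glm-5" slug is my guess, so check the model page for the exact ID:

```python
# Minimal sketch: calling GLM-5 through OpenRouter's OpenAI-compatible API.
# Assumptions: OPENROUTER_API_KEY is set, and "z-ai/glm-5" is the right slug.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="z-ai/glm-5",  # assumed slug -- verify on the OpenRouter model page
    messages=[
        {"role": "user", "content": "Sketch a migration plan from Flask to FastAPI for a 50-endpoint service."}
    ],
)
print(resp.choices[0].message.content)
```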

Piroune Balachandran

@zaczuo How does Z.ai Agent mode sandbox tools and persist state across long runs? Clear permissions plus replayable traces would make GLM-5 easier to trust when it's doing real work.

Djbutterrock

744B MoE with 40B active is serious scale; impressive to see it close the gap with frontier models. Would love more transparency on real-world agent benchmarks beyond synthetic evals.

Curious Kitty

If a team already gets strong results from closed-model coding agents, what are the two or three concrete scenarios where GLM-5 wins enough to justify switching?

Zac Zuo

@curiouskitty I'd say these:

  1. If your agent loop runs for hours and you need Opus-level planning but likely can't justify the API bill, GLM-5 hits that specific "smart enough + cost-effective" sweet spot.

  2. Since it's open weights, you can deploy it on your own infra (or your preferred provider) for sensitive codebases that can't leave your VPC (quick sketch below).
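For point 2, the nice part of running it behind any OpenAI-compatible server (vLLM, SGLang, etc.) is that "switching" is mostly a base_url change. A minimal sketch, with the internal hostname and model name as placeholders for your own deployment:

```python
# Sketch: same client code as before, pointed at a GLM-5 server inside your VPC.
# "glm5.internal" and the model name below are placeholders, not real endpoints.
from openai import OpenAI

client = OpenAI(
    base_url="http://glm5.internal:8000/v1",  # your self-hosted endpoint
    api_key="unused",                         # local servers often ignore the key
)

resp = client.chat.completions.create(
    model="zai-org/GLM-5",  # whatever name your server registers the weights under
    messages=[{"role": "user", "content": "Audit this diff for secrets before review."}],
)
print(resp.choices[0].message.content)
```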