Zac Zuo

GLM-5 - Open-weights model for long-horizon agentic engineering

A 744B MoE model (40B active) built for complex systems & agentic tasks. #1 open-source on Vending Bench 2, narrowing the gap with Claude Opus 4.5. Features DeepSeek Sparse Attention and "slime" RL infra.


Replies

Zac Zuo

Hi everyone!

To put it simply: this is the model that was running as the stealth "Pony Alpha" on @OpenRouter.

GLM-5 is a monster. It scales to 744B params (40B active) and integrates @DeepSeek’s Sparse Attention (DSA) to keep inference costs down while preserving long-context performance.
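Since it's on OpenRouter, you can reach it through the standard OpenAI-compatible chat endpoint. A minimal stdlib-only sketch (the `z-ai/glm-5` model slug is my assumption here; check OpenRouter's model list for the actual id):

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# Assumed model slug -- verify against OpenRouter's published model list.
MODEL = "z-ai/glm-5"

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-compatible chat-completion payload for OpenRouter."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(payload: dict, api_key: str) -> dict:
    """POST the payload to OpenRouter and return the parsed JSON response."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    payload = build_request("Plan a multi-step refactor of a legacy module.")
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:
        print(send(payload, key)["choices"][0]["message"]["content"])
    else:
        # No key set: just show the request we would send.
        print(json.dumps(payload, indent=2))
```

Any OpenAI-style client SDK works the same way if you point its base URL at OpenRouter.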

But the real story is agentic capability.

On Vending Bench 2, which simulates running a vending-machine business over a full year, it ranks #1 among open-source models with a final balance of $4,432, comparable to Claude Opus 4.5 (in the $5k range).

They built a new async RL infra called "slime" to fix post-training inefficiency, and it shows.

Also, Z.ai has evolved. You can now toggle Agent mode instead of just Chat, letting it actually execute tasks. Give it a spin!

Curious Kitty
If a team already gets strong results from closed-model coding agents, what are the two or three concrete scenarios where GLM‑5 wins enough to justify switching?
Zac Zuo

@curiouskitty I'd say these:

  1. If your agent loop runs for hours, you need Opus-level planning but likely can't justify the API bill. GLM-5 hits that specific "smart enough + cost-effective" sweet spot.

  2. Since it's open weights, you can deploy it on your own infra (or your preferred provider) for sensitive codebases that can't leave your VPC.

Mykyta Semenov 🇺🇦🇳🇱

Interesting statistics, thank you)