Zac Zuo

Nemotron 3 Super - Open hybrid Mamba-Transformer MoE for agentic reasoning

Nemotron 3 Super is NVIDIA's open 120B model with 12B active parameters, a 1M-token context window, and a hybrid Mamba-Transformer MoE design. It is built for coding, long-context reasoning, and multi-agent workloads without the usual "thinking tax."

Zac Zuo

Hi everyone!

Nemotron 3 Super really stands out because NVIDIA is framing it around two very real agent problems: the "thinking tax" of using a huge reasoning model for every step, and the "context explosion" that happens when long tool loops keep resending history and drift off goal.

This is an open 120B model with 12B active parameters, built to make those workloads more practical. It uses a hybrid Mamba-Transformer LatentMoE design, a native 1M-token context window, and multi-token prediction for faster long generations.

NVIDIA is also releasing more than just weights here. Datasets, recipes, and deployment cookbooks are part of the package too.

P.S. It is also free on @opencode Zen and @OpenRouter right now!
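Since the model is available on OpenRouter, a quick way to try it is through OpenRouter's OpenAI-compatible chat-completions endpoint. The sketch below only assembles the request body; the model slug is a hypothetical placeholder, so check OpenRouter's model list for the actual id before sending anything.

```python
import json

# OpenRouter exposes an OpenAI-compatible chat-completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

# Hypothetical slug -- verify the real id on OpenRouter's model list.
MODEL_ID = "nvidia/nemotron-3-super"

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble the JSON body for a single-turn chat completion."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_request("Summarize the tradeoffs of hybrid Mamba-Transformer designs.")
payload = json.dumps(body)  # ready to POST with an Authorization: Bearer header
```

To actually send it, POST `payload` to `OPENROUTER_URL` with your OpenRouter API key in the `Authorization` header, e.g. via `requests.post` or any HTTP client.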