NVIDIA Nemotron-Cascade 2: High-Density 30B MoE with Gold Medal Reasoning

NVIDIA Releases Nemotron-Cascade 2: An Open 30B MoE with 3B Active Parameters, Delivering Better Reasoning and Strong Agentic Capabilities

NVIDIA has launched Nemotron-Cascade 2, an open-weight 30B Mixture-of-Experts model with 3B activated parameters. It is the second open-weight LLM to achieve Gold Medal-level performance in the 2025 International Mathematical Olympiad and ICPC World Finals.

Why This Matters

Frontier-scale intelligence often requires massive parameter counts, which drive up inference cost and latency. Nemotron-Cascade 2 shifts the focus to ‘intelligence density’: its results indicate that domain-specific reinforcement learning and on-policy distillation can deliver state-of-the-art reasoning in math and coding at a fraction of the scale of 100B+ parameter models.

Key Insights

Superior mathematical reasoning: Nemotron-Cascade 2 scored 92.4 on AIME 2025, outperforming Qwen3.5-35B-A3B’s score of 91.9.
Enhanced coding performance: The model achieved 439.28 on IOI 2025, significantly higher than Qwen3.5-35B-A3B’s 348.6.
Multi-Domain On-Policy Distillation (MOPD): This technique reached teacher-level performance on AIME 2025 in 30 steps, making it more sample-efficient than GRPO.
Extended context training: NVIDIA utilized a curated dataset with sequences packed up to 256K tokens during the SFT phase, including 1.9M Python reasoning traces.
Instruction-following excellence: The model scored 83.5 on ArenaHard v2, surpassing the larger Nemotron-3-Super-120B-A12B on this alignment benchmark.
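
The sequence packing mentioned in the extended-context item can be illustrated with a minimal sketch. The source states only that sequences were packed up to 256K tokens, so the greedy first-fit-decreasing strategy, the `pack_sequences` helper, and the reading of 256K as 262,144 tokens are assumptions made for illustration, not a description of NVIDIA's pipeline.

```python
from typing import List

MAX_TOKENS = 256 * 1024  # assumption: "256K" read as 262,144 tokens


def pack_sequences(lengths: List[int], max_tokens: int = MAX_TOKENS) -> List[List[int]]:
    """Group sample indices into packs whose total length fits max_tokens.

    Greedy first-fit-decreasing: an illustrative strategy, not the one
    described in the source.
    """
    packs: List[List[int]] = []  # sample indices per pack
    remaining: List[int] = []    # free token budget per pack

    # Place the longest samples first so short ones fill the leftover space.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for idx in order:
        n = lengths[idx]
        if n > max_tokens:
            raise ValueError(f"sample {idx} has {n} tokens, over the {max_tokens} limit")
        for p, free in enumerate(remaining):
            if n <= free:
                packs[p].append(idx)
                remaining[p] -= n
                break
        else:
            # No existing pack has room: open a new one.
            packs.append([idx])
            remaining.append(max_tokens - n)
    return packs


if __name__ == "__main__":
    sample_lengths = [200_000, 60_000, 150_000, 100_000, 12_000]
    for pack in pack_sequences(sample_lengths):
        print(pack, sum(sample_lengths[i] for i in pack))
```

Packing several short samples into one long training sequence reduces padding waste; in a real setup the attention mask would also need to keep packed samples from attending to each other, which this sketch does not cover.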

Practical Applications

Competitive Programming and Math: Enable Thinking Mode for complex, multi-step problems by starting the prompt with the model’s designated thinking token. Pitfall: Requesting a direct response for a multi-step proof may bypass the model’s specialized reasoning traces.
Agentic Tool Interaction: Implement structured tool calling within <tool_call> tags for verifiable software engineering workflows. Pitfall: If the system prompt does not list the available tools inside the expected tags, the model cannot format its requests correctly.
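
The tool-calling workflow above can be sketched as a small parser for model output. The source names only the `<tool_call>` tags; the JSON payload with `name` and `arguments` keys, the `extract_tool_calls` helper, and the `run_tests` tool are hypothetical and used here purely for illustration.

```python
import json
import re
from typing import Any, Dict, List

# Non-greedy match so several <tool_call> blocks in one reply stay separate.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Return the parsed payload of every well-formed <tool_call> block.

    Assumption: each block holds a JSON object with "name" and "arguments".
    Blocks that are not JSON objects with a "name" key are skipped.
    """
    calls: List[Dict[str, Any]] = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw.strip())
        except json.JSONDecodeError:
            continue  # malformed payload: skip rather than guess
        if isinstance(payload, dict) and "name" in payload:
            payload.setdefault("arguments", {})
            calls.append(payload)
    return calls


if __name__ == "__main__":
    reply = (
        "I will run the test suite first.\n"
        '<tool_call>{"name": "run_tests", "arguments": {"path": "tests/"}}</tool_call>'
    )
    print(extract_tool_calls(reply))
```

Skipping malformed blocks instead of repairing them keeps the workflow verifiable: only requests that parse cleanly are ever dispatched to a tool.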

References:

https://www.marktechpost.com/2026/03/20/nvidia-releases-nemotron-cascade-2-an-open-30b-moe-with-3b-active-parameters-delivering-better-reasoning-and-strong-agentic-capabilities/
