DeepSeek AI Releases DeepSeekMath-V2: The Open Weights Maths Model That Scored 118/120 on Putnam 2024

DeepSeek AI has released DeepSeekMath-V2, a 685B-parameter open-weights model that scored 118 out of 120 points on Putnam 2024. The model relies on self-verifying theorem proving: a verifier checks the reasoning in each generated proof rather than only the final answer.

Why This Matters

Traditional math models are trained with rewards for final answers only, so flawed reasoning that happens to reach the correct result goes unpenalized. DeepSeekMath-V2 prioritizes proof quality over answer accuracy, which matters in competitions like the Putnam, where rigorous logic is essential. Human-labeled proofs showed that 20% of high-scoring AI answers contained critical reasoning errors, highlighting the cost of relying on final-answer metrics.

Key Insights

“685B-parameter model, 2025”: DeepSeekMath-V2 is built on DeepSeek-V3.2-Exp-Base and is a mixture-of-experts model.
“Verifier-first training”: The verifier is trained with Group Relative Policy Optimization (GRPO) to evaluate proof rigor, not just final answers.
“Meta verification for hallucinations”: A secondary (meta) verifier checks that the verifier’s analyses do not fabricate issues, raising meta-quality scores from 0.85 to 0.96.
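
The group-relative step that gives GRPO its name can be sketched as follows. This is a minimal illustration, not DeepSeek's training code: the function name, the reward values, and the use of the population standard deviation are all assumptions made for the example. Each sampled output is scored, and its advantage is its reward measured against the other samples drawn for the same prompt.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar per sampled output for the same prompt.
    `eps` (an assumed value) avoids division by zero when all rewards match.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; implementations may differ
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four verifier analyses of the same proof.
rewards = [1.0, 0.5, 0.0, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Samples that beat their group average receive a positive advantage and are reinforced; samples below it are discouraged. When every sample in a group earns the same reward, all advantages are zero and that group contributes no learning signal.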

Practical Applications

Use Case: Math competition training using DeepSeekMath-V2 for proof generation and verification.
Pitfall: Over-reliance on automated verification without human oversight may miss nuanced logical flaws in complex proofs.
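
One way to keep a human in the loop is to route uncertain proofs to a reviewer. The sketch below assumes a hypothetical setup in which each proof has been scored several times by a verifier on a 0 to 1 scale; the function name, thresholds, and problem IDs are illustrative and do not come from DeepSeekMath-V2.

```python
def needs_human_review(scores, min_score=0.9, max_spread=0.25):
    """Flag a proof when any verifier run scores low or the runs disagree.

    `scores` is a list of verifier scores in [0, 1] for a single proof.
    Both thresholds are assumed values chosen for illustration.
    """
    lowest = min(scores)
    spread = max(scores) - lowest
    return lowest < min_score or spread > max_spread


# Hypothetical verifier scores keyed by problem ID.
verifier_scores = {
    "A1": [1.0, 1.0, 1.0],  # unanimous, high
    "B3": [1.0, 0.5, 1.0],  # one run found a gap
}
review_queue = [pid for pid, s in verifier_scores.items() if needs_human_review(s)]
print(review_queue)
```

Unanimous high scores can still hide a subtle flaw, so a filter like this reduces the review load but does not replace spot-checking proofs that pass it.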

References:

https://www.marktechpost.com/2025/11/28/deepseek-ai-releases-deepseekmath-v2-the-open-weights-maths-model-that-scored-118-120-on-putnam-2024/
