Allen Institute for AI (AI2) Introduces Olmo 3: Open Source 7B/32B LLMs with 65K Context Window
These articles are AI-generated summaries. Please check the original sources for full details.
Allen Institute for AI (AI2) Introduces Olmo 3: An Open Source 7B and 32B LLM Family Built on the Dolma 3 and Dolci Stack
Allen Institute for AI (AI2) has released Olmo 3, a family of open-source large language models (LLMs) in 7B and 32B parameter sizes, each with a 65,536-token context window. The models are pre-trained on the Dolma 3 data suite and post-trained with the Dolci stack, which keeps the model development process transparent end to end.
Why This Matters
Training LLMs with extended context windows (e.g., 65K tokens) demands significant compute and careful data curation. An ideal model would balance scale, context length, and performance, but in practice data quality and hardware cost are the usual bottlenecks. Olmo 3 addresses this by pairing a staged training pipeline with openly released data and code, which makes results easier to reproduce and supports controlled experimentation.
Key Insights
- “65,536-token context in Olmo 3 (2025)”: Both the 7B and 32B variants support this context length, reached through staged training on Dolma 3.
- “Staged training for long-context models”: Training proceeds in stages over Dolma 3 Mix (5.9T tokens), Dolmino (100B tokens), and Longmino (50B or 100B tokens), covering progressive pre-training followed by context extension.
- “Dolci stack for post-training”: AI2 uses Dolci to fine-tune variants such as Olmo 3-Instruct and Olmo 3-Think and to align them with task-specific benchmarks.
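To put the stage sizes listed above in proportion, here is a minimal arithmetic sketch. The token counts are the ones quoted in this article; the helper name `stage_budget` is hypothetical, and because the article gives two Longmino sizes (50B and 100B) without saying which applies where, the sketch simply computes both cases.

```python
# Stage token budgets as quoted in the list above (integers, in tokens).
DOLMA3_MIX = 5_900_000_000_000   # 5.9T
DOLMINO = 100_000_000_000        # 100B
LONGMINO_OPTIONS = (50_000_000_000, 100_000_000_000)  # 50B or 100B

CONTEXT_WINDOW = 65_536  # tokens, i.e. 2**16


def stage_budget(longmino_tokens: int) -> dict[str, int]:
    """Return per-stage token counts plus their total (hypothetical helper)."""
    stages = {
        "Dolma 3 Mix": DOLMA3_MIX,
        "Dolmino": DOLMINO,
        "Longmino": longmino_tokens,
    }
    # The sum is evaluated before the "total" key is added.
    stages["total"] = sum(stages.values())
    return stages


if __name__ == "__main__":
    for longmino in LONGMINO_OPTIONS:
        budget = stage_budget(longmino)
        share = budget["Longmino"] / budget["total"]
        print(
            f"Longmino {longmino / 1e9:.0f}B: "
            f"total {budget['total'] / 1e12:.2f}T tokens, "
            f"long-context share {share:.2%}"
        )
```

In both cases the long-context stage accounts for under 2% of all training tokens, which illustrates why context extension is handled as a short final stage rather than throughout pre-training.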
Practical Applications
- Use Case: Research institutions using Olmo 3-Base 32B for long-context reasoning tasks, such as scientific document analysis.
- Pitfall: Underestimating the computational cost of training Longmino subsets, which require 256 H100 GPUs for the 32B variant.
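For the document-analysis use case above, a practical first question is whether an input fits in the 65,536-token window at all. The sketch below is illustrative only: it assumes the document has already been converted to a list of token IDs by whatever tokenizer ships with the model (not shown here), and the function name `split_into_windows` and the `reserve_for_output` parameter are hypothetical, not part of any Olmo 3 tooling.

```python
CONTEXT_WINDOW = 65_536  # tokens, as quoted in the article


def split_into_windows(
    token_ids: list[int],
    reserve_for_output: int = 1024,  # assumed headroom for generated tokens
    context_window: int = CONTEXT_WINDOW,
) -> list[list[int]]:
    """Split a tokenized document into chunks that each fit the context window.

    Each chunk holds at most (context_window - reserve_for_output) tokens,
    so the model still has room to generate a response.
    """
    budget = context_window - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output must be smaller than context_window")
    return [token_ids[i:i + budget] for i in range(0, len(token_ids), budget)]


if __name__ == "__main__":
    # Stand-in token IDs; a real pipeline would get these from the tokenizer.
    document = list(range(200_000))
    windows = split_into_windows(document)
    print(f"{len(document)} tokens -> {len(windows)} window(s)")
```

A document that fits within the budget comes back as a single window, so the same code path covers both short and long inputs. Non-overlapping chunks are the simplest choice; overlapping windows or retrieval would be alternatives when context must carry across chunk boundaries.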