Large language models

21 articles in this category

AI News, AI Engineering, Large Language Models

DeepSeek-V3: Scaling 671B MoE Models with FP8 Precision and R1 Distillation

DeepSeek-V3 achieves GPT-4o-level performance with a 671B-parameter MoE architecture that activates only 37B parameters per token.

May 20, 2026

AI News, AI Infrastructure, Large Language Models

Moonshot AI Releases FlashKDA: 2.22x Faster Prefill for Kimi Delta Attention

Moonshot AI open-sources FlashKDA, a CUTLASS-based kernel delivering up to 2.22x prefill speedups for Kimi Delta Attention on NVIDIA H20 GPUs.

Apr 30, 2026

AI News, Voice AI, Large Language Models

xAI Launches grok-voice-think-fast-1.0: Setting a New Standard for Full-Duplex Voice AI

xAI's new grok-voice-think-fast-1.0 tops the τ-voice Bench with a 67.3% score, outperforming Gemini 3.1 and GPT Realtime 1.5 in complex, real-world voice tasks.

Apr 25, 2026

AI News, Large Language Models, ML & Data Engineering

OpenAI's Open Responses Specification Unifies Agentic LLM Workflows

OpenAI's Open Responses specification standardizes agentic AI workflows, reducing API fragmentation and making it easier to switch between proprietary and open-source models.

Feb 2, 2026

AI News, Large Language Models, ML & Data Engineering

Google DeepMind Introduces ATLAS Scaling Laws for Multilingual Language Models

Google DeepMind researchers introduce ATLAS, a set of scaling laws for multilingual language models, which finds that doubling the number of languages requires a 1.18× increase in model size and a 1.66× increase in total training data.

Jan 29, 2026

AI News, Large language models, Edge

Google Releases Gemma 3 270M Variant Optimized for Function Calling on Mobile and Edge Devices

Google’s FunctionGemma, a 270M-parameter model, achieves 85% accuracy on mobile action tasks after fine-tuning, enabling on-device AI agents.

Jan 26, 2026

AI News, Large language models, Machine Learning

MIT's Recursive Language Models Improve Performance on Long-Context Tasks

MIT researchers achieved 100x longer context handling with Recursive Language Models (RLMs), which use a programming environment to process input iteratively.

Jan 20, 2026

AI News, Large language models, Interpretability

Google Releases Gemma Scope 2 to Deepen Understanding of LLM Behavior

Google’s Gemma Scope 2 suite of tools improves LLM interpretability and targets safety concerns such as jailbreaks and hallucinations.

Jan 12, 2026

AI News, Large language models, Agents

Intel DeepMath Improves LLM Math Reasoning with Python Executors

Intel’s DeepMath agent, built on Qwen3-Thinking, reduces LLM output length by up to 66% and improves accuracy on math problems by using Python code execution.

Jan 5, 2026

AI News, Data Privacy, Large language models

Microsoft Research Enforces LLM Privacy with PrivacyChecker and CI-CoT+CI-RL

Microsoft's new PrivacyChecker reduces LLM information leakage by 75-80% on benchmarks, while CI-CoT+CI-RL balances privacy and utility.

Jan 2, 2026

AI News, Gemini Models, Large Language Models

Gemini 3 Flash: Frontier Intelligence Built for Speed

Gemini 3 Flash delivers frontier intelligence with speed at a fraction of the cost, processing over 1T tokens per day.

Dec 17, 2025

AI News, Machine Learning, Large Language Models

NVIDIA Introduces Orchestrator-8B: Reinforcement Learning Controller for Tool and Model Orchestration

Orchestrator-8B achieves 30% lower cost and 2.5x faster execution than GPT-5 on benchmark tasks.

Nov 28, 2025

AI News, Large Language Models, AI Benchmarks

AI Model Showdown: Grok 4 vs ChatGPT (GPT-5.1) vs Gemini 3 Pro vs Claude Opus 4.5 in 2025

In 2025, the AI landscape features a crowded field of leading models, with Gemini 3 Pro scoring 37.5% on Humanity’s Last Exam.

Nov 27, 2025

AI News, Large language models, Benchmark

Olmo 3 Release Provides Full Transparency Into Model Development and Training

The Allen Institute's Olmo 3-Think (32B) matches Qwen 3 and Gemma 3 on reasoning benchmarks while offering full transparency into the model lifecycle.

Nov 22, 2025

AI News, Large language models, Benchmark

CodeClash Benchmarks LLMs through Multi-Round Coding Competitions

CodeClash benchmarks LLMs in 1,680 multi-round coding tournaments, revealing that no single model dominates across all challenges.

Nov 10, 2025

AI News, AI, ML & Data Engineering

Apple Releases Pico-Banana-400K Dataset for Text-Guided Image Editing

Apple introduces Pico-Banana-400K, a dataset of 400,000 images for advancing text-guided image editing models, generated using Google's Nano-Banana and filtered with Gemini-2.5-Pro.

Nov 3, 2025

AI News, Large language models, ML & Data Engineering

NVIDIA Unveils OmniVinci: A Research-Focused Multimodal LLM

NVIDIA Research has released OmniVinci, a research-only large language model designed for cross-modal understanding of text, vision, audio, and robotics data. It demonstrates strong performance with a smaller training dataset than competitors use, but its non-commercial license has sparked debate within the AI community.

Oct 28, 2025

AI News, Large language models

Anthropic Launches 'Skills' for Enhanced Claude Customization

Anthropic introduces 'Skills,' a new feature enabling developers to extend Claude's capabilities with modular, reusable task components for specialized applications.

Oct 25, 2025

AI News, ML & Data Engineering, Large language models

DeepSeek AI Introduces DeepSeek-OCR: A Novel Approach to Context Compression for LLMs

DeepSeek AI has released DeepSeek-OCR, an open-source system leveraging optical 2D mapping for efficient compression of long text, potentially revolutionizing how large language models handle extensive inputs.

Oct 22, 2025

AI News, Large language models, ML & Data Engineering

Google Launches LLM-Evalkit for Data-Driven Prompt Engineering

Google introduces LLM-Evalkit, an open-source framework built on Vertex AI SDKs, to standardize and measure prompt engineering for large language models, promoting a data-driven, collaborative workflow.

Oct 20, 2025

AI News, Open Source, Large Language Models

ALTK: Open-Source Toolkit Boosts Agent Reliability and Robustness

IBM Research introduces ALTK, an open-source toolkit to enhance the reliability and robustness of AI agents powered by large language models. ALTK provides modular components that address different stages of the agent lifecycle and integrates with tools like ContextForge MCP Gateway and Langflow.

Feb 9, 2021