Large language models

21 articles in this category

AI News, AI Engineering, Large Language Models

DeepSeek-V3: Scaling 671B MoE Models with FP8 Precision and R1 Distillation

DeepSeek-V3 achieves GPT-4o-level performance with a 671B-parameter MoE architecture that activates only 37B parameters per token.

AI News, AI Infrastructure, Large Language Models

Moonshot AI Releases FlashKDA: 2.22x Faster Prefill for Kimi Delta Attention

Moonshot AI open-sources FlashKDA, a CUTLASS-based kernel delivering up to 2.22x prefill speedups for Kimi Delta Attention on NVIDIA H20 GPUs.

AI News, Voice AI, Large Language Models

xAI Launches grok-voice-think-fast-1.0: Setting a New Standard for Full-Duplex Voice AI

xAI's new grok-voice-think-fast-1.0 tops the τ-voice Bench with a 67.3% score, outperforming Gemini 3.1 and GPT Realtime 1.5 in complex, real-world voice tasks.

AI News, Large Language Models, ML & Data Engineering

OpenAI's Open Responses Specification Unifies Agentic LLM Workflows

OpenAI's Open Responses standardizes agentic AI workflows, reducing API fragmentation and enabling seamless transitions between proprietary and open-source models with a unified specification.

AI News, Large Language Models, ML & Data Engineering

Google DeepMind Introduces ATLAS Scaling Laws for Multilingual Language Models

Google DeepMind researchers introduce ATLAS, a set of scaling laws for multilingual language models, revealing that doubling the number of languages requires a 1.18× increase in model size and a 1.66× increase in total training data.

AI News, Large language models, Edge

Google Releases Gemma 3 270M Variant Optimized for Function Calling on Mobile and Edge Devices

Google’s FunctionGemma, a 270M-parameter model, achieves 85% accuracy in mobile action tasks after fine-tuning, enabling on-device AI agents.

AI News, Large language models, Machine Learning

MIT's Recursive Language Models Improve Performance on Long-Context Tasks

MIT researchers achieved 100x longer context handling with Recursive Language Models (RLMs), utilizing a programming environment for iterative processing.

AI News, Large language models, Interpretability

Google Releases Gemma Scope 2 to Deepen Understanding of LLM Behavior

Google’s Gemma Scope 2 suite of tools enhances LLM interpretability, addressing crucial safety concerns like jailbreaks and hallucinations.

AI News, Large language models, Agents

Intel DeepMath Improves LLM Math Reasoning with Python Executors

Intel’s DeepMath agent, built on Qwen3-Thinking, reduces LLM output length by up to 66% and improves accuracy on math problems by using Python code execution.

AI News, Data Privacy, Large language models

Microsoft Research Enforces LLM Privacy with PrivacyChecker and CI-CoT+CI-RL

Microsoft's new PrivacyChecker reduces LLM information leakage by 75-80% on benchmarks, while CI-CoT+CI-RL balances privacy and utility.

AI News, Gemini Models, Large Language Models

Gemini 3 Flash: Frontier Intelligence Built for Speed

Gemini 3 Flash delivers frontier intelligence with speed at a fraction of the cost, processing over 1T tokens per day.

AI News, Machine Learning, Large Language Models

NVIDIA Introduces Orchestrator-8B: Reinforcement Learning Controller for Tool and Model Orchestration

Orchestrator-8B achieves 30% lower cost and 2.5x faster execution than GPT-5 on benchmark tasks.

AI News, Large Language Models, AI Benchmarks

AI Model Showdown: Grok 4 vs ChatGPT (GPT-5.1) vs Gemini 3 Pro vs Claude Opus 4.5 in 2025

In 2025, the AI landscape features a crowded field of leading models, with Gemini 3 Pro achieving a 37.5% score on Humanity’s Last Exam.

AI News, Large language models, Benchmark

Olmo 3 Release Provides Full Transparency Into Model Development and Training

Allen Institute's Olmo 3-Think (32B) matches Qwen 3 and Gemma 3 in reasoning benchmarks, offering full model lifecycle transparency.

AI News, Large language models, Benchmark

CodeClash Benchmarks LLMs through Multi-Round Coding Competitions

CodeClash benchmarks LLMs in 1680 multi-round coding tournaments, revealing no single model dominates across all challenges.

AI News, AI, ML & Data Engineering

Apple Releases Pico-Banana-400K Dataset for Text-Guided Image Editing

Apple introduces Pico-Banana-400K, a dataset of 400,000 images for advancing text-guided image editing models, generated using Google's Nano-Banana and filtered with Gemini-2.5-Pro.

AI News, Large language models, ML & Data Engineering

NVIDIA Unveils OmniVinci: A Research-Focused Multimodal LLM

NVIDIA Research has released OmniVinci, a research-only large language model designed for cross-modal understanding of text, vision, audio, and robotics data. It demonstrates strong performance with a smaller training dataset than competitors, but its non-commercial license has sparked debate within the AI community.

AI News, Large language models

Anthropic Launches 'Skills' for Enhanced Claude Customization

Anthropic introduces 'Skills,' a new feature enabling developers to extend Claude's capabilities with modular, reusable task components for specialized applications.

AI News, ML & Data Engineering, Large language models

DeepSeek AI Introduces DeepSeek-OCR: A Novel Approach to Context Compression for LLMs

DeepSeek AI has released DeepSeek-OCR, an open-source system that uses optical 2D mapping to compress long text efficiently, which could change how large language models handle extensive inputs.

AI News, Large language models, ML & Data Engineering

Google Launches LLM-Evalkit for Data-Driven Prompt Engineering

Google introduces LLM-Evalkit, an open-source framework built on Vertex AI SDKs that standardizes and measures prompt engineering for large language models, promoting a data-driven workflow and collaboration.

AI News, Open Source, Large Language Models

ALTK: Open-Source Toolkit Boosts Agent Reliability and Robustness

IBM Research introduces ALTK, an open-source toolkit to enhance the reliability and robustness of AI agents powered by large language models. ALTK provides modular components addressing various lifecycle stages, integrating with tools like ContextForge MCP Gateway and Langflow.
