AI Agents
112 articles in this category
xAI’s Grok 4.1 Achieves Top Ranking on LMArena with 1483 Elo, Signaling Advances in LLM Preference
xAI’s Grok 4.1 surpasses previous models and competitors, achieving a 64.78% preference rate in A/B testing and securing the top two positions on the LMArena Text Arena leaderboard.
How to Design an Advanced Multi-Agent Reasoning System with spaCy Featuring Planning, Reflection, Memory, and Knowledge Graphs
Build a multi-agent AI system with spaCy that extracts entities, constructs knowledge graphs, and learns from experience using reflection and memory modules.
How to Build a Fully Self-Verifying Data Operations AI Agent Using Local Hugging Face Models for Automated Planning, Execution, and Testing
Build a self-verifying DataOps AI agent that runs Microsoft’s Phi-2 model locally through Hugging Face to automate planning, execution, and testing.
Neural Memory Agents with Differentiable Memory, Meta-Learning, and Experience Replay for Continual Adaptation
A comprehensive guide to building neural memory agents that leverage differentiable memory, meta-learning, and experience replay to adapt to dynamic environments without catastrophic forgetting.
Building an Autonomous Wet-Lab Protocol Planner with Salesforce CodeGen for Agentic Experiment Design and Safety Optimization
A detailed tutorial on creating an AI-driven system for automating lab protocols, reagent validation, and safety checks using Salesforce CodeGen and Python.
Magentic Marketplace: Open-source platform to study agentic markets
Microsoft Research introduces Magentic Marketplace, an open-source simulation environment to explore agentic market dynamics, including agent interactions, consumer welfare, and systemic biases in AI-driven markets.
AI Agents: The Future of Unified Interfaces in Software Development
This article explores how AI agents could reshape software development by unifying disparate tools behind a single interface and reducing context switching, and why platform engineering teams are critical to enabling this shift.