Building Hierarchical AI Agents with Qwen2.5 and Python Tool Execution

A Coding Implementation to Build a Hierarchical Planner AI Agent Using Open-Source LLMs with Tool Execution and Structured Multi-Agent Reasoning

Michal Sutter demonstrates a structured multi-agent architecture utilizing the Qwen2.5-1.5B-Instruct model for complex task decomposition. The system employs a specialized planner agent to break down goals into 3-8 discrete, executable steps.

Why This Matters

A single monolithic LLM call often struggles with multi-step reasoning, whereas a hierarchical architecture splits the work across specialized roles. Loading a 1.5B-parameter model with 4-bit quantization keeps local execution cheap while still producing the structured JSON output needed for autonomous tool use and iterative reasoning.
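Because later stages depend on the planner's JSON, it can help to check a plan before executing it. The sketch below is one way to do that, not the article's code: the `steps`, `tool`, and `instruction` field names and the `validate_plan` helper are assumptions made for illustration, while the 3-8 step range and the tool names come from the article.

```python
import json
from typing import Any, Dict, List

# Tool names mentioned in the article; the schema itself is an assumption.
ALLOWED_TOOLS = ("llm", "python")


def validate_plan(raw: str) -> List[Dict[str, Any]]:
    """Parse a planner reply and check it against the assumed schema."""
    plan = json.loads(raw)
    steps = plan.get("steps") if isinstance(plan, dict) else None
    if not isinstance(steps, list) or not 3 <= len(steps) <= 8:
        raise ValueError("plan must contain between 3 and 8 steps")
    for i, step in enumerate(steps, start=1):
        if not isinstance(step, dict):
            raise ValueError(f"step {i} is not an object")
        if step.get("tool") not in ALLOWED_TOOLS:
            raise ValueError(f"step {i} uses unknown tool {step.get('tool')!r}")
        if not str(step.get("instruction", "")).strip():
            raise ValueError(f"step {i} has no instruction")
    return steps
```

Rejecting a malformed plan early gives the caller a chance to re-prompt the planner instead of failing halfway through execution.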

Key Insights

Fact: As of 2026, the system loads Qwen2.5-1.5B-Instruct with 4-bit quantization so that it runs efficiently on standard GPU hardware.
Concept: Hierarchical planning decomposes a high-level goal into 3-8 discrete steps, each tagged with the tool that runs it, such as ‘llm’ or ‘python’.
Tool: The Python execution environment uses io.StringIO and contextlib.redirect_stdout to capture the printed output of dynamically generated agent code. This captures output only; it does not sandbox the code.
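The output-capture technique in the last item can be sketched as follows. This is a minimal illustration rather than the article's implementation: the `run_python` name and its return shape are assumptions, and `exec` here runs code with full interpreter privileges, so the sketch captures output but does not isolate the code.

```python
import contextlib
import io
import traceback
from typing import Tuple


def run_python(code: str) -> Tuple[bool, str]:
    """Execute code and return (succeeded, captured stdout or traceback)."""
    buffer = io.StringIO()
    namespace = {"__name__": "__agent__"}  # fresh globals for each run
    try:
        # Everything the code prints goes into the buffer, not the console.
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception:
        # Return partial output plus the traceback so the agent can react.
        return False, buffer.getvalue() + traceback.format_exc()
    return True, buffer.getvalue()


ok, out = run_python("print(sum(range(5)))")
print(ok, out.strip())  # True 10
```

Returning the traceback as text, instead of letting the exception propagate, lets a later step feed the error back to the model for a retry.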

Working Examples

Loading the Qwen2.5 model with 4-bit quantization for efficient agentic reasoning.

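from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
# 4-bit loading requires the bitsandbytes package. Recent transformers
# releases expect the flag inside a BitsAndBytesConfig rather than as a
# bare load_in_4bit=True keyword argument.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    torch_dtype="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)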

Robust JSON extraction logic to handle imperfect model outputs during the planning phase.

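import json
import re
from typing import Any, Optional

def extract_json_block(text: str) -> Optional[Any]:
    # Prefer a fenced ```json block if the model produced one.
    fenced = re.search(r"```json\s*(.*?)\s*```", text, flags=re.DOTALL | re.IGNORECASE)
    if fenced:
        cand = fenced.group(1).strip()
        try:
            return json.loads(cand)
        except json.JSONDecodeError:
            # Malformed fenced block: fall through to the fallback.
            pass
    # ... fallback to scanning for braces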
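The article's fallback is elided above. Purely as an illustration of what "scanning for braces" could look like, the sketch below tries to decode a JSON object at each `{` in the text using the standard library's `json.JSONDecoder.raw_decode`. The `scan_for_json` name and this strategy are assumptions, not the author's code.

```python
import json
from typing import Any, Optional


def scan_for_json(text: str) -> Optional[Any]:
    """Return the first JSON object that decodes from text, else None."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            # raw_decode parses one value starting at `start` and ignores
            # any trailing text after it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; try the next one
        return value
    return None
```

For example, `scan_for_json('Sure! {"steps": [1, 2]} Hope that helps.')` skips the surrounding chatter and returns the decoded object.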

Practical Applications

Logistics Coordination: A multi-agent system where a planner decomposes tasks for routing and inventory agents. Pitfall: If earlier results are not passed forward, each step runs in isolation and cannot build on what came before.
Automated Data Analysis: Using the Python tool for dynamic simulations and calculations. Pitfall: Running model-generated code without safety wrappers can crash the host environment.
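The first pitfall comes down to how results flow between steps. One way to avoid it is to hand every step a running summary of earlier results, as sketched below. The `run_plan` helper, the `Tool` signature, the step field names, and the stub tools are all hypothetical; in a real system the stubs would be replaced by the model call and the Python executor.

```python
from typing import Callable, Dict, List

Tool = Callable[[str, str], str]  # (instruction, context) -> result


def run_plan(steps: List[Dict[str, str]], tools: Dict[str, Tool]) -> List[str]:
    """Run steps in order, handing each one a summary of earlier results."""
    results: List[str] = []
    for step in steps:
        # Build the context from everything produced so far.
        context = "\n".join(
            f"Step {i} result: {r}" for i, r in enumerate(results, start=1)
        )
        tool = tools[step["tool"]]
        results.append(tool(step["instruction"], context))
    return results


# Stub tools stand in for the real model call and Python executor.
tools = {
    "llm": lambda instruction, context: f"answer({instruction})",
    "python": lambda instruction, context: (
        f"{len(context.splitlines())} prior results seen"
    ),
}
steps = [
    {"tool": "llm", "instruction": "estimate demand"},
    {"tool": "python", "instruction": "compute reorder point"},
]
print(run_plan(steps, tools))
# ['answer(estimate demand)', '1 prior results seen']
```

With a small model, the accumulated context may need to be truncated or summarized so that it stays within the prompt budget.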

References:

https://www.marktechpost.com/2026/02/27/a-coding-implementation-to-build-a-hierarchical-planner-ai-agent-using-open-source-llms-with-tool-execution-and-structured-multi-agent-reasoning/
