Skip to main content

On This Page

Building ClauseGuard: A 5-Agent AI Pipeline for Legal Contract Risk Analysis

2 min read
Share

These articles are AI-generated summaries. Please check the original sources for full details.

ClauseGuard — Technical Walkthrough

Muhammad Bin Murtza engineered ClauseGuard to decompose complex legal documents into structured risk reports using a specialized multi-agent pipeline. The system runs Qwen 2.5 1.5B on AMD MI300X hardware, achieving deterministic results for high-stakes legal reasoning through focused model orchestration.

Why This Matters

Moving from a monolithic prompt to a modular 5-agent pipeline solves the inconsistency issues prevalent in smaller LLMs performing multi-step reasoning. By enforcing Pydantic models and a temperature of 0.0, the system transforms unstructured legalese into machine-readable data, proving that 1.5B parameter models can handle professional-grade analysis if the architecture provides sufficient task isolation and error handling.

Key Insights

  • A 5-agent pipeline consisting of an Extractor, Classifier, Risk Scorer, Translator, and Reporter prevents shallow analysis by focusing each model call on a narrow task.
  • Self-hosting Qwen 2.5 1.5B on AMD MI300X with vLLM provides a low-latency, OpenAI-compatible backend for private and efficient legal document processing.
  • Strict enum-based data models define 12 clause types—including NDA, Liability Cap, and Indemnification—to ensure consistent classification across varied contract formats.
  • Error isolation via asyncio.wait_for and a 120-second timeout prevents pipeline crashes, implementing fallback scoring to avoid misleading ‘no issues found’ results during API interruptions.
  • Prompt engineering using concrete decision trees and severity rubrics (e.g., CRITICAL for IP covering personal work) produces more consistent risk judgment than abstract instructions.

Practical Applications

  • Automated Negotiation: Utilizing the Translator agent to generate safer clause rewrites and ready-to-send emails for high-risk findings. Pitfall: Silent API failures leading to empty reports; mitigated by pre-flight connectivity checks and zero-clause detection.
  • Legal Document Triage: Handling PDF, DOCX, and TXT files with PyMuPDF and python-docx to extract text before multi-agent processing. Pitfall: Scanned PDFs without extractable text; addressed by using pdfplumber as a secondary fallback layer.

References:

Continue reading

Next article

CommitAI: Building a Local Offline Git Assistant with Gemma 4 and Ollama

Related Content