Microsoft AI Releases Fara-7B: An Efficient Agentic Model for Computer Use

Microsoft Research has unveiled Fara-7B, a 7-billion-parameter agentic small language model designed to interact with computers directly: it predicts mouse and keyboard actions from screenshots. Because the open-weight model is small enough to run on user devices, it can reduce latency and keep user data local, a shift away from cloud-dependent AI interactions.

Why This Matters

Current LLM-based agents often rely on complex server-side infrastructure, which increases latency and cost. Ideally, such models would run locally, preserving both user data and responsiveness. Creating high-quality training data for these agents is a further barrier to entry. Fara-7B addresses this through synthetic data generation, at a reported cost of approximately $1 per verified trajectory using premium models.

Key Insights

FaraGen Data Engine: Generates and filters 145,603 web interaction trajectories.
Multimodal Decoder-Only Model: Fara-7B is built on Qwen2.5-VL-7B, consuming screenshots and text context to output actions.
Cost Efficiency: Fara-7B costs approximately $0.025 per task on WebVoyager, compared with roughly $0.30 for comparable agents backed by GPT-5-class models.
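
To put these figures in perspective, here is a minimal back-of-the-envelope sketch using only the approximate numbers quoted in this article. The helper names are illustrative, and the dataset total assumes that the roughly $1-per-trajectory figure applies uniformly to all 145,603 trajectories, which the article does not state explicitly.

```python
# Approximate figures quoted in this article.
TRAJECTORIES = 145_603               # trajectories generated and filtered by FaraGen
COST_PER_TRAJECTORY_USD = 1.00       # approx. cost per verified trajectory
FARA_COST_PER_TASK_USD = 0.025       # Fara-7B on WebVoyager
GPT5_CLASS_COST_PER_TASK_USD = 0.30  # comparable agents on GPT-5-class models


def cost_ratio(baseline: float, candidate: float) -> float:
    """Return how many times cheaper the candidate is than the baseline."""
    return baseline / candidate


def total_cost(unit_cost: float, count: int) -> float:
    """Return the total cost of `count` items at `unit_cost` each."""
    return unit_cost * count


ratio = cost_ratio(GPT5_CLASS_COST_PER_TASK_USD, FARA_COST_PER_TASK_USD)
dataset_cost = total_cost(COST_PER_TRAJECTORY_USD, TRAJECTORIES)

print(f"Fara-7B is roughly {ratio:.0f}x cheaper per task")
print(f"Estimated data-generation cost: about ${dataset_cost:,.0f}")
```

Under these assumptions, the per-task gap is about an order of magnitude, and the full training set would cost on the order of $145,000 to generate.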

Working Example

# Example of a simplified tool call output from Fara-7B
tool_call = {
    "tool": "left_click",
    "arguments": {
        "x": 100,
        "y": 200
    }
}

print(f"Executing tool call: {tool_call}")
# This would translate to a click at pixel coordinates (100, 200) on the screen.
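
How such a tool call might be routed to an executor can be sketched as follows. This is an illustration only, not Fara-7B's actual runtime: the `execute` function, the `HANDLERS` table, the `type` tool, and the `performed` log are hypothetical names, and the handlers record actions in a list instead of driving a real mouse or keyboard.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical action log standing in for a real input-automation backend.
performed: List[Tuple[str, dict]] = []


def left_click(x: int, y: int) -> None:
    """Record a left click at pixel coordinates (x, y)."""
    performed.append(("left_click", {"x": x, "y": y}))


def type_text(text: str) -> None:
    """Record typing the given text (hypothetical second tool)."""
    performed.append(("type", {"text": text}))


# Map tool names to the functions that carry them out.
HANDLERS: Dict[str, Callable[..., None]] = {
    "left_click": left_click,
    "type": type_text,
}


def execute(tool_call: dict) -> None:
    """Look up the named tool and call it with the supplied arguments."""
    name = tool_call.get("tool")
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name!r}")
    handler(**tool_call.get("arguments", {}))


execute({"tool": "left_click", "arguments": {"x": 100, "y": 200}})
execute({"tool": "type", "arguments": {"text": "hello"}})
print(performed)
```

Rejecting unknown tool names up front keeps a malformed model output from silently doing nothing, which matters when the agent acts on a real desktop.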

Practical Applications

Automated Customer Support: A company could use Fara-7B to automate form filling and data entry tasks for customer service representatives.
Pitfall: Over-reliance on synthetic data may lead to brittle agents that struggle with unexpected website layouts or user behaviors.

References:

https://www.marktechpost.com/2025/11/24/microsoft-ai-releases-fara-7b-an-efficient-agentic-model-for-computer-use/
