MagenticLite: An agentic experience optimized for small models

At a glance

  • MagenticLite is an agentic application that works across both the browser and local file system in a single workflow. Built as the next generation of Magentic-UI, it combines a redesigned app with a harness optimized for small models.
  • MagenticBrain and Fara1.5 are small models designed for orchestration and computer-use tasks, respectively. Fara1.5 is the next iteration of Fara and delivers measurable gains on real-world browser tasks.
  • Together, these releases explore how far agentic performance can be pushed with smaller models, codesigned tools, and an optimized execution harness.

Today, Microsoft Research AI Frontiers releases MagenticLite, an experimental agentic application designed for small models. As the next generation of Magentic-UI, it works across the browser and local file system in a single workflow.

MagenticLite is powered by two purpose-built models: MagenticBrain, for reasoning, delegation, and terminal use, and Fara1.5, a computer-use model family for browser-based tasks. The app and both models were designed to work together as a single system. The result is an agent that runs efficiently, keeps data on the user’s machine, and supports a broad range of agentic tasks. It also points toward a broader goal: capable agents that can run directly on users’ hardware.

The project is built around a key research bet: that agentic capability depends on tool orchestration and action rather than knowledge alone. That insight makes it possible to use smaller models while still enabling a broad range of agentic tasks at a fraction of the cost.

MagenticLite also reflects how we approach agentic AI end-to-end—from training data and model design to orchestration, interaction design, and human oversight throughout the experience.

Figure 1. One experience, three components: MagenticLite, MagenticBrain, and Fara1.5.

Included in this release

MagenticLite

The next generation of Magentic-UI, our experimental agentic experience, is powered by an agent harness rebuilt for small models, with an updated user interface informed by community feedback. It works across users’ browsers and local file systems in a single workflow.

MagenticBrain

MagenticBrain is MagenticLite’s planner, coder, and delegator in one. It turns vague requests into concrete plans, selects the right tool or subagent for each step, writes code when needed, and recovers should something break mid-task. 

Fara1.5

The next generation of our computer-use model family, Fara1.5 comes in three sizes, with a flagship 9-billion-parameter model for most use cases. Fara1.5 sets new state-of-the-art (SOTA) results among small computer-use models and nearly doubles Fara-7B’s performance on web navigation, with sharper handling of forms, credentialed sites, and long-running tasks.

Each component is useful on its own, but they work best together. Codesigning the app, the models, and the harness enables capable and reliable agentic performance at this scale.

Our research approach: Doing more with less

We started with a simple question: what does it take to make a small model genuinely good at agentic tasks? The answer spanned the full lifecycle—data generation, training objectives, model design, and orchestration had to be redesigned together rather than in isolation.

We identified requirements from real-world use cases like filling out forms, conducting browser research, and managing files locally, and built an evaluation dataset around them. Standard benchmarks capture part of the picture, but they are not always a direct measure of real-world usefulness. Scenario-based evaluations complemented those benchmarks and became a key signal for iterative improvement across both the models and the harness, as shown in Figure 2.

Figure 2. An iterative process for building agentic systems: define success criteria, evaluate performance, refine the models or system design (or both), and repeat.
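
The loop in Figure 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the team's evaluation code: the `Scenario` class, the `evaluate` and `success_rate` helpers, and the two toy agents are hypothetical names invented here, and the three scenarios simply mirror the use cases named above (forms, browser research, local files).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    """One real-world task plus its success criterion (hypothetical structure)."""
    name: str
    task: str
    check: Callable[[str], bool]  # returns True if the agent's output counts as success


def evaluate(agent: Callable[[str], str], scenarios: list[Scenario]) -> dict[str, bool]:
    """Run every scenario once and record pass/fail per scenario."""
    return {s.name: s.check(agent(s.task)) for s in scenarios}


def success_rate(results: dict[str, bool]) -> float:
    """Fraction of scenarios that passed."""
    return sum(results.values()) / len(results) if results else 0.0


# Toy stand-ins for two iterations of a model or harness (assumptions, not real agents).
def agent_v1(task: str) -> str:
    return "done" if "form" in task else "unsure"


def agent_v2(task: str) -> str:
    return "done"


def is_done(output: str) -> bool:
    return output == "done"


scenarios = [
    Scenario("fill_form", "fill out the signup form", is_done),
    Scenario("browser_research", "research laptop prices", is_done),
    Scenario("local_files", "rename files in a folder", is_done),
]

before = evaluate(agent_v1, scenarios)   # evaluate the current system
failed = [name for name, ok in before.items() if not ok]  # signal for what to refine
after = evaluate(agent_v2, scenarios)    # re-evaluate after refining

print(f"before: {success_rate(before):.2f}, after: {success_rate(after):.2f}")
print("scenarios that drove the refinement:", failed)
```

The point of the sketch is the shape of the cycle: the per-scenario failures, rather than a single aggregate score, are what indicate whether the model or the harness needs to change.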

For the user experience, we retained key elements from Magentic-UI, including visibility into the agent’s reasoning and actions, the ability for users to take direct control, and explicit approval at critical points. Based on recent user studies, we also made MagenticLite easier to learn and collaborate with through updated browser and chat views, designed to make it easier for users to understand the agent’s actions and intervene when needed. This is illustrated in Figure 3.

Figure 3. MagenticLite’s interface includes updated browser and chat views designed to make it easier to understand agent actions and intervene when needed.
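
To make the idea of explicit approval at critical points concrete, here is a minimal sketch of an approval gate. It is an assumption about how such a gate could look, not MagenticLite's implementation: `run_step`, `CRITICAL_KINDS`, and the step labels are hypothetical names chosen for illustration.

```python
from typing import Callable

# Hypothetical labels for steps that should pause for the user.
CRITICAL_KINDS = {"transaction", "login", "irreversible_submission"}


def run_step(kind: str, execute: Callable[[], str], approve: Callable[[str], bool]) -> str:
    """Execute a step, asking the user first if the step is a critical point."""
    if kind in CRITICAL_KINDS and not approve(kind):
        return "skipped: user declined"
    return execute()


def decline(kind: str) -> bool:
    """Stand-in for a user who rejects every approval request."""
    return False


def accept(kind: str) -> bool:
    """Stand-in for a user who grants every approval request."""
    return True


# Routine steps run without interruption; critical steps wait for the user.
print(run_step("scroll", lambda: "scrolled", decline))
print(run_step("transaction", lambda: "order placed", decline))
print(run_step("transaction", lambda: "order placed", accept))
```

In a real interface the `approve` callback would surface a prompt in the chat view; here it is a plain function so the control flow is easy to follow.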

System components

Fara1.5: A computer-use model that outperforms its weight class

Fara1.5 is the next generation of our computer-use model family, which is available in three sizes, with a flagship 9B model recommended for most use cases. Fara1.5 achieves new SOTA performance among small computer-use models and nearly doubles Fara-7B’s performance on web navigation, with better handling of forms, credentialed sites, and long-running tasks.

Last November, we released Fara-7B, a small agentic model built for completing tasks in a web browser. It was trained using a novel synthetic data generation engine that enabled best-in-class performance. Fara1.5 is the next step in that bet: a family of three models (4B, 9B, 27B) based on Qwen 3.5, designed to close the gaps we saw in the prior release.

What’s new

State-of-the-art results. On the popular Online-Mind2Web benchmark, which contains 300 tasks across widely used web domains, Fara1.5 sets new SOTA results for models in its size class: it outperforms all similarly sized models and nearly doubles the performance of Fara-7B. The larger Fara1.5-27B variant scores above 90% on the same benchmark.

Figure 4. On the Online-Mind2Web benchmark, Fara1.5-9B achieves state-of-the-art performance among models in its size class and substantially outperforms prior models.

Improved user experience. In addition to improvements on benchmarks, we improved the user experience of Fara1.5. Users should observe stronger performance on everyday tasks like filling out forms, handling logins for credentialed sites, and booking appointments. These improvements are driven by the next evolution of our FaraGen data generation pipeline. Alongside training on live websites, we also trained the model on highly realistic synthetic environments designed to simulate scenarios like logins and irreversible actions.

A native action space tuned for long-running tasks. Beyond clicks and keyboard actions, Fara1.5 has built-in tools to store key information in its context across hundreds of steps and ask the user for permission or preferences when needed, helping it stay coherent on tasks that span many minutes of real work.
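
As a rough illustration of an action space that goes beyond clicks and keystrokes, the sketch below adds a memory action and an ask-the-user action next to the usual UI actions. All names here (`Click`, `TypeText`, `Remember`, `AskUser`, `AgentState`, `apply_action`) are hypothetical and are not Fara1.5's actual action schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Union


@dataclass
class Click:
    x: int
    y: int


@dataclass
class TypeText:
    text: str


@dataclass
class Remember:
    """Store a key fact so it survives across many later steps."""
    key: str
    value: str


@dataclass
class AskUser:
    """Pause and ask the user for permission or a preference."""
    question: str


Action = Union[Click, TypeText, Remember, AskUser]


@dataclass
class AgentState:
    notes: dict[str, str] = field(default_factory=dict)  # persistent task memory
    log: list[str] = field(default_factory=list)         # trace of executed actions


def apply_action(state: AgentState, action: Action, ask: Callable[[str], str]) -> None:
    """Apply one action to the state; UI actions are only logged in this sketch."""
    if isinstance(action, Remember):
        state.notes[action.key] = action.value
        state.log.append(f"remember {action.key}")
    elif isinstance(action, AskUser):
        state.notes[action.question] = ask(action.question)
        state.log.append("ask_user")
    elif isinstance(action, Click):
        state.log.append(f"click ({action.x}, {action.y})")
    elif isinstance(action, TypeText):
        state.log.append(f"type {action.text!r}")


state = AgentState()
steps: list[Action] = [
    Remember("confirmation_code", "ABC123"),
    Click(10, 20),
    AskUser("Aisle or window seat?"),
]
for step in steps:
    apply_action(state, step, ask=lambda question: "aisle")

print(state.notes)
print(state.log)
```

Keeping remembered facts in a small structured store, separate from the raw step history, is one way an agent could stay coherent over hundreds of steps without rereading everything it has done.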

Recalibrated critical points. Fara-7B was trained to detect critical points for activities like transactions, login flows, or irreversible submissions and flag them. In Fara1.5, we refined our design around critical points based on our learnings from real use, so safety triggers still occur when they should but do not block
