Magentic-UI provides an agentic interface designed to handle complex online tasks. Rather than relying on a single bot attempting to navigate pages alone, it employs a team of specialized agents coordinated by an Orchestrator. This approach offers a transparent, controllable window into the automation process. Four distinct agents—WebSurfer, Coder, FileSurfer, and UserProxy—work under your oversight, allowing you to pause, approve, or redirect their progress at any stage.
Magentic-UI is built on Microsoft's AutoGen framework and its Magentic-One multi-agent system. Counting the Orchestrator, five agents form the core of this collaborative system:
Orchestrator acts as the lead agent. Powered by a large language model (LLM), it builds task plans in coordination with the user, determines when human input is required, and delegates subtasks to the appropriate specialists.
WebSurfer manages a steerable browser. It clicks, types, scrolls, and navigates pages according to the Orchestrator’s instructions. It essentially functions as an LLM equipped with a mouse and keyboard within a live browser session.
Coder operates within a secure Docker container. It writes and executes Python scripts or shell commands, reporting the outcomes back to the Orchestrator. By design, no code runs outside the isolated container environment.
FileSurfer utilizes Docker and MarkItDown to manage documents. It locates files within the system’s designated directory, converts them into Markdown format, and extracts information to answer specific queries.
UserProxy serves as the "human-in-the-loop" communication channel. It gathers user feedback and relays approval requests from the other agents back to you.
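The division of labor above can be pictured with a small sketch. This is not Magentic-UI's actual code: the `Step` and `Orchestrator` classes and the four agent functions are hypothetical stand-ins, and the real agents are LLM-driven rather than simple functions. It only illustrates the idea of a lead agent forwarding each subtask to a named specialist.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical stand-ins for the specialists; each just reports what it was asked.
def web_surfer(task: str) -> str:
    return f"[WebSurfer] browsed for: {task}"


def coder(task: str) -> str:
    return f"[Coder] ran code for: {task}"


def file_surfer(task: str) -> str:
    return f"[FileSurfer] read files for: {task}"


def user_proxy(task: str) -> str:
    return f"[UserProxy] asked the user: {task}"


@dataclass
class Step:
    agent: str  # name of the specialist chosen for this subtask
    task: str   # plain-English description of the subtask


class Orchestrator:
    """Toy dispatcher: looks up the named agent and forwards the subtask."""

    def __init__(self, agents: Dict[str, Callable[[str], str]]) -> None:
        self.agents = agents

    def run(self, plan: List[Step]) -> List[str]:
        results = []
        for step in plan:
            if step.agent not in self.agents:
                raise ValueError(f"unknown agent: {step.agent}")
            results.append(self.agents[step.agent](step.task))
        return results


orchestrator = Orchestrator(
    {
        "web_surfer": web_surfer,
        "coder": coder,
        "file_surfer": file_surfer,
        "user_proxy": user_proxy,
    }
)
plan = [
    Step("web_surfer", "find the opening hours"),
    Step("coder", "convert the hours to 24-hour format"),
]
results = orchestrator.run(plan)
for line in results:
    print(line)
```

In the real system the Orchestrator decides which agent to call by reasoning with an LLM; the dictionary lookup here only stands in for that decision.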
Planning: When you submit a request—via text or image—the system generates a step-by-step plan in plain English. You can modify this plan by adding, removing, or rewriting steps, or by sending follow-up notes to refine the strategy before execution begins.
Execution: The Orchestrator interprets the plan and assigns tasks. It dispatches requests and waits for results. You can monitor progress after every completed step. If the system encounters an obstacle—such as a site being down—the Orchestrator can propose a new plan, which only proceeds after your approval.
Safety: You define the guardrails. You can toggle mandatory approvals for specific operations, such as clicking links or submitting forms. Critical actions remain paused until you provide manual confirmation.
Learning: The system can learn from successful runs by saving plans for reuse. Over time, this lets it handle similar tasks with less manual intervention.
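The approval behavior described under Execution and Safety can be sketched as a loop that pauses before guarded actions. This is a minimal illustration under stated assumptions, not Magentic-UI's implementation: the function `execute_plan`, the action names such as `submit_form`, and the `approve` callback are all hypothetical.

```python
from typing import Callable, List, Set, Tuple

PlanStep = Tuple[str, str]  # (action kind, plain-English description)


def execute_plan(
    steps: List[PlanStep],
    guarded_actions: Set[str],
    approve: Callable[[str], bool],
) -> List[str]:
    """Run steps in order; stop at a guarded step unless approve() returns True."""
    log = []
    for action, description in steps:
        if action in guarded_actions and not approve(description):
            log.append(f"stopped before: {description}")
            break
        log.append(f"done: {description}")
    return log


steps = [
    ("navigate", "open the restaurant site"),
    ("click", "add item to cart"),
    ("submit_form", "place the order"),
]

# Only form submissions require confirmation in this hypothetical policy.
denied_log = execute_plan(steps, {"submit_form"}, approve=lambda _: False)
approved_log = execute_plan(steps, {"submit_form"}, approve=lambda _: True)
print(denied_log)
print(approved_log)
```

In an interactive setting, `approve` could wrap a prompt to the user instead of a fixed lambda, so the guarded step waits until someone answers.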
Base Requirements: Docker must be installed and running. Windows users should run Magentic-UI inside WSL2. On macOS, Docker typically runs through Docker Desktop, while Linux runs it natively.
Python: Version 3.10 or newer is required.
API Keys: Set the OPENAI_API_KEY environment variable before launching. Alternatively, supply a config.yaml file to configure other model clients, such as Azure OpenAI.
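Before installing, it can help to check these prerequisites in one place. The script below is a hedged sketch, not part of Magentic-UI: the `preflight` function is a hypothetical helper that only tests for the three conditions listed above. Finding `docker` on the PATH does not prove the Docker daemon is running, so treat a clean result as a first pass only.

```python
import os
import shutil
import sys
from typing import List, Mapping, Optional, Tuple


def preflight(
    env: Mapping[str, str],
    python_version: Tuple[int, int],
    docker_path: Optional[str],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if python_version < (3, 10):
        problems.append("Python 3.10 or newer is required")
    if docker_path is None:
        # Only checks that the executable exists, not that the daemon is running.
        problems.append("docker executable not found on PATH")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set (or supply a config.yaml instead)")
    return problems


issues = preflight(os.environ, sys.version_info[:2], shutil.which("docker"))
if issues:
    for issue in issues:
        print(f"WARNING: {issue}")
else:
    print("All basic prerequisites look satisfied.")
```

Passing the environment, version, and Docker path as arguments keeps the check easy to test without changing the real machine state.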
1. PyPI Install (Recommended)
# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install Magentic-UI
pip install magentic-ui
# Start the service. The first run builds Docker images, which may take several minutes.
magentic ui --port 8081
Access the interface by navigating to http://localhost:8081.
2. Build from Source
# Clone the repository
git clone https://github.com/microsoft/magentic-ui.git
cd magentic-ui
# Install dependencies (requires uv and Node.js)
uv venv --python=3.12 .venv
uv sync --all-extras
source .venv/bin/activate
# Build the frontend (yarn must be available; install it via npm if it is missing)
cd frontend
npm install -g gatsby-cli
npm install --global yarn
yarn install
yarn build
cd ..
# Launch the application
magentic ui --port 8081
For development with hot reloading:
# Run in a separate terminal
cd frontend
npm run start
Everyday Tasks: You might ask the system to order food from a specific restaurant. The agents will locate the website, navigate to the ordering menu, add the correct items to the cart, and then pause for your final approval before processing the checkout.
Research: Use it to find operating hours for local services, aggregate the latest research papers from sources like Microsoft Research, or locate specific commits within a GitHub repository.
Cross-Platform Workflows: Combine web navigation with data processing. You can instruct the system to scrape data from a website, process that information using Python code, and save the final output—all within a single, supervised session.
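As an illustration of the "process and save" half of such a workflow, here is the kind of small script the Coder agent might write once the data has been collected. It is a sketch under stated assumptions: the `rows_to_csv` helper is hypothetical, and the sample rows are made-up placeholders rather than real scraped data.

```python
import csv
import io
from typing import Dict, List


def rows_to_csv(rows: List[Dict[str, str]], columns: List[str]) -> str:
    """Keep only the requested columns, trim whitespace, and return CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing fields become empty strings; extra fields are dropped.
        writer.writerow({col: row.get(col, "").strip() for col in columns})
    return buffer.getvalue()


# Placeholder rows standing in for whatever the browsing step collected.
scraped = [
    {"title": " Paper A ", "year": "2024", "url": "https://example.com/a"},
    {"title": "Paper B", "year": "2025"},
]
csv_text = rows_to_csv(scraped, ["title", "year"])
print(csv_text)
```

Writing to an in-memory buffer keeps the example free of side effects; a real run would write the same text to a file in the session's working directory.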