TypeAgent is Microsoft's open-source sandbox for developing personal agents that communicate with human-like fluidity while maintaining the reliability of a structured database. By pairing large language models with conventional software logic, it establishes architectural patterns that ensure agents remain safe, efficient, and precise.
Actions use defined translation patterns instead of relying solely on repeated model prompting. A request that matches a known pattern can be translated directly, which removes an unnecessary round trip to the LLM.
Memory generates ontologies directly from raw text. This allows the agent to understand the context of past conversations without requiring secondary queries to piece information together.
Planning implements a “tree of thoughts” where human oversight, executable code, and model outputs converge. While the search tree expands to explore possibilities, the system prunes inefficient branches to keep the agent on track.
Actions are confined to discrete categories defined by the specific application. This keeps action descriptions dense, with little irrelevant text.
Memory organizes data into compact semantic structures designed to fit efficiently within the model's limited attention window.
Planning isolates each node in the search tree to a specific sub-problem, preventing logic from sprawling or losing focus.
Actions empower humans to disambiguate complex requests. Rather than guessing the user's intent, the agent pauses to ask for clarification.
Memory extraction is handled by lightweight models, making the process of turning raw data into logical structures both fast and cost-effective.
Planning orchestrates a pool of resources—including high-reasoning models, specialized scripts, and human participants—to extend a "best-first" search. This allows every component to contribute where it is most effective.
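The planning idea described above (expand a search tree, prune weak branches, always extend the most promising node) can be illustrated with a minimal best-first search. This is only a sketch under stated assumptions: `PlanNode`, `Contributor`, and `bestFirstSearch` are hypothetical names that do not come from the TypeAgent codebase, and the numeric `score` stands in for whatever ranking a real planner would use. Each contributor represents one resource in the pool (a model, a script, or a person) proposing child nodes.

```typescript
// Hypothetical types for illustration; not TypeAgent APIs.
interface PlanNode {
  subProblem: string; // the single sub-problem this node is scoped to
  score: number;      // higher means more promising
  depth: number;
}

// A contributor (model, script, or human) proposes child nodes for a node.
type Contributor = (node: PlanNode) => PlanNode[];

function bestFirstSearch(
  root: PlanNode,
  contributors: Contributor[],
  isGoal: (node: PlanNode) => boolean,
  pruneBelow: number,
  maxSteps = 100,
): PlanNode | undefined {
  const frontier: PlanNode[] = [root];
  for (let step = 0; step < maxSteps; step++) {
    // Re-sorting keeps the sketch short; a real planner would use a heap.
    frontier.sort((a, b) => b.score - a.score);
    const current = frontier.shift();
    if (!current) break;              // frontier exhausted
    if (isGoal(current)) return current;
    for (const contribute of contributors) {
      for (const child of contribute(current)) {
        // Prune branches that score below the threshold.
        if (child.score >= pruneBelow) frontier.push(child);
      }
    }
  }
  return undefined;
}
```

Passing several contributors lets each one add candidates to the same frontier, so the ranking step, not the source of a candidate, decides what is explored next.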
Structured Retrieval-Augmented Generation (Structured RAG)
Structured RAG serves as the system's memory layer. While traditional RAG often falters when asked specific questions like “Which books did we discuss?” or “Where did we leave off on the photo collage?”, Structured RAG retrieves these details with deterministic accuracy. The result is an agent that converses naturally but recalls facts with mechanical precision.
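To make the contrast with similarity-based retrieval concrete, here is a minimal sketch of answering "Which books did we discuss?" from structured facts. It assumes the extraction step has already turned conversation text into typed records; `EntityFact` and `StructuredMemory` are hypothetical names, not part of TypeAgent, and the facts are entered by hand rather than produced by a model.

```typescript
// Hypothetical record produced by an extraction step (assumption).
interface EntityFact {
  name: string;         // e.g. "Dune"
  type: string;         // e.g. "book", "project"
  messageIndex: number; // where in the conversation it was mentioned
}

class StructuredMemory {
  private facts: EntityFact[] = [];

  add(fact: EntityFact): void {
    this.facts.push(fact);
  }

  // Exact lookup by entity type: the same query always returns the same set.
  namesOfType(type: string): string[] {
    const names = this.facts
      .filter((f) => f.type === type)
      .map((f) => f.name);
    // Drop repeated mentions while preserving first-seen order.
    return names.filter((n, i) => names.indexOf(n) === i);
  }
}
```

Because the lookup is a filter over typed records rather than a nearest-neighbor search over text chunks, a listing question returns every matching entity instead of only the passages that happen to rank highest.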
Action-Memory-Plan Integration (AMP)
The AMP architecture integrates actions, memories, and plans into a single continuous loop. For instance, if you add “pickleball Friday 2–3 pm” to your calendar, that action is recorded as a memory. Later, the agent can use that specific memory as a parameter to suggest "scheduling an hour of recovery after the game." This framework has been integrated into a browser extension, allowing websites to register custom actions via JavaScript.
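The pickleball example can be sketched as a small loop in which an executed action is stored as a memory and a later plan reads that memory for its parameters. All names here (`CalendarEvent`, `AgentMemory`, `addEvent`, `planRecovery`) are hypothetical and chosen only for illustration; the one-hour recovery slot directly after the game is likewise an assumption, and no real calendar service is called.

```typescript
// Hypothetical shapes for illustration; not TypeAgent APIs.
interface CalendarEvent {
  title: string;
  day: string;
  startHour: number; // 24-hour clock
  endHour: number;
}

class AgentMemory {
  private events: CalendarEvent[] = [];

  record(event: CalendarEvent): void {
    this.events.push(event);
  }

  // Return the first remembered event whose title contains the keyword.
  recall(keyword: string): CalendarEvent | undefined {
    const k = keyword.toLowerCase();
    return this.events.filter((e) => e.title.toLowerCase().indexOf(k) !== -1)[0];
  }
}

// Action: adding an event also records it as a memory.
function addEvent(memory: AgentMemory, event: CalendarEvent): CalendarEvent {
  memory.record(event);
  return event;
}

// Plan: use the remembered event as the parameter for a follow-up suggestion.
function planRecovery(memory: AgentMemory, activity: string): CalendarEvent | undefined {
  const past = memory.recall(activity);
  if (!past) return undefined;
  return {
    title: "Recovery after " + past.title,
    day: past.day,
    startHour: past.endHour,
    endHour: past.endHour + 1,
  };
}
```

For the event "pickleball" on Friday from 14 to 15, `planRecovery(memory, "pickleball")` proposes a Friday slot from 15 to 16; for an activity that was never recorded it returns `undefined`.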
TypeAgent is compatible with Windows, WSL2, and Linux (Ubuntu/Debian). macOS support is currently in development.
You will need Node.js 20+ and pnpm installed.
TypeScript Setup
git clone https://github.com/microsoft/TypeAgent
cd TypeAgent/ts
pnpm i
pnpm run build
TypeAgent Shell
The Electron-based GUI supports voice input and includes built-in multi-agent orchestration, Structured RAG, and the TypeAgent Cache.
Launch the shell: pnpm run shell
TypeAgent CLI
The console interface provides a robust set of debugging commands for developers.
Start an interactive session: pnpm run cli -- interactive
Service Configuration
Configure your API keys by creating a .env file at the project root.
# Azure OpenAI example
AZURE_OPENAI_API_KEY=your_key
AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com
AZURE_OPENAI_RESPONSE_FORMAT=1
# OpenAI example
OPENAI_API_KEY=your_key
OPENAI_ENDPOINT=https://api.openai.com/v1/chat/completions
OPENAI_MODEL=gpt-4o
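Before launching, it can help to check that the variables for the chosen provider are all present. The sketch below is a hypothetical helper, not part of TypeAgent. It treats every variable listed in the examples above as required, which is an assumption, and it takes the environment as a plain object so it can be run without Node-specific typings.

```typescript
// Plain key/value view of the environment (pass process.env in Node).
type Env = { [key: string]: string | undefined };

// Variable names taken from the examples above; treating all as required is an assumption.
const REQUIRED: { azure: string[]; openai: string[] } = {
  azure: [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_RESPONSE_FORMAT",
  ],
  openai: ["OPENAI_API_KEY", "OPENAI_ENDPOINT", "OPENAI_MODEL"],
};

// Return the names of variables that are missing or empty for a provider.
function missingKeys(env: Env, provider: "azure" | "openai"): string[] {
  return REQUIRED[provider].filter((key) => !env[key]);
}
```

An empty result means every listed variable for that provider has a non-empty value; it does not verify that the key or endpoint is valid.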
| Agent Type | Functionality | Tech Stack |
|---|---|---|
| Music Player | Interfaces with Spotify API | TypeScript + Node.js |
| Calendar / Email | Connects to Microsoft Graph | C# .NET |
| Browser Extension | Registers webpage actions | JavaScript |
| VS Code Plugin | Assists with code operations | TypeScript |
| Photo Collage | Performs image compositing | Python + OpenCV |