TypeAgent is Microsoft's open-source sandbox for developing personal agents that communicate with human-like fluidity while maintaining the reliability of a structured database. By pairing large language models with conventional software logic, it establishes architectural patterns that ensure agents remain safe, efficient, and precise.
Actions utilize defined translation patterns instead of relying solely on repetitive model prompting. By applying these patterns, the system eliminates unnecessary round trips to the LLM.
Memory generates ontologies directly from raw text. This allows the agent to understand the context of past conversations without requiring secondary queries to piece information together.
Planning implements a “tree of thoughts” where human oversight, executable code, and model outputs converge. While the search tree expands to explore possibilities, the system prunes inefficient branches to keep the agent on track.
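The first point above, applying translation patterns instead of prompting the model each time, can be illustrated with a small sketch. This is not TypeAgent's actual cache. `PatternCache`, `CachedPattern`, `TranslatedAction`, and the regex-based matching are hypothetical names and simplifications, and the model call is a stand-in function. Requests that match a known pattern are resolved locally, and only unmatched requests fall through to the model.

```typescript
// Hypothetical sketch of a pattern cache; names and matching strategy are assumptions.
interface TranslatedAction {
  actionName: string;
  parameters: Record<string, string>;
}

interface CachedPattern {
  matcher: RegExp;            // capture groups hold the parameter values
  actionName: string;
  parameterNames: string[];   // names for capture groups 1..n, in order
}

class PatternCache {
  private patterns: CachedPattern[] = [];
  private translateWithModel: (request: string) => TranslatedAction;
  public modelCalls = 0;

  // translateWithModel stands in for a real LLM translation call.
  constructor(translateWithModel: (request: string) => TranslatedAction) {
    this.translateWithModel = translateWithModel;
  }

  addPattern(pattern: CachedPattern): void {
    this.patterns.push(pattern);
  }

  translate(request: string): TranslatedAction {
    for (const pattern of this.patterns) {
      const match = pattern.matcher.exec(request);
      if (match !== null) {
        // Cache hit: build the action locally, with no model round trip.
        const parameters: Record<string, string> = {};
        pattern.parameterNames.forEach((name, index) => {
          parameters[name] = match[index + 1] ?? "";
        });
        return { actionName: pattern.actionName, parameters };
      }
    }
    // Cache miss: fall back to the (stubbed) model.
    this.modelCalls++;
    return this.translateWithModel(request);
  }
}

// Usage with a stubbed model and one made-up pattern.
const patternCache = new PatternCache((request) => ({
  actionName: "unknown",
  parameters: { text: request },
}));
patternCache.addPattern({
  matcher: /^play (.+) by (.+)$/i,
  actionName: "playTrack",
  parameterNames: ["title", "artist"],
});
const cacheHit = patternCache.translate("play Yesterday by The Beatles");
const cacheMiss = patternCache.translate("what's the weather?");
```

Here `cacheHit` is built entirely from the stored pattern, while `cacheMiss` is the only request that reaches the stubbed model.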
Actions are confined to discrete categories defined by the specific application. This ensures high-density descriptions with minimal irrelevant "fluff."
Memory organizes data into compact semantic structures designed to fit efficiently within the model's limited attention window.
Planning isolates each node in the search tree to a specific sub-problem, preventing logic from sprawling or losing focus.
Actions empower humans to disambiguate complex requests. Rather than guessing the user's intent, the agent pauses to ask for clarification.
Memory extraction is handled by lightweight models, making the process of turning raw data into logical structures both fast and cost-effective.
Planning orchestrates a pool of resources—including high-reasoning models, specialized scripts, and human participants—to extend a "best-first" search. This allows every component to contribute where it is most effective.
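The planning points above describe a search tree that expands the most promising node first and prunes weak branches. A minimal sketch of that idea follows. It is a generic best-first search with a bounded frontier, not TypeAgent's planner. `PlanNode`, `SearchOptions`, and `bestFirstSearch` are hypothetical names, and `expand` stands in for whichever contributor (a model, a script, or a person) proposes the next steps.

```typescript
// Hypothetical sketch of best-first search with pruning; all names are illustrative.
interface PlanNode<S> {
  state: S;
  score: number; // higher means more promising
  depth: number; // number of expansions from the root
}

interface SearchOptions<S> {
  expand: (state: S) => S[];       // proposes follow-up states (model, script, or human)
  score: (state: S) => number;     // estimates how promising a state is
  isGoal: (state: S) => boolean;
  maxFrontier: number;             // pruning limit: keep only this many candidates
  maxSteps: number;                // hard cap on expansions
}

function bestFirstSearch<S>(start: S, opts: SearchOptions<S>): PlanNode<S> | undefined {
  let frontier: PlanNode<S>[] = [{ state: start, score: opts.score(start), depth: 0 }];
  for (let step = 0; step < opts.maxSteps; step++) {
    // The frontier is kept sorted, so the first element is the best candidate.
    const best = frontier.shift();
    if (best === undefined) return undefined;
    if (opts.isGoal(best.state)) return best;
    for (const child of opts.expand(best.state)) {
      frontier.push({ state: child, score: opts.score(child), depth: best.depth + 1 });
    }
    // Re-rank, then prune the least promising branches.
    frontier.sort((a, b) => b.score - a.score);
    frontier = frontier.slice(0, opts.maxFrontier);
  }
  return undefined;
}

// Toy problem: reach 10 from 1 using "+1" or "*2" steps.
const toyTarget = 10;
const toyOptions: SearchOptions<number> = {
  expand: (n) => [n + 1, n * 2],
  score: (n) => -Math.abs(toyTarget - n),
  isGoal: (n) => n === toyTarget,
  maxFrontier: 4,
  maxSteps: 50,
};
const foundPlan = bestFirstSearch(1, toyOptions);
```

The `maxFrontier` cut is the pruning step. A real planner would likely score nodes with a model or a human judgment rather than a distance function.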
Structured Retrieval-Augmented Generation (Structured RAG)
Structured RAG serves as the system's memory layer. While traditional RAG often falters when asked specific questions like “Which books did we discuss?” or “Where did we leave off on the photo collage?”, Structured RAG retrieves these details with deterministic accuracy. The result is an agent that converses naturally but recalls facts with mechanical precision.
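To make the contrast with similarity-based retrieval concrete, here is a minimal sketch of answering those two questions from structured records rather than from text chunks. It assumes the extraction step has already produced typed records. `EntityRecord` and `StructuredMemory` are hypothetical names, the sample data is made up, and none of this reflects TypeAgent's actual storage format.

```typescript
// Hypothetical sketch: structured records queried by field, not by text similarity.
interface EntityRecord {
  kind: "book" | "project" | "person";
  name: string;
  status: string | undefined; // optional progress note extracted from the message
  messageId: number;          // which conversation turn the record came from
}

class StructuredMemory {
  private records: EntityRecord[] = [];

  add(record: EntityRecord): void {
    this.records.push(record);
  }

  // "Which books did we discuss?" becomes an exact filter on the entity kind.
  namesByKind(kind: EntityRecord["kind"]): string[] {
    const names = this.records.filter((r) => r.kind === kind).map((r) => r.name);
    return names.filter((name, index) => names.indexOf(name) === index); // de-duplicate
  }

  // "Where did we leave off?" becomes the most recent status recorded for an entity.
  latestStatus(name: string): string | undefined {
    let latest: string | undefined = undefined;
    for (const record of this.records) {
      if (record.name === name && record.status !== undefined) {
        latest = record.status;
      }
    }
    return latest;
  }
}

// Made-up conversation history for illustration.
const conversationMemory = new StructuredMemory();
conversationMemory.add({ kind: "book", name: "Dune", status: undefined, messageId: 1 });
conversationMemory.add({ kind: "book", name: "Neuromancer", status: undefined, messageId: 2 });
conversationMemory.add({ kind: "project", name: "photo collage", status: "layout chosen", messageId: 3 });
conversationMemory.add({ kind: "book", name: "Dune", status: undefined, messageId: 4 });
conversationMemory.add({ kind: "project", name: "photo collage", status: "captions pending", messageId: 5 });

const discussedBooks = conversationMemory.namesByKind("book");
const collageStatus = conversationMemory.latestStatus("photo collage");
```

Because both lookups are exact filters over fields, the same question returns the same answer every time, which is the property the paragraph above attributes to Structured RAG.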
Action-Memory-Plan Integration (AMP)
The AMP architecture integrates actions, memories, and plans into a single continuous loop. For instance, if you add “pickleball Friday 2–3 pm” to your calendar, that action is recorded as a memory. Later, the agent can use that specific memory as a parameter to suggest "scheduling an hour of recovery after the game." This framework has been integrated into a browser extension, allowing websites to register custom actions via JavaScript.
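The pickleball example can be traced through a small sketch of the loop: an action writes to the calendar, the same call records a memory, and a later plan reads that memory to fill in its parameters. `CalendarEvent`, `ActionMemory`, and `planRecovery` are hypothetical names for illustration, not part of TypeAgent's API.

```typescript
// Hypothetical sketch of the action -> memory -> plan loop.
interface CalendarEvent {
  title: string;
  day: string;
  startHour: number; // 24-hour clock
  endHour: number;
}

class ActionMemory {
  private events: CalendarEvent[] = [];

  // Action: adding an event also records it as a memory.
  addEvent(event: CalendarEvent): void {
    this.events.push(event);
  }

  // Memory: recall the first event whose title mentions the given text.
  find(titleFragment: string): CalendarEvent | undefined {
    const needle = titleFragment.toLowerCase();
    for (const event of this.events) {
      if (event.title.toLowerCase().indexOf(needle) !== -1) {
        return event;
      }
    }
    return undefined;
  }
}

// Plan: use the remembered event as a parameter for a follow-up suggestion.
function planRecovery(memory: ActionMemory, activity: string): CalendarEvent | undefined {
  const game = memory.find(activity);
  if (game === undefined) return undefined;
  return {
    title: "Recovery after " + game.title,
    day: game.day,
    startHour: game.endHour,
    endHour: game.endHour + 1, // one hour of recovery right after the game
  };
}

const calendarMemory = new ActionMemory();
calendarMemory.addEvent({ title: "Pickleball", day: "Friday", startHour: 14, endHour: 15 });
const recoverySuggestion = planRecovery(calendarMemory, "pickleball");
```

The suggestion's day and start time come from the remembered event rather than from a fresh user prompt, which is the point of the integration.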
TypeAgent is compatible with Windows, WSL2, and Linux (Ubuntu/Debian). macOS support is currently in development.
You will need Node.js 20+ and pnpm installed.
TypeScript Setup
git clone https://github.com/microsoft/TypeAgent
cd TypeAgent/ts
pnpm i
pnpm run build
TypeAgent Shell
The Electron-based GUI supports voice input and includes built-in multi-agent orchestration, Structured RAG, and the TypeAgent Cache.
Launch the shell: pnpm run shell
TypeAgent CLI
The console interface provides a robust set of debugging commands for developers.
Start an interactive session: pnpm run cli -- interactive
Service Configuration
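Configure your API keys by creating a .env file in the TypeAgent/ts directory, the same folder where you ran the build. Set the variables for whichever provider you use.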
# Azure OpenAI example
AZURE_OPENAI_API_KEY=your_key
AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com
AZURE_OPENAI_RESPONSE_FORMAT=1
# OpenAI example
OPENAI_API_KEY=your_key
OPENAI_ENDPOINT=https://api.openai.com/v1/chat/completions
OPENAI_MODEL=gpt-4o
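As a rough illustration of how an application might consume these variables, the sketch below picks a provider from a plain key-value map. `resolveProvider` and `ProviderConfig` are hypothetical, and the Azure-first precedence is an assumption made for this example rather than TypeAgent's documented behavior. In Node.js you would pass `process.env` as the argument.

```typescript
// Hypothetical sketch: choose a provider from environment-style variables.
type EnvVars = Record<string, string | undefined>;

interface ProviderConfig {
  provider: "azure" | "openai";
  apiKey: string;
  endpoint: string;
  model: string | undefined;
}

function resolveProvider(env: EnvVars): ProviderConfig {
  const azureKey = env["AZURE_OPENAI_API_KEY"];
  const azureEndpoint = env["AZURE_OPENAI_ENDPOINT"];
  // Assumption: prefer Azure OpenAI when both of its variables are set.
  if (azureKey && azureEndpoint) {
    return { provider: "azure", apiKey: azureKey, endpoint: azureEndpoint, model: undefined };
  }

  const openaiKey = env["OPENAI_API_KEY"];
  const openaiEndpoint = env["OPENAI_ENDPOINT"];
  if (openaiKey && openaiEndpoint) {
    return {
      provider: "openai",
      apiKey: openaiKey,
      endpoint: openaiEndpoint,
      model: env["OPENAI_MODEL"],
    };
  }

  // Fail early with a clear message instead of failing on the first request.
  throw new Error("No complete provider configuration found in the environment");
}

const exampleConfig = resolveProvider({
  OPENAI_API_KEY: "your_key",
  OPENAI_ENDPOINT: "https://api.openai.com/v1/chat/completions",
  OPENAI_MODEL: "gpt-4o",
});
```

Validating the configuration up front makes a missing or misspelled variable show up at startup rather than as an opaque request failure later.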
| Agent Type | Functionality | Tech Stack |
|---|---|---|
| Music Player | Interfaces with Spotify API | TypeScript + Node.js |
| Calendar / Email | Connects to Microsoft Graph | C# .NET |
| Browser Extension | Registers webpage actions | JavaScript |
| VS Code Plugin | Assists with code operations | TypeScript |
| Photo Collage | Performs image compositing | Python + OpenCV |