TypeAgent: Build AI Agents With Structured Memory and Human-in-the-Loop

Published May 20 in AI Agent Tools

TypeAgent is Microsoft's open-source sandbox for developing personal agents that converse naturally while keeping the reliability of a structured database. By pairing large language models with conventional software logic, it establishes architectural patterns intended to keep agents safe, efficient, and precise.

Anchoring Models in a Logical Framework

Actions utilize defined translation patterns instead of relying solely on repetitive model prompting. By applying these patterns, the system eliminates unnecessary round trips to the LLM.
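The idea of answering from stored translation patterns before falling back to the model can be sketched as follows. This is a minimal illustration, not TypeAgent's cache implementation: the names PatternCache, TranslationPattern, translate, and the "playTrack" action are all hypothetical, and a real system would derive patterns from earlier model translations rather than hand-written regular expressions.

```typescript
// All names here are hypothetical; none are taken from the TypeAgent codebase.
interface Action {
  name: string;
  parameters: Record<string, string>;
}

interface TranslationPattern {
  regex: RegExp;        // one capture group per parameter
  paramNames: string[]; // names for the capture groups, in order
  actionName: string;
}

class PatternCache {
  private patterns: TranslationPattern[] = [];

  add(pattern: TranslationPattern): void {
    this.patterns.push(pattern);
  }

  // Returns an action when a stored pattern matches; undefined means "ask the model".
  match(request: string): Action | undefined {
    for (const p of this.patterns) {
      const m = p.regex.exec(request.trim());
      if (!m) continue;
      const parameters: Record<string, string> = {};
      for (let i = 0; i < p.paramNames.length; i++) {
        parameters[p.paramNames[i]] = m[i + 1];
      }
      return { name: p.actionName, parameters };
    }
    return undefined;
  }
}

// Try the cache first; only call the (expensive) model on a miss.
function translate(
  request: string,
  cache: PatternCache,
  callModel: (request: string) => Action,
): { action: Action; usedModel: boolean } {
  const cached = cache.match(request);
  if (cached) return { action: cached, usedModel: false };
  return { action: callModel(request), usedModel: true };
}

const requestCache = new PatternCache();
requestCache.add({
  regex: /^play (.+) by (.+)$/i,
  paramNames: ["track", "artist"],
  actionName: "playTrack",
});
```

A request such as "Play Yesterday by The Beatles" matches the stored pattern and never reaches the model, while an unfamiliar phrasing falls through to callModel.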

Memory generates ontologies directly from raw text. This allows the agent to understand the context of past conversations without requiring secondary queries to piece information together.

Planning implements a “tree of thoughts” where human oversight, executable code, and model outputs converge. While the search tree expands to explore possibilities, the system prunes inefficient branches to keep the agent on track.

Managing Information Density Through Structure

Actions are confined to discrete categories defined by the specific application. This ensures high-density descriptions with minimal irrelevant "fluff."

Memory organizes data into compact semantic structures designed to fit efficiently within the model's limited attention window.

Planning isolates each node in the search tree to a specific sub-problem, preventing logic from sprawling or losing focus.

Enabling Seamless Collaboration

Actions empower humans to disambiguate complex requests. Rather than guessing the user's intent, the agent pauses to ask for clarification.
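One way to picture "pause and ask" is a resolver that returns either a concrete action target or a clarification question. The sketch below is an assumption-laden illustration: the Resolution type, the resolveContact function, and the contact names are invented for this example and are not part of TypeAgent.

```typescript
// Hypothetical types for illustration only.
type Resolution =
  | { kind: "action"; target: string }
  | { kind: "clarify"; question: string; options: string[] };

// Resolve a partial name against known contacts; ask instead of guessing when unsure.
function resolveContact(query: string, contacts: string[]): Resolution {
  const q = query.toLowerCase();
  const matches = contacts.filter((c) => c.toLowerCase().indexOf(q) !== -1);
  if (matches.length === 1) {
    return { kind: "action", target: matches[0] };
  }
  if (matches.length === 0) {
    return {
      kind: "clarify",
      question: `I could not find "${query}". Who did you mean?`,
      options: contacts,
    };
  }
  return {
    kind: "clarify",
    question: `Which "${query}" did you mean?`,
    options: matches,
  };
}

const contactList = ["Alex Kim", "Alex Rivera", "Jordan Lee"];
const ambiguous = resolveContact("alex", contactList); // two candidates, so ask
const unique = resolveContact("jordan", contactList);  // one candidate, so act
```

Returning the question as data, rather than printing it, leaves the calling shell or CLI free to decide how to present the choice to the user.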

Memory extraction is handled by lightweight models, making the process of turning raw data into logical structures both fast and cost-effective.

Planning orchestrates a pool of resources (high-reasoning models, specialized scripts, and human participants) to extend a "best-first" search. This allows every component to contribute where it is most effective.
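A generic best-first search with pruning might look like the sketch below. It is a toy under stated assumptions: PlanNode, Expander, the numeric scoring scale, and the pruneBelow threshold are hypothetical, and the expander function stands in for whatever model, script, or person proposes the next steps.

```typescript
// Hypothetical node shape; not taken from TypeAgent.
interface PlanNode {
  state: string; // the sub-problem this node represents
  score: number; // higher is more promising (assumed scale)
  depth: number;
}

// An expander stands in for a model, a script, or a human proposing next steps.
type Expander = (node: PlanNode) => PlanNode[];

function bestFirstSearch(
  root: PlanNode,
  expand: Expander,
  isGoal: (node: PlanNode) => boolean,
  pruneBelow: number,
  maxExpansions = 100,
): PlanNode | undefined {
  const frontier: PlanNode[] = [root];
  for (let i = 0; i < maxExpansions && frontier.length > 0; i++) {
    // Take the most promising node (sorting keeps the sketch short; a heap scales better).
    frontier.sort((a, b) => b.score - a.score);
    const best = frontier.shift()!;
    if (isGoal(best)) return best;
    for (const child of expand(best)) {
      // Prune branches that score below the threshold.
      if (child.score >= pruneBelow) frontier.push(child);
    }
  }
  return undefined;
}

// Toy problem: spell a target word one letter at a time.
const targetWord = "plan";
const expandLetters: Expander = (node) =>
  ["a", "l", "n", "p", "x"].map((ch) => {
    const state = node.state + ch;
    // Score by matched prefix length; mismatches get -1 and are pruned.
    const score = targetWord.indexOf(state) === 0 ? state.length : -1;
    return { state, score, depth: node.depth + 1 };
  });

const planResult = bestFirstSearch(
  { state: "", score: 0, depth: 0 },
  expandLetters,
  (n) => n.state === targetWord,
  0,
);
```

Because each node carries only its own sub-problem state, the frontier stays small once weak branches are dropped.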

The Technical Spine

Structured Retrieval-Augmented Generation (Structured RAG)

Structured RAG serves as the system's memory layer. While traditional RAG often falters when asked specific questions like “Which books did we discuss?” or “Where did we leave off on the photo collage?”, Structured RAG retrieves these details with deterministic accuracy. The result is an agent that converses naturally but recalls facts with mechanical precision.
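To make the contrast concrete, the sketch below stores extracted entities as typed records and answers "Which books did we discuss?" with an exact lookup on the type field. The MemoryEntity shape and the StructuredMemory class are assumptions made for illustration; the article does not show TypeAgent's actual memory schema or query API.

```typescript
// Hypothetical record shape; the real TypeAgent memory schema is not shown in this article.
interface MemoryEntity {
  name: string;
  type: string;          // e.g. "book", "project"
  sourceMessage: number; // index of the message the entity was extracted from
}

class StructuredMemory {
  private entities: MemoryEntity[] = [];

  remember(entity: MemoryEntity): void {
    this.entities.push(entity);
  }

  // Exact lookup on a typed field, rather than similarity search over raw text.
  findByType(type: string): string[] {
    const names: string[] = [];
    for (const e of this.entities) {
      if (e.type === type && names.indexOf(e.name) === -1) {
        names.push(e.name); // keep first mention, skip repeats
      }
    }
    return names;
  }
}

const chatMemory = new StructuredMemory();
chatMemory.remember({ name: "Dune", type: "book", sourceMessage: 3 });
chatMemory.remember({ name: "photo collage", type: "project", sourceMessage: 7 });
chatMemory.remember({ name: "Neuromancer", type: "book", sourceMessage: 12 });
chatMemory.remember({ name: "Dune", type: "book", sourceMessage: 15 });

const books = chatMemory.findByType("book"); // ["Dune", "Neuromancer"]
```

The same query always returns the same list, which is the property the section describes; the hard part in practice is the extraction step that produces these records from raw conversation.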

Action-Memory-Plan Integration (AMP)

The AMP architecture integrates actions, memories, and plans into a single continuous loop. For instance, if you add “pickleball Friday 2–3 pm” to your calendar, that action is recorded as a memory. Later, the agent can use that specific memory as a parameter to suggest "scheduling an hour of recovery after the game." This framework has been integrated into a browser extension, allowing websites to register custom actions via JavaScript.
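The pickleball example can be traced through a small loop: an action writes to memory, and a later planning step reads that memory back as a parameter. Everything in this sketch is hypothetical (CalendarEvent, eventLog, addEvent, suggestRecovery); it illustrates the flow only and does not use TypeAgent's calendar agent or browser-extension API.

```typescript
// Hypothetical shapes for illustration only.
interface CalendarEvent {
  title: string;
  day: string;
  startHour: number; // 24-hour clock
  endHour: number;
}

const eventLog: CalendarEvent[] = []; // stands in for the memory layer

// Action: adding an event also records it as a memory.
function addEvent(event: CalendarEvent): void {
  eventLog.push(event);
}

// Plan: look up a remembered event and use it as the parameter for a follow-up suggestion.
function suggestRecovery(title: string): CalendarEvent | undefined {
  let found: CalendarEvent | undefined;
  for (const e of eventLog) {
    if (e.title === title) found = e; // the latest matching event wins
  }
  if (!found) return undefined;
  return {
    title: `recovery after ${found.title}`,
    day: found.day,
    startHour: found.endHour,
    endHour: found.endHour + 1,
  };
}

addEvent({ title: "pickleball", day: "Friday", startHour: 14, endHour: 15 });
const suggestion = suggestRecovery("pickleball");
```

Here the suggestion is a one-hour block on Friday starting at 15:00, derived entirely from the remembered event rather than from re-asking the user.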

Getting TypeAgent Running

TypeAgent is compatible with Windows, WSL2, and Linux (Ubuntu/Debian). macOS support is currently in development.

You will need Node.js 20+ and pnpm installed.

TypeScript Setup

git clone https://github.com/microsoft/TypeAgent
cd TypeAgent/ts

pnpm i

pnpm run build

TypeAgent Shell

The Electron-based GUI supports voice input and includes built-in multi-agent orchestration, Structured RAG, and the TypeAgent Cache.

Launch the shell:

pnpm run shell

TypeAgent CLI

The console interface provides a robust set of debugging commands for developers.

Start an interactive session:

pnpm run cli -- interactive

Service Configuration

Configure your API keys by creating a .env file at the project root.

# Azure OpenAI example
AZURE_OPENAI_API_KEY=your_key
AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com
AZURE_OPENAI_RESPONSE_FORMAT=1

# OpenAI example
OPENAI_API_KEY=your_key
OPENAI_ENDPOINT=https://api.openai.com/v1/chat/completions
OPENAI_MODEL=gpt-4o
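Before launching, it can help to confirm that the .env file actually contains the keys you expect. The helper below is a standalone sketch and not part of TypeAgent: parseEnv and missingKeys are hypothetical names, the parser handles only simple KEY=value lines and # comments, and the list of required keys is taken from the OpenAI example above rather than from any official requirement.

```typescript
// Minimal .env parser: KEY=value lines, blank lines and # comments ignored.
function parseEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip lines without a key
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

// Report which of the expected keys are absent or empty.
function missingKeys(env: Record<string, string>, required: string[]): string[] {
  return required.filter((key) => !env[key]);
}

const sampleEnv = [
  "# OpenAI example",
  "OPENAI_API_KEY=your_key",
  "OPENAI_ENDPOINT=https://api.openai.com/v1/chat/completions",
].join("\n");

const missing = missingKeys(parseEnv(sampleEnv), [
  "OPENAI_API_KEY",
  "OPENAI_ENDPOINT",
  "OPENAI_MODEL",
]); // ["OPENAI_MODEL"]
```

In a real project you would read the file contents from disk and pass them to parseEnv; the sample string here keeps the sketch self-contained.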

Sample Agents

Agent Type         | Functionality                 | Tech Stack
Music Player       | Interfaces with Spotify API   | TypeScript + Node.js
Calendar / Email   | Connects to Microsoft Graph   | C# .NET
Browser Extension  | Registers webpage actions     | JavaScript
VS Code Plugin     | Assists with code operations  | TypeScript
Photo Collage      | Performs image compositing    | Python + OpenCV