MemoryOS: Equip AI Agents with Persistent Recall via a Memory Hierarchy

Published June 14 in AI Agent Tools

MemoryOS is a memory management system designed for personalized AI agents. It keeps interactions coherent, contextually aware, and tailored to the individual user. The architecture is modeled after the hierarchical memory management found in traditional operating systems. Four primary modules (storage, update, retrieval, and generation) oversee the entire memory lifecycle. On the LoCoMo benchmark, MemoryOS improved F1 scores by an average of 49.11% and BLEU-1 scores by 46.18% over baseline methods.

The objective is to provide AI agents with a robust recall mechanism that mimics an OS, allowing them to retain interaction histories, maintain detailed user profiles, and build evolving knowledge structures.

Architecture Overview

MemoryOS functions through four interconnected modules:

Storage Module: Responsible for persisting data across the various memory tiers.
Update Module: Refreshes stored information and handles the promotion of data between different levels of the hierarchy.
Retrieval Module: Extracts the most relevant context from each specific memory layer based on the current query.
Generation Module: Synthesizes the retrieved context to produce the final agent response.
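
To make the retrieval and generation hand-off concrete, here is a minimal sketch. It is not the MemoryOS implementation: the word-overlap scoring, the function names, and the layer names are all assumptions chosen for illustration, and a real system would likely use a stronger relevance measure.

```python
from typing import Dict, List


def score(query: str, text: str) -> int:
    """Count distinct query words that also appear in the text (toy relevance)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words)


def retrieve(query: str, layers: Dict[str, List[str]], top_k: int = 1) -> Dict[str, List[str]]:
    """Pick the top_k best-matching entries from each layer independently."""
    selected = {}
    for name, entries in layers.items():
        ranked = sorted(entries, key=lambda entry: score(query, entry), reverse=True)
        # Keep only entries that share at least one word with the query.
        selected[name] = [entry for entry in ranked[:top_k] if score(query, entry) > 0]
    return selected


def build_prompt(query: str, context: Dict[str, List[str]]) -> str:
    """Assemble retrieved context and the question into one prompt string."""
    lines = []
    for name, entries in context.items():
        for entry in entries:
            lines.append(f"[{name}] {entry}")
    lines.append(f"Question: {query}")
    return "\n".join(lines)


# Hypothetical layer contents, for illustration only.
layers = {
    "short_term": ["user said the weather is nice today"],
    "mid_term": ["user discussed a data pipeline project last week"],
    "long_term": ["user works as a data scientist"],
}
query = "what data work does the user do"
context = retrieve(query, layers)
prompt = build_prompt(query, context)
print(prompt)
```

The prompt string would then be sent to the LLM by the generation step; that call is omitted here so the sketch runs without an API key.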

Three-Tier Memory Structure

  1. Short-Term Memory (STM): Captures and holds the most recent interactions. The capacity for this layer is fully configurable.
  2. Mid-Term Memory (MTM): Consolidates and summarizes data from short-term memory. A "heat" threshold governs this layer, determining which information stays accessible and which is analyzed for promotion to long-term memory.
  3. Long-Term Personal Memory (LPM): Houses permanent user profiles and durable knowledge. This layer is built for persistence across sessions.
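
One way to picture the three tiers is as three containers with different shapes. The sketch below is an assumption-laden illustration, not the library's internal layout: the class names, field names, and the integer heat counter are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Tuple


@dataclass
class Segment:
    """A mid-term summary plus a heat counter (hypothetical representation)."""
    summary: str
    heat: int = 0


@dataclass
class TieredMemory:
    """Three containers, one per tier; names and types are illustrative."""
    stm_capacity: int = 7
    short_term: Deque[Tuple[str, str]] = field(default_factory=deque)  # recent QA pairs
    mid_term: List[Segment] = field(default_factory=list)              # summaries with heat
    long_term: Dict[str, str] = field(default_factory=dict)            # durable profile facts

    def stm_is_full(self) -> bool:
        """True once short-term memory has reached its configured capacity."""
        return len(self.short_term) >= self.stm_capacity

    def hot_segments(self, threshold: int) -> List[Segment]:
        """Mid-term segments whose heat meets or exceeds the threshold."""
        return [segment for segment in self.mid_term if segment.heat >= threshold]


memory = TieredMemory(stm_capacity=2)
memory.short_term.append(("Hi, I'm Tom.", "Hello Tom!"))
memory.short_term.append(("I live in San Francisco.", "Great city!"))
memory.mid_term.append(Segment("Tom introduced himself", heat=6))
memory.mid_term.append(Segment("Small talk about weather", heat=1))
memory.long_term["occupation"] = "data scientist"

print(memory.stm_is_full())
print([segment.summary for segment in memory.hot_segments(threshold=5)])
```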

How MemoryOS Works

  1. Initialization: Create a MemoryOS instance by supplying a user ID and an API key and setting the remaining parameters.
  2. Memory Ingestion: User inputs and the corresponding agent responses are stored in short-term memory as QA pairs.
  3. STM to MTM Migration: Once short-term memory reaches its capacity, the update module merges and compresses the content into mid-term memory.
  4. Analysis and LPM Promotion: Mid-term memory evaluates stored content against the heat threshold. Relevant user profile details and significant knowledge updates are then transitioned into long-term memory.
  5. Response Generation: When a user asks a question, the retrieval module pulls context from all layers and provides it to the LLM to generate an informed answer.
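
The steps above can be sketched as a toy class. This is a simplified illustration under stated assumptions, not the real update logic: joining strings stands in for LLM summarization, heat is a plain hit counter, promoted entries are removed from mid-term memory, and every name here (`ToyMemoryFlow`, `touch`, `gather_context`) is hypothetical.

```python
from typing import Dict, List, Tuple


class ToyMemoryFlow:
    """Illustrative flow only; summarization and heat are crude stand-ins."""

    def __init__(self, short_term_capacity: int = 2, heat_threshold: int = 2) -> None:
        self.short_term_capacity = short_term_capacity
        self.heat_threshold = heat_threshold
        self.short_term: List[Tuple[str, str]] = []
        self.mid_term: List[Dict] = []
        self.long_term: List[str] = []

    def add_memory(self, user_input: str, agent_response: str) -> None:
        # Step 2: store the QA pair in short-term memory.
        self.short_term.append((user_input, agent_response))
        # Step 3: when STM is full, merge its content into one mid-term entry.
        if len(self.short_term) >= self.short_term_capacity:
            merged = " | ".join(question for question, _ in self.short_term)
            self.mid_term.append({"summary": merged, "heat": 0})
            self.short_term.clear()

    def touch(self, keyword: str) -> None:
        # Step 4: each hit raises heat; entries reaching the threshold are promoted.
        for entry in list(self.mid_term):
            if keyword.lower() in entry["summary"].lower():
                entry["heat"] += 1
                if entry["heat"] >= self.heat_threshold:
                    self.long_term.append(entry["summary"])
                    self.mid_term.remove(entry)

    def gather_context(self) -> List[str]:
        # Step 5: collect context from all three layers for the LLM prompt.
        recent = [question for question, _ in self.short_term]
        summaries = [entry["summary"] for entry in self.mid_term]
        return self.long_term + summaries + recent


flow = ToyMemoryFlow(short_term_capacity=2, heat_threshold=2)
flow.add_memory("I'm Tom, a data scientist.", "Nice to meet you, Tom!")
flow.add_memory("I live in San Francisco.", "Great city!")  # STM full: consolidated
flow.add_memory("I like hiking.", "Sounds fun!")            # stays in STM
flow.touch("data scientist")  # heat becomes 1
flow.touch("data scientist")  # heat becomes 2: promoted to long-term
print(flow.gather_context())
```

The small capacity and threshold values are chosen only so that consolidation and promotion both trigger within a few calls.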

MemoryOS Project Structure

memoryos/
├── __init__.py            # Package initializer
├── __pycache__/           # Python cache (auto-generated)
├── long_term.py           # Manages long-term user profile storage
├── memoryos.py            # Core class that coordinates all system components
├── mid_term.py            # Manages mid-term memory and STM consolidation
├── prompts.py             # Prompt templates for LLM-based memory processing
├── retriever.py           # Logic for retrieving info across all memory layers
├── short_term.py          # Manages short-term memory for recent interactions
├── updater.py             # Handles memory updates and tier promotion logic
└── utils.py               # Shared utility functions and helpers

Installation and Usage

MemoryOS requires Python 3.10 or newer.

Installation Steps

  1. Create and activate a Conda environment:
conda create -n MemoryOS python=3.10
conda activate MemoryOS
  2. Install the MemoryOS package from PyPI:
pip install -i https://pypi.org/simple/ MemoryOS-BaiJia
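
After installation, a quick check can confirm that the package is visible to the active interpreter. The sketch below uses only the standard library; the helper name `is_importable` is hypothetical, and the module name `memoryos` is taken from the import statement in the usage script.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Reports False if the Conda environment is not active or the install failed.
print("memoryos importable:", is_importable("memoryos"))
```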

Basic Usage Example

The following script initializes the system, stores one exchange, and then queries it:

from memoryos import Memoryos

# Basic configuration
USER_ID = "demo_user"
ASSISTANT_ID = "demo_assistant"
API_KEY = "YOUR_OPENAI_API_KEY"  # Replace with your actual API key
BASE_URL = ""  # Optional: set if using a custom OpenAI proxy or endpoint
DATA_STORAGE_PATH = "./simple_demo_data"
LLM_MODEL = "gpt-4o-mini"

def simple_demo():
    print("MemoryOS Simple Demo")
    
    # 1. Initialize MemoryOS
    print("Initializing MemoryOS...")
    try:
        memo = Memoryos(
            user_id=USER_ID,
            openai_api_key=API_KEY,
            openai_base_url=BASE_URL,
            data_storage_path=DATA_STORAGE_PATH,
            llm_model=LLM_MODEL,
            assistant_id=ASSISTANT_ID,
            short_term_capacity=7,
            mid_term_heat_threshold=5,
            retrieval_queue_capacity=7,
            long_term_knowledge_capacity=100
        )
        print("MemoryOS initialized.\n")
    except Exception as e:
        print(f"Error: {e}")
        return

    # 2. Add memories to the system
    print("Adding memories...")
    
    memo.add_memory(
        user_input="Hi! I'm Tom, I work as a data scientist in San Francisco.",
        agent_response="Hello Tom! Nice to meet you. Data science is such an exciting field. What kind of data do you work with?"
    )
     
    # 3. Test memory recall
    test_query = "What do you remember about my job?"
    print(f"User: {test_query}")
    
    response = memo.get_response(
        query=test_query,
    )
    
    print(f"Agent: {response}")

if __name__ == "__main__":
    simple_demo()
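
The demo stores a single exchange, which is well below the `short_term_capacity=7` it configures. To exercise consolidation into mid-term memory, you would need to add several more exchanges. The helper below is a hedged sketch of one way to do that: `replay_turns` and `RecordingMemo` are hypothetical names, and the only assumption about the real object is that it accepts `add_memory(user_input=..., agent_response=...)` as in the script above. The stand-in class lets the sketch run without an API key.

```python
from typing import Iterable, Tuple


def replay_turns(memo, turns: Iterable[Tuple[str, str]]) -> int:
    """Feed (user_input, agent_response) pairs to memo.add_memory; return the count."""
    count = 0
    for user_input, agent_response in turns:
        memo.add_memory(user_input=user_input, agent_response=agent_response)
        count += 1
    return count


class RecordingMemo:
    """Stand-in with the same add_memory keyword arguments, for a dry run."""

    def __init__(self) -> None:
        self.pairs = []

    def add_memory(self, user_input: str, agent_response: str) -> None:
        self.pairs.append((user_input, agent_response))


# Seven illustrative exchanges, matching the demo's short-term capacity.
turns = [(f"Fact {i} about my work.", f"Noted fact {i}.") for i in range(1, 8)]
stub = RecordingMemo()
added = replay_turns(stub, turns)
print(f"Replayed {added} exchanges")
```

In a real run, you would pass the `memo` object created in `simple_demo()` instead of the stand-in.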