DeerFlow: Modular Multi-Agent Research With LangGraph and MCP

May 10 · Published in Deep Learning

DeerFlow connects large language models to practical tools, including web search, web crawling, and Python execution. Built on a modular multi-agent architecture using LangGraph, the system automates multi-step research and code analysis.

What DeerFlow does

DeerFlow provides broad model support via litellm, making it compatible with various providers and open-source models like Qwen. It features an OpenAI-compatible API and employs a tiered LLM system to match task complexity with the appropriate model.
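As a rough illustration of the tiered idea, the sketch below maps each agent role to a model tier and resolves the tier to a model name. The tier names, role assignments, and model identifiers are all hypothetical placeholders, not DeerFlow's actual configuration keys.

```python
# Hypothetical tiers and model identifiers, for illustration only.
MODEL_TIERS = {
    "basic": "example-small-model",
    "reasoning": "example-large-model",
}

# Hypothetical assignment: only planning is routed to the stronger tier.
AGENT_TIER = {
    "coordinator": "basic",
    "planner": "reasoning",
    "researcher": "basic",
    "coder": "basic",
    "reporter": "basic",
}


def model_for(agent: str) -> str:
    """Return the model name for an agent, falling back to the basic tier."""
    tier = AGENT_TIER.get(agent, "basic")
    return MODEL_TIERS[tier]
```

Keeping the role-to-tier mapping separate from the tier-to-model mapping means a model can be swapped for a whole tier in one place.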

Tools and MCP integration

  1. Search and retrieval: The framework utilizes Tavily, Brave Search, and other specialized services for web discovery. Jina serves as the primary tool for deep content extraction and web crawling.
  2. MCP integration: Support for the Model Context Protocol (MCP) allows the system to access private domains, knowledge graphs, and browsing tools. This modularity ensures that new research methods can be integrated seamlessly.

Human-in-the-loop

  1. Plan editing: Users can modify research plans using natural language. The system allows for automatic plan acceptance or interactive manual adjustments.
  2. Post-report editing: The platform includes a block-style editor powered by tiptap, offering an experience similar to Notion. Users can use AI to polish, shorten, or expand specific sections of the report.

Content creation

Beyond reports, DeerFlow can generate podcast scripts and synthesize audio. It also supports the creation of customized PowerPoint decks derived from templates.

Getting started

DeerFlow is built with a Python backend and a Node.js web frontend. The following tools are recommended to streamline the installation process:

  • uv: Manages Python environments and dependencies automatically, eliminating manual setup.
  • nvm: Manages multiple Node.js versions.
  • pnpm: Handles Node dependency installation and management.

System requirements:

  • Python 3.12+
  • Node.js 22+
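A quick way to confirm these minimums before installing is a small check like the one below. It is only a sketch: the helper names are made up here, and it assumes `node --version` prints a string of the form `v22.3.0`.

```python
import sys

MIN_PYTHON = (3, 12)
MIN_NODE_MAJOR = 22


def python_ok(version_info=sys.version_info) -> bool:
    """True if the interpreter's (major, minor) meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def node_ok(version_string: str) -> bool:
    """True if a `node --version` style string (e.g. "v22.3.0") meets the minimum."""
    major = int(version_string.strip().lstrip("v").split(".")[0])
    return major >= MIN_NODE_MAJOR
```

For example, `node_ok("v20.11.1")` returns `False`, which would signal that a newer Node.js should be selected with nvm.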

Installation steps:

  1. Clone the repository:

    git clone https://github.com/bytedance/deer-flow.git
    cd deer-flow
    
  2. Install dependencies:

    uv sync
    
  3. Configure the .env file. Add your API keys for Tavily, Brave Search, and other services. Volcengine TTS credentials should be added here if needed:

    cp .env.example .env
    
  4. Configure conf.yaml for LLM models and API keys:

    cp conf.yaml.example conf.yaml
    
  5. Install marp for PowerPoint generation (the command below uses Homebrew):

    brew install marp-cli
    
  6. (Optional) Install web UI dependencies:

    cd web
    pnpm install
    

Refer to the Configuration Guide for comprehensive details. Ensure all settings are updated before launching the application.

The quickest way to run DeerFlow is through the console interface:

uv run main.py

For a richer, interactive experience, use the web UI. Install the web dependencies first (step 6 above).

On macOS/Linux, start both the backend and frontend in development mode:

./bootstrap.sh -d

On Windows:

bootstrap.bat -d

Navigate to http://localhost:3000 to access the web interface.

Supported search engines

Configure the SEARCH_API variable in your .env file. Available options include:

  • Tavily (default): An AI-optimized search API. Set your TAVILY_API_KEY after signing up at app.tavily.com/home.
  • DuckDuckGo: A privacy-focused search engine that requires no API key.
  • Brave Search: Privacy-oriented search with advanced features. Requires BRAVE_SEARCH_API_KEY; obtain a key at brave.com/search/api/.
  • Arxiv: Specifically targets scientific and academic papers. No API key is required.

Example .env configuration:

# Options: tavily, duckduckgo, brave_search, arxiv
SEARCH_API=tavily
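The selection logic described above can be sketched as a small validator. This is not DeerFlow's own code; the function name and error messages are assumptions, while the option names and key variables come from the list above.

```python
import os

SUPPORTED_ENGINES = {"tavily", "duckduckgo", "brave_search", "arxiv"}

# Engines that need an API key, and the variable that holds it.
REQUIRED_KEYS = {
    "tavily": "TAVILY_API_KEY",
    "brave_search": "BRAVE_SEARCH_API_KEY",
}


def resolve_search_engine(env=None) -> str:
    """Return the configured engine name, defaulting to tavily."""
    env = os.environ if env is None else env
    engine = env.get("SEARCH_API", "tavily").strip().lower()
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"Unsupported SEARCH_API value: {engine!r}")
    key_name = REQUIRED_KEYS.get(engine)
    if key_name and not env.get(key_name):
        raise ValueError(f"{key_name} must be set when SEARCH_API={engine}")
    return engine
```

Passing a plain dictionary instead of the real environment makes the check easy to try out, for example `resolve_search_engine({"SEARCH_API": "arxiv"})`.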

How DeerFlow is built

DeerFlow utilizes a modular multi-agent architecture with LangGraph providing the foundation for its state-based workflows. System components communicate through a structured messaging protocol.

The workflow components:

Coordinator: This is the entry point for the workflow lifecycle. It initiates research processes based on user input and hands tasks to the Planner. It serves as the primary interface between the user and the system.

Planner: This component decomposes high-level goals into structured execution plans. It evaluates whether sufficient context has been gathered or if further research is required. It manages the overall research flow and determines the optimal time to generate the final report.

Research Team: A collective of specialized agents. Researchers utilize web search, crawlers, and MCP services to collect data. The coding agent uses Python REPL tools for code analysis and execution. Each agent operates within LangGraph and has access to a specific suite of tools.

Report Generator: In the final stage, this component compiles findings from the Research Team. It processes and organizes the gathered information to output a comprehensive, structured report.
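The hand-offs between these components can be pictured as a small state machine. The sketch below uses plain Python rather than the LangGraph API, and every name in it (the `State` fields, the node functions, the stubbed findings) is a hypothetical stand-in. Each node reads and updates shared state, then returns the name of the next node.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    query: str
    plan: list[str] = field(default_factory=list)
    findings: list[str] = field(default_factory=list)
    report: str = ""


def coordinator(state: State) -> str:
    # Entry point: accept the user's query and hand off to the planner.
    return "planner"


def planner(state: State) -> str:
    # Build a plan once, then decide whether more research is needed.
    if not state.plan:
        state.plan = [f"search: {state.query}", f"analyze: {state.query}"]
    enough_context = len(state.findings) >= len(state.plan)
    return "reporter" if enough_context else "research_team"


def research_team(state: State) -> str:
    # Stub: a real agent would call search, crawl, or Python tools here.
    step = state.plan[len(state.findings)]
    state.findings.append(f"result for [{step}]")
    return "planner"


def reporter(state: State) -> str:
    # Compile the collected findings into the final report.
    state.report = "\n".join([f"Report: {state.query}", *state.findings])
    return "end"


NODES = {
    "coordinator": coordinator,
    "planner": planner,
    "research_team": research_team,
    "reporter": reporter,
}


def run(query: str, max_steps: int = 20) -> State:
    state, node = State(query), "coordinator"
    for _ in range(max_steps):
        if node == "end":
            return state
        node = NODES[node](state)
    raise RuntimeError("workflow did not finish within max_steps")
```

The loop between the planner and the research team mirrors the Planner's role in judging whether enough context has been gathered. The `max_steps` guard is an added safety measure for the sketch, not something the description above specifies.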

Text-to-speech integration

DeerFlow integrates Text-to-Speech (TTS) capabilities via the Volcengine API, allowing users to generate high-quality audio from research reports. Parameters such as speed, volume, and pitch are fully customizable.

To use the TTS feature, call the /api/tts endpoint:

curl --location 'http://localhost:8000/api/tts' \
--header 'Content-Type: application/json' \
--data '{
    "text": "This is a test of the text-to-speech functionality.",
    "speed_ratio": 1.0,
    "volume_ratio": 1.0,
    "pitch_ratio": 1.0
}' \
--output speech.mp3
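The same request can be made from Python with only the standard library. This is a minimal sketch that mirrors the curl call above. The helper names are made up here, and it assumes the endpoint returns raw audio bytes in the response body, as the `--output speech.mp3` flag suggests.

```python
import json
import urllib.request


def build_tts_request(text, base_url="http://localhost:8000",
                      speed_ratio=1.0, volume_ratio=1.0, pitch_ratio=1.0):
    """Build a POST request for the /api/tts endpoint without sending it."""
    payload = {
        "text": text,
        "speed_ratio": speed_ratio,
        "volume_ratio": volume_ratio,
        "pitch_ratio": pitch_ratio,
    }
    return urllib.request.Request(
        f"{base_url}/api/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, path="speech.mp3", **options):
    """Send the request and write the response body to a file."""
    request = build_tts_request(text, **options)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

With the backend running, `save_speech("This is a test.", speed_ratio=1.2)` would write the audio to speech.mp3 in the current directory.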