Open Deep Research: Customizable AI Agents for Automated Report Generation

July 28, published in AI Agent Tools

Open Deep Research is an open-source framework designed for comprehensive, automated information gathering. Unlike closed systems, it does not restrict you to a single model or search tool; instead, it offers a fully configurable environment. The system orchestrates multiple AI providers, search APIs, and Model Context Protocol (MCP) servers to conduct research in parallel. Through a visual web interface, users can manage ongoing tasks and refine data. You maintain complete control over the process: you select the search APIs, set concurrency levels, and adjust iteration depth.

The framework relies on a suite of specialized models for summarization, research, compression, and report generation. To function effectively, these models must support structured output, tool calling, and integration with your chosen search API. Open Deep Research accommodates local files and remote MCP servers, includes a dedicated batch evaluation suite, and provides several deployment paths: LangGraph Studio, LangGraph Platform, or the Open Agent Platform. It also includes two legacy automation strategies for backward compatibility.

Installation and Setup

  1. Clone the repository

    git clone https://github.com/langchain-ai/open_deep_research.git
    cd open_deep_research
    
  2. Configure environment variables

    Copy the example environment file and define your specific models, search tools, and API credentials.

    cp .env.example .env
    
  3. Initialize the assistant

    Launch the LangGraph server locally. Your browser should open the interface automatically.

    macOS

    # Install the uv package manager
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
    # Install dependencies and launch the LangGraph server
    uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.11 langgraph dev --allow-blocking
    

    Windows / Linux

    # Install dependencies
    pip install -e .
    pip install -U "langgraph-cli[inmem]"
    
    # Start the LangGraph server
    langgraph dev
    
  4. Access the Studio UI

    • 🚀 API Endpoint: http://127.0.0.1:2024
    • 🎨 Studio Interface: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
    • 📚 API Documentation: http://127.0.0.1:2024/docs
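
Before starting the server in step 3, it can help to confirm that the .env file from step 2 actually contains the credentials your configuration needs. The following is a minimal sketch, not part of the project: the key names ANTHROPIC_API_KEY and TAVILY_API_KEY are assumptions based on the default Anthropic models and Tavily search, so check .env.example for the names your setup uses. The parser is deliberately simplified and handles only plain KEY=VALUE lines.

```python
from pathlib import Path

# Assumed key names; adjust them to match your own .env.example.
REQUIRED_KEYS = ["ANTHROPIC_API_KEY", "TAVILY_API_KEY"]


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required):
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env_text(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found; run `cp .env.example .env` first.")
    else:
        missing = missing_keys(env_path.read_text(), REQUIRED_KEYS)
        print("Missing:", missing if missing else "none")
```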

How It Works: Two Approaches

Multi-Agent Architecture

  1. Provide a topic. The agent immediately begins synthesizing data and generating a report.
  2. The final output is delivered as a formatted Markdown file.

Workflow Architecture

  1. Enter your research topic.
  2. The system generates a comprehensive report plan for your review.
  3. You can provide text feedback to revise or pivot the plan.
  4. To approve, type true in the Studio JSON input box.
  5. Following approval, the system systematically writes each section of the report.
  6. The completed report is presented in Markdown format.
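
The review step in this workflow amounts to a small branch: a literal true approves the plan, while any text sends it back for revision. The sketch below illustrates that decision only. The function name and the step labels "write_sections" and "revise_plan" are hypothetical and are not taken from the project's code.

```python
def interpret_plan_feedback(feedback):
    """Map the reviewer's input to the next workflow step.

    True              -> approve the plan and start writing sections
    non-empty string  -> regenerate the plan using the text as feedback
    """
    if feedback is True:
        return ("write_sections", None)
    if isinstance(feedback, str) and feedback.strip():
        return ("revise_plan", feedback.strip())
    raise ValueError("Expected True to approve, or a non-empty string of feedback")


if __name__ == "__main__":
    print(interpret_plan_feedback("Add a section on pricing"))
    print(interpret_plan_feedback(True))
```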

Search Tools and Compatible Models

Supported Search APIs

  • Tavily API – Optimized for general web search.
  • Perplexity API – Conversational web search integration.
  • Exa API – Neural search designed for high-quality web content.
  • ArXiv – Access to academic papers in physics, mathematics, and computer science.
  • PubMed – Access to biomedical literature and life sciences journals.
  • Linkup API – Broad web search capabilities.
  • DuckDuckGo API – Privacy-focused web search.
  • Google Search API / Scraper – Integration requiring a Custom Search Engine (CSE) and API key.
  • Microsoft Azure AI Search – Integration for cloud-based vector databases.

Compatible LLMs

The framework is compatible with any model supported by the init_chat_model() API.
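
The configuration in this article names models in two ways: as separate provider and model values (for example anthropic and claude-3-7-sonnet-latest) and as a combined string such as openai:o3. The sketch below shows one way to normalize both forms before handing them to LangChain's init_chat_model. The helper names split_model_spec and load_model are hypothetical, and load_model assumes that the langchain package, the matching provider integration, and an API key are available.

```python
def split_model_spec(spec, default_provider=None):
    """Split 'provider:model' into (provider, model).

    A bare model name falls back to default_provider. Only the first colon
    is treated as the separator, so model names may contain colons.
    """
    provider, sep, model = spec.partition(":")
    if sep:
        if not provider or not model:
            raise ValueError(f"Malformed model spec: {spec!r}")
        return provider, model
    if default_provider is None:
        raise ValueError(f"No provider given for model {spec!r}")
    return default_provider, spec


def load_model(spec, default_provider=None):
    """Instantiate a chat model (requires langchain and provider credentials)."""
    # Imported lazily so the helper above works without langchain installed.
    from langchain.chat_models import init_chat_model

    provider, model = split_model_spec(spec, default_provider)
    return init_chat_model(model, model_provider=provider)


if __name__ == "__main__":
    print(split_model_spec("openai:o3"))
    print(split_model_spec("claude-3-7-sonnet-latest", default_provider="anthropic"))
```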

Two Implementations in Detail

Open Deep Research provides two distinct methodologies for report generation.

1. Graph-Based Workflow (src/open_deep_research/graph.py)

This implementation utilizes a structured plan-and-execute logic.

  • Planning phase: A designated planner model analyzes the topic to construct a detailed report outline.
  • Human-in-the-loop: This stage allows users to inspect and authorize the plan before the research agent executes it.
  • Sequential research: The agent reflects on search results between iterations, writing sections progressively.
  • Section-specific research: Each individual section triggers its own targeted search queries.
  • Broad search tool support: Seamlessly integrates with Tavily, Perplexity, Exa, ArXiv, PubMed, Linkup, and more.

This method gives you more control over the report's structure and content than the multi-agent implementation, and is recommended for projects with strict accuracy or structural requirements.
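
The search-and-reflect cycle for a single section can be pictured as a bounded loop. The sketch below is an illustration under simplifying assumptions, not the project's implementation: the callables make_queries, search, write, and is_sufficient are hypothetical stand-ins for the model and search calls, and the two numeric arguments mirror the number_of_queries and max_search_depth parameters.

```python
from typing import Callable, List, Tuple


def research_section(
    topic: str,
    make_queries: Callable[[str, int], List[str]],
    search: Callable[[str], str],
    write: Callable[[str, List[str]], str],
    is_sufficient: Callable[[str], bool],
    number_of_queries: int = 2,
    max_search_depth: int = 2,
) -> Tuple[str, int]:
    """Search, draft, and reflect until the draft passes or the depth limit is hit."""
    sources: List[str] = []
    draft = ""
    iterations_used = 0
    for _ in range(max_search_depth):
        iterations_used += 1
        # Gather fresh material for this round.
        for query in make_queries(topic, number_of_queries):
            sources.append(search(query))
        # Rewrite the section from everything collected so far.
        draft = write(topic, sources)
        # Reflection step: stop early once the draft is judged good enough.
        if is_sufficient(draft):
            break
    return draft, iterations_used


if __name__ == "__main__":
    draft, used = research_section(
        "solid-state batteries",
        make_queries=lambda t, n: [f"{t} query {i}" for i in range(n)],
        search=lambda q: f"result for {q}",
        write=lambda t, s: f"{t}: {len(s)} sources",
        is_sufficient=lambda d: "4 sources" in d,
    )
    print(draft, used)
```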

Customize the workflow using the following parameters:

  • report_structure – Define a custom outline (defaults to standard research format).
  • number_of_queries – Set the number of search queries generated per section (default: 2).
  • max_search_depth – Set the limit for reflection and search iterations (default: 2).
  • planner_provider – Specify the model provider for planning (default: anthropic).
  • planner_model – Choose the specific planning model (default: claude-3-7-sonnet-latest).
  • planner_model_kwargs – Pass additional arguments to the planner.
  • writer_provider – Specify the model provider for writing (default: anthropic).
  • writer_model – Choose the model for generating the report text (default: claude-3-5-sonnet-latest).
  • writer_model_kwargs – Pass additional arguments to the writer.
  • search_api – Select the primary web search API (default: tavily).
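
To keep these parameters and their defaults in one place, you could collect them in a small dataclass and convert it into the "configurable" dictionary that a run expects. This is a sketch: the WorkflowConfig class and its to_thread method are hypothetical helpers, while the field names and default values are the ones listed above.

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class WorkflowConfig:
    """Hypothetical container for the workflow parameters and their defaults."""

    report_structure: Optional[str] = None  # None keeps the framework's default outline
    number_of_queries: int = 2
    max_search_depth: int = 2
    planner_provider: str = "anthropic"
    planner_model: str = "claude-3-7-sonnet-latest"
    planner_model_kwargs: Dict[str, Any] = field(default_factory=dict)
    writer_provider: str = "anthropic"
    writer_model: str = "claude-3-5-sonnet-latest"
    writer_model_kwargs: Dict[str, Any] = field(default_factory=dict)
    search_api: str = "tavily"

    def to_thread(self) -> Dict[str, Any]:
        """Build a run configuration, omitting unset (None or empty dict) values."""
        configurable = {
            key: value
            for key, value in asdict(self).items()
            if value is not None and value != {}
        }
        configurable["thread_id"] = str(uuid.uuid4())
        return {"configurable": configurable}


if __name__ == "__main__":
    print(WorkflowConfig(search_api="exa", max_search_depth=3).to_thread())
```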

2. Multi-Agent Implementation (src/open_deep_research/multi_agent.py)

This version utilizes a supervisor-researcher hierarchy.

  • Supervisor agent: Orchestrates the process, plans sections, and compiles the final report.
  • Researcher agents: Multiple independent agents operate in parallel, each responsible for a single section.
  • Parallel processing: Simultaneous research of all sections drastically reduces total generation time.
  • Specialized tools: Researchers are equipped with search tools, while the supervisor focuses on planning.
  • Search limitation: This implementation currently supports Tavily exclusively, with plans for future expansion.

This approach is best suited for time-sensitive tasks where rapid delivery is prioritized over manual oversight.
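
The supervisor-researcher pattern can be sketched with asyncio: the supervisor fans out one coroutine per section and assembles the results once all of them finish. This is an illustration of the concurrency pattern only. The function names are hypothetical, and asyncio.sleep stands in for the real search and model calls.

```python
import asyncio
from typing import List


async def research_section(title: str) -> str:
    """Hypothetical researcher: the sleep stands in for search and LLM calls."""
    await asyncio.sleep(0.01)
    return f"## {title}\n(findings for {title})"


async def supervise(topic: str, section_titles: List[str]) -> str:
    """Hypothetical supervisor: run all researchers concurrently, then compile."""
    # gather() returns results in the order the coroutines were passed in,
    # so sections keep their planned order even if they finish out of order.
    sections = await asyncio.gather(
        *(research_section(title) for title in section_titles)
    )
    return f"# {topic}\n\n" + "\n\n".join(sections)


if __name__ == "__main__":
    print(asyncio.run(supervise("Battery Technology", ["Background", "Outlook"])))
```

Because the researchers run concurrently, total wall-clock time in this sketch is close to that of the slowest section rather than the sum of all sections.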

Configuring Search APIs

Search API parameters vary by provider. Below are the supported configurations for specific tools:

Exa: max_characters, num_results, include_domains, exclude_domains, subpages

  • Note: include_domains and exclude_domains are mutually exclusive.
  • Use domain filtering to focus research on authoritative sources, such as government or academic portals.
  • Exa also provides AI-generated summaries for each retrieved result.

ArXiv: load_max_docs, get_full_documents, load_all_available_meta

PubMed: top_k_results, email, api_key, doc_content_chars_max

Linkup: depth

Example Exa configuration snippet:

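import uuid

# Per-run configuration passed to the graph; thread_id identifies this run.
thread = {
    "configurable": {
        "thread_id": str(uuid.uuid4()),
        "search_api": "exa",
        # Exa-specific options (see the parameter list above)
        "search_api_config": {
            "num_results": 5,
            "include_domains": ["nature.com", "sciencedirect.com"],
        },
        # additional configuration...
    }
}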
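
Because the accepted parameters differ by provider, a quick check before starting a run can catch typos and the include/exclude conflict noted for Exa. The sketch below is a hypothetical helper, not part of the framework. It covers only the four providers whose parameters are listed in this section and rejects anything else as unknown.

```python
# Parameter names as listed in this section, keyed by search_api value.
ALLOWED_SEARCH_PARAMS = {
    "exa": {"max_characters", "num_results", "include_domains", "exclude_domains", "subpages"},
    "arxiv": {"load_max_docs", "get_full_documents", "load_all_available_meta"},
    "pubmed": {"top_k_results", "email", "api_key", "doc_content_chars_max"},
    "linkup": {"depth"},
}


def validate_search_config(search_api, config):
    """Return config unchanged if it only uses parameters listed for search_api."""
    allowed = ALLOWED_SEARCH_PARAMS.get(search_api)
    if allowed is None:
        raise ValueError(f"No parameter list known for search API {search_api!r}")
    unknown = sorted(set(config) - allowed)
    if unknown:
        raise ValueError(f"Unsupported {search_api} parameters: {', '.join(unknown)}")
    # Exa accepts an include list or an exclude list, but not both at once.
    if search_api == "exa" and config.get("include_domains") and config.get("exclude_domains"):
        raise ValueError("include_domains and exclude_domains are mutually exclusive")
    return config


if __name__ == "__main__":
    print(validate_search_config("exa", {"num_results": 5, "include_domains": ["nature.com"]}))
```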

Model Selection Notes

  1. Ensure chosen models are compatible with init_chat_model().
  2. Both planner and writer models must be capable of producing structured output.
  3. Agent models must demonstrate high proficiency in tool calling. Recommended models include Claude 3.7, o3, o3-mini, and GPT-4.
  4. Groq users on the on_demand tier should note the 6000 TPM (Tokens Per Minute) limit; generating extensive reports may require a paid tier for higher throughput.
  5. Models like deepseek-R1 may struggle with function calling, a capability essential for structured section generation and scoring within this framework. For best results, use providers known for robust tool integration: OpenAI, Anthropic, or high-capacity open models like Groq's llama-3.3-70b-versatile.

If you encounter the following error, it typically indicates that the model failed to return structured data:

groq.APIError: Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.

  6. OpenRouter integration is supported; refer to their documentation for connection strings.
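
To get a feel for the Groq rate limit mentioned in note 4, a rough lower bound on run time follows from dividing the total tokens a report consumes by the per-minute limit. The sketch below is back-of-the-envelope arithmetic only. The 90,000-token figure is a made-up example rather than a measurement, and real runs also depend on request limits and latency.

```python
import math


def min_minutes_for_tokens(total_tokens, tokens_per_minute=6000):
    """Lower bound on wall-clock minutes imposed by a tokens-per-minute cap."""
    if total_tokens < 0 or tokens_per_minute <= 0:
        raise ValueError("total_tokens must be >= 0 and tokens_per_minute > 0")
    return math.ceil(total_tokens / tokens_per_minute)


if __name__ == "__main__":
    # Hypothetical report consuming 90,000 tokens across all model calls.
    print(min_minutes_for_tokens(90_000))  # 90000 / 6000 = 15 minutes
```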

Testing Report Quality

You can evaluate and compare the output of both implementations using the provided test script.

# Run tests using default Anthropic models
python tests/run_test.py --all

# Run tests using OpenAI o3 models
python tests/run_test.py --all \
  --supervisor-model "openai:o3" \
  --researcher-model "openai:o3" \
  --planner-provider "openai" \
  --planner-model "o3" \
  --writer-provider "openai" \
  --writer-model "o3" \
  --eval-model "openai:o3" \
  --search-api "tavily"

Performance logs are sent to LangSmith, enabling side-by-side quality comparisons between different model configurations.