Tongyi DeepResearch, developed by Tongyi Lab, is an agentic large language model designed for long, in-depth information-seeking tasks. The model has 30.5 billion total parameters, of which 3.3 billion are activated per token. It is trained with a fully automated synthetic data pipeline that covers agent pretraining, supervised fine-tuning, and reinforcement learning. Continual pretraining on large volumes of agent-interaction data keeps the model's knowledge current and strengthens its complex reasoning.
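To make the parameter figures concrete, the short calculation below works out what share of the weights is active for a single token. It uses only the two numbers quoted above; the function name `active_fraction` is a hypothetical helper introduced for illustration.

```python
# Figures quoted in the text above.
TOTAL_PARAMS = 30.5e9   # total parameters
ACTIVE_PARAMS = 3.3e9   # parameters activated per token


def active_fraction(active: float, total: float) -> float:
    """Share of the model's weights that take part in one token's forward pass."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active per token: {fraction:.1%}")  # prints "Active per token: 10.8%"
```

Roughly one ninth of the weights are used per token, which is why the per-token compute is closer to that of a 3B-class model even though the full 30.5B parameters must still be stored.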
Tongyi DeepResearch employs end-to-end reinforcement learning via a custom Group Relative Policy Optimization (GRPO) framework utilizing token-level policy gradients. The model supports two distinct inference modes: ReAct, designed for rigorous evaluation of core capabilities, and IterResearch "Heavy," optimized for extracting peak performance during complex research tasks.
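The core idea behind GRPO with a token-level loss can be sketched in a few lines. This is a generic illustration, not the Tongyi implementation: it assumes group-normalized rewards as advantages and a plain policy-gradient surrogate averaged over all tokens in the group, and it omits the clipped importance ratio and any other customizations. The names `group_relative_advantages` and `token_level_pg_loss` are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each rollout's reward against the other rollouts for the same question."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def token_level_pg_loss(logprobs: List[List[float]], advantages: List[float]) -> float:
    """Average -advantage * log-prob over every token in the group.

    Dividing by the total token count (rather than averaging per rollout
    first) gives each token equal weight, so long rollouts are not diluted.
    """
    total, n_tokens = 0.0, 0
    for token_logprobs, adv in zip(logprobs, advantages):
        for lp in token_logprobs:
            total += -adv * lp
            n_tokens += 1
    return total / n_tokens


# Assumed toy data: four rollouts for one question, reward 1.0 if the answer was correct.
rewards = [1.0, 0.0, 0.0, 1.0]
# Log-probabilities of the sampled tokens in each rollout (rollouts differ in length).
logprobs = [[-0.1, -0.2], [-1.0], [-0.5, -0.5, -0.5], [-0.3]]

advantages = group_relative_advantages(rewards)
loss = token_level_pg_loss(logprobs, advantages)
```

Because the advantages are computed relative to the group, no separate value model is needed: rollouts that beat the group average are reinforced and the rest are discouraged.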
Automated Data Pipeline: Uses fully synthetic data, generated end to end, to drive the agent pretraining, supervised fine-tuning, and reinforcement learning stages.
Large-Scale Continual Pretraining: Leverages diverse agent-interaction data to extend the model's capabilities and refresh its knowledge.
End-to-End Reinforcement Learning: Uses a customized GRPO framework with token-level policy gradients to optimize agent behavior directly.
Dual Reasoning Modes: Offers ReAct mode for evaluating the model's core abilities in isolation, and IterResearch "Heavy" mode for maximum depth and performance.
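For readers unfamiliar with the ReAct pattern mentioned above, the loop below is a minimal sketch of its thought, action, observation cycle. It is not the project's inference code: `react_loop`, `toy_policy`, and `toy_search` are hypothetical stand-ins, and the "policy" is a hard-coded function rather than a language model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (thought, action name, action input) produced by the policy at each step.
Decision = Tuple[str, str, str]


def react_loop(
    question: str,
    policy: Callable[[str, List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Alternate reasoning and tool calls until the policy emits an answer."""
    trace: List[str] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(question, trace)
        trace.append(f"Thought: {thought}")
        if action == "answer":
            return action_input, trace
        observation = tools[action](action_input)
        trace.append(f"Action: {action}[{action_input}]")
        trace.append(f"Observation: {observation}")
    return None, trace  # step budget exhausted without an answer


def toy_search(query: str) -> str:
    """Hypothetical search tool backed by a one-entry corpus."""
    corpus = {"capital of France": "Paris is the capital of France."}
    return corpus.get(query, "no results")


def toy_policy(question: str, trace: List[str]) -> Decision:
    """Hypothetical policy: search once, then answer with the last observation."""
    prefix = "Observation: "
    observations = [t for t in trace if t.startswith(prefix)]
    if not observations:
        return ("I should search for this.", "search", question)
    return ("The observation answers the question.", "answer", observations[-1][len(prefix):])


answer, trace = react_loop("capital of France", toy_policy, {"search": toy_search})
```

The accumulated `trace` is what a real model would see as context on each turn, which is why long research sessions put pressure on the context window.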
Tongyi DeepResearch is reported to achieve leading scores across major agentic search benchmarks:
| Benchmark | Score |
|---|---|
| Humanity's Last Exam | 32.9 |
| BrowseComp | 43.4 |
| BrowseComp-ZH | 46.7 |
| GAIA | 70.9 |
| xbench-DeepSearch | 75.0 |
| WebWalkerQA | 72.2 |
| FRAMES | 90.6 |
Evaluations indicate the model outperforms several leading competitors, including GLM 4.5, DeepSeek V3.1, Claude-4-Sonnet, and OpenAI o3.
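To set up the environment for ReAct inference, create and activate a conda environment, then install the dependencies:

    conda create -n react_infer_env python=3.10.0
    conda activate react_infer_env
    pip install -r requirements.txt
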
Place the evaluation data in the eval_data/ directory. Each entry takes the form {"question": "...", "answer": "..."}.
Edit run_react_infer.sh and set the following variables:
MODEL_PATH: Location of the model weights.
DATASET: Dataset identifier.
OUTPUT_PATH: Directory for results.
Then run the script:

    bash run_react_infer.sh

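The question/answer record shape described above could be produced with a small script like the one below. This is a sketch under assumptions: the text only shows the shape of a single record, so the one-object-per-line (JSON Lines) layout, the file name `example.jsonl`, and the helper `write_eval_file` are all hypothetical choices made for illustration.

```python
import json
from pathlib import Path
from typing import Dict, Iterable

REQUIRED_KEYS = {"question", "answer"}


def write_eval_file(records: Iterable[Dict[str, str]], path: str) -> Path:
    """Validate records and write them one JSON object per line."""
    rows = list(records)
    for i, rec in enumerate(rows):
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            raise ValueError(f"record {i} is missing keys: {sorted(missing)}")
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", encoding="utf-8") as f:
        for rec in rows:
            # ensure_ascii=False keeps non-English questions readable in the file.
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
    return out


# Assumed sample data; replace with real evaluation questions.
samples = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "How many days are in a leap year?", "answer": "366"},
]
out_path = write_eval_file(samples, "eval_data/example.jsonl")
```

Validating the keys before writing means a malformed record fails early instead of surfacing partway through an inference run.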
| Model | Sources | Size | Context Length |
|---|---|---|---|
| Tongyi-DeepResearch-30B-A3B | HuggingFace / ModelScope | 30B-A3B | 128K |
Tongyi DeepResearch serves as the foundation for a broader suite of research agents. This family includes specialized projects such as WebWalker, WebDancer, WebSailor, and WebShaper, which address vision-language tasks, long-horizon reasoning, dynamic outline construction, and other research-intensive applications.