The Crawl4AI RAG MCP Server bridges the Model Context Protocol (MCP) with Crawl4AI and Supabase. This integration provides AI agents and coding assistants with robust web crawling and retrieval-augmented generation (RAG) capabilities. With this server, an agent can navigate a website, ingest its content into a vector database, and perform semantic queries against that stored knowledge.
Core Features
Main Tools
crawl_single_page: Extracts data from a single URL and pushes it directly to the vector store.smart_crawl_url: Analyzes the input type—whether a sitemap, llms-full.txt, or a standard page—and executes an intelligent site-wide crawl.get_available_sources: Retrieves a list of all domains currently indexed within the Supabase database.perform_rag_query: Executes a semantic search across your crawled data. Results can be narrowed down to a specific domain if needed.Prerequisites
uv).1. Docker Deployment (Recommended)
git clone https://github.com/coleam00/mcp-crawl4ai-rag.gitcd mcp-crawl4ai-ragdocker build -t mcp/crawl4ai-rag --build-arg PORT=8051 ..env file following the configuration guidelines below.2. Local Setup with uv
git clone https://github.com/coleam00/mcp-crawl4ai-rag.gitcd mcp-crawl4ai-raguv if it is not already on your system: pip install uvuv venv, then .venv\Scripts\activatesource .venv/bin/activateuv pip install -e ., followed by crawl4ai-setup.env file.Database Configuration
crawled_pages.sql file from the repository.Environment Variables
Create a .env file in the project root with the following variables:
# MCP Server Configuration
HOST=0.0.0.0
PORT=8051
TRANSPORT=sse
# OpenAI Configuration
OPENAI_API_KEY=your-openai-key
# Supabase Configuration
SUPABASE_URL=your-supabase-project-url
SUPABASE_SERVICE_KEY=your-supabase-service-key
Launching the Server
docker run --env-file .env -p 8051:8051 mcp/crawl4ai-raguv run src/crawl4ai_mcp.pyThe server will now listen on the host and port defined in your environment variables.
SSE Configuration
To connect via the Server-Sent Events (SSE) transport, use the following configuration in your client:
{
"mcpServers": {
"crawl4ai-rag": {
"transport": "sse",
"url": "http://localhost:8051/sse"
}
}
}
Note for Windsurf users: Use serverUrl instead of url.
Note for Docker users: If your client is running in a separate container, replace localhost with host.docker.internal.
Stdio Configuration
For integration with Claude Desktop, Windsurf, or other standard MCP clients:
{
"mcpServers": {
"crawl4ai-rag": {
"command": "python",
"args": ["path/to/crawl4ai-mcp/src/crawl4ai_mcp.py"],
"env": {
"TRANSPORT": "stdio",
"OPENAI_API_KEY": "your-openai-key",
"SUPABASE_URL": "your-supabase-url",
"SUPABASE_SERVICE_KEY": "your-supabase-service-key"
}
}
}
}
Docker with Stdio
{
"mcpServers": {
"crawl4ai-rag": {
"command": "docker",
"args": ["run", "--rm", "-i",
"-e", "TRANSPORT",
"-e", "OPENAI_API_KEY",
"-e", "SUPABASE_URL",
"-e", "SUPABASE_SERVICE_KEY",
"mcp/crawl4ai-rag"],
"env": {
"TRANSPORT": "stdio",
"OPENAI_API_KEY": "your-openai-key",
"SUPABASE_URL": "your-supabase-url",
"SUPABASE_SERVICE_KEY": "your-supabase-service-key"
}
}
}
}
This implementation provides a modular foundation for an MCP server with integrated web crawling. You can adapt it to your specific needs by:
@mcp.tool() decorator.utils.py.
DeepSeek-OCR: High-Speed Visual Text Compression That Actually Works
Prompt Tools: Open-Source Desktop App to Stop Losing Your Best AI Prompts
Sunshine Streaming Host Specs: What Hardware You Actually Need
Yank Note Review: A Hackable Markdown Editor That Runs Code
RunAgent: Build AI Agents in Python, Invoke Them Natively from Any Language
Windows-Use: Enabling LLMs to Control the Windows GUI Without Vision Models
Larachat: Build a Real-Time AI Chat App with Laravel and React
TensorZero: Optimize LLM Applications with Production Feedback
OCode: Native AI Coding Assistant for Your Terminal (Ollama)
AgentCPM-GUI: A Local LLM Agent for Navigating Chinese Mobile Apps
ACI.dev: 600+ Tools for AI Agents with Built-In Auth and MCP Support
Shendeng VPN: Two Modes to Speed Up Games and Chinese Apps