Cogency is a Python library for deploying conversational AI agents with minimal overhead. At its core is a transparent ReAct loop that handles multi-step reasoning. Because each reasoning step is streamed as it happens, you can trace every decision in real time.
Developing a tool-enabled agent requires only a few lines of code. Cogency includes built-in persistent memory that integrates with scalable backends like Pinecone, PGVector, or ChromaDB. The library also features automatic tool discovery: simply drop a tool into your project, and the agent handles routing intelligently without manual configuration. It is designed to be environment-aware, automatically detecting available LLMs and tools.
Beyond basic functionality, Cogency allows you to inject specific personalities, support multi-tenant architectures, and connect to major providers including OpenAI, Anthropic, and Gemini. Engineered for production environments, the framework includes native support for retries, rate limiting, and metrics tracking. The architecture remains fully extensible, allowing you to implement custom tools or memory backends without fighting the underlying framework.
Quick start
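import asyncio
from cogency import Agent

async def main():
    agent = Agent("assistant")
    # Stream the agent's output chunk by chunk as it is produced.
    async for chunk in agent.stream("What's the weather in Tokyo?"):
        print(chunk, end="", flush=True)

asyncio.run(main())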
What makes Cogency different
See the agent think
👤 Human: Plan a 3‑day Tokyo itinerary considering weather.
🛠️ Tools: web_search, weather_forecast, travel_info
🧠 Reasoning: Need weather forecast to mix outdoor and indoor activities.
⚡️ Action: weather_forecast("Tokyo 3 days")
👀 Observation: Day1: Sunny 25°C, Day2: Rain 18°C, Day3: Cloudy 22°C
🧠 Reasoning: Rain on day2 affects outdoors, need indoor alternatives.
⚡️ Action: travel_info("Tokyo indoor attractions museums")
👀 Observation: TeamLab, Tokyo National Museum, Asakusa Temple (covered)
🧠 Reasoning: Have weather and indoor options, build full itinerary.
⚡️ Action: Write 3‑day plan integrating weather.
🤖 Agent: Here's your 3‑day Tokyo itinerary...
Examples
Basic agent in a few lines:
import asyncio
from cogency import Agent

async def main():
    agent = Agent("assistant")
    async for chunk in agent.stream("What's 25 times 43?"):
        print(chunk, end="", flush=True)

asyncio.run(main())
Injecting personality:
# Run inside an async function, as in the basic example.
pirate = Agent("pirate", personality="a friendly pirate who loves coding")
async for chunk in pirate.stream("Tell me about AI!"):
    print(chunk, end="", flush=True)

# Tone and style can be set alongside personality.
teacher = Agent("teacher", personality="patient teacher", tone="encouraging", style="conversational")
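One plausible way for personality, tone, and style settings to reach the model is to fold them into a system prompt. The sketch below is an assumption made for illustration only: `build_system_prompt` is a hypothetical helper, not part of Cogency, and the prompt wording is invented.

```python
def build_system_prompt(name, personality=None, tone=None, style=None):
    """Hypothetical helper: fold persona settings into one system prompt."""
    parts = [f"You are {name}."]
    if personality:
        parts.append(f"Personality: {personality}.")
    if tone:
        parts.append(f"Tone: {tone}.")
    if style:
        parts.append(f"Style: {style}.")
    # Settings that were not provided are simply left out of the prompt.
    return " ".join(parts)


print(build_system_prompt("teacher", personality="patient teacher",
                          tone="encouraging", style="conversational"))
```

Keeping the persona in one place like this means the same agent logic can be reused with different voices.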
Multi‑step reasoning:
agent = Agent("travel_planner")
async for chunk in agent.stream("I'm planning a trip to London: what's the weather there? What's the current time? Flight costs $1200, hotel $180 per night for 3 nights – what's the total?"):
    print(chunk, end="", flush=True)
Custom tool with auto‑discovery:
from cogency import Agent, BaseTool
class TimezoneTool(BaseTool):
    def __init__(self):
        super().__init__("timezone", "Get current time in any city")

    async def run(self, city: str):
        # Placeholder result; a real tool would look up the city's time zone.
        return {"time": f"{city} current time: 14:30 PST"}

    def get_schema(self):
        return "timezone(city='string')"
agent = Agent("time_assistant", tools=[TimezoneTool()])
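To make the idea of automatic tool discovery concrete, here is a minimal sketch of one common technique: a base class that registers every subclass as soon as it is defined. This is an illustration under stated assumptions, not Cogency's actual mechanism. `DiscoverableTool`, `EchoTool`, and `route` are hypothetical names, and the tools here are synchronous for brevity.

```python
class DiscoverableTool:
    """Stand-in base class (not Cogency's BaseTool): records every subclass."""
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register under a lowercase name with any "Tool" suffix removed.
        name = cls.__name__
        if name.endswith("Tool"):
            name = name[:-len("Tool")]
        DiscoverableTool.registry[name.lower()] = cls


class TimezoneTool(DiscoverableTool):
    def run(self, city):
        # Placeholder value, mirroring the example above.
        return {"time": f"{city} current time: 14:30 PST"}


class EchoTool(DiscoverableTool):
    def run(self, text):
        return {"result": text}


def route(tool_name, **kwargs):
    """Look up a registered tool by name and call it."""
    return DiscoverableTool.registry[tool_name]().run(**kwargs)


print(sorted(DiscoverableTool.registry))
print(route("echo", text="hi"))
```

With this pattern, defining a new subclass is enough to make it routable; no central list has to be edited.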
Memory backends:
from cogency import Agent, FSMemory
from cogency.memory.backends import ChromaDB, Pinecone, PGVector
agent = Agent("memory_agent", memory=FSMemory())
agent = Agent("vector_agent", memory=ChromaDB())
agent = Agent("cloud_agent", memory=Pinecone(api_key="...", index="my-index"))
ReAct loop architecture
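Cogency employs a transparent ReAct cycle to navigate multi‑step tasks: the agent reasons about the request, acts by calling a tool, observes the result, and repeats until it can answer.
Every step is streamed live, so the whole process is traceable.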
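The cycle can be sketched in plain Python. The code below is a simplified illustration, not Cogency's implementation: `scripted_policy` stands in for the LLM, `weather_forecast` is a stub returning canned data, and `react` is a hypothetical name for the loop.

```python
def weather_forecast(query):
    """Stub tool: returns canned data instead of calling a real API."""
    return "Day1: Sunny 25°C, Day2: Rain 18°C"


TOOLS = {"weather_forecast": weather_forecast}


def scripted_policy(question, observations):
    """Stand-in for the LLM: pick the next step from what is known so far."""
    if not observations:
        return {"thought": "Need the forecast first.",
                "action": "weather_forecast", "input": question}
    return {"thought": "Have enough information.",
            "answer": f"Plan around: {observations[-1]}"}


def react(question, policy, tools, max_steps=5):
    """Yield (kind, text) events for each reason/act/observe step."""
    observations = []
    for _ in range(max_steps):
        step = policy(question, observations)
        yield ("reasoning", step["thought"])
        if "answer" in step:
            yield ("answer", step["answer"])
            return
        yield ("action", f'{step["action"]}({step["input"]!r})')
        result = tools[step["action"]](step["input"])
        observations.append(result)
        yield ("observation", result)
    # Safety net so a confused policy cannot loop forever.
    yield ("answer", "Stopped: step limit reached.")


for kind, text in react("Tokyo 2 days", scripted_policy, TOOLS):
    print(f"{kind}: {text}")
```

Because `react` is a generator, each event is available to the caller as soon as it is produced, which is what makes the trace streamable.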
Installation
pip install cogency
echo "OPENAI_API_KEY=sk-..." >> .env # or your preferred provider
Full installation with all dependencies:
pip install cogency[all]
Selective installations:
# LLM providers
pip install cogency[openai] # OpenAI GPT
pip install cogency[anthropic] # Claude
pip install cogency[gemini] # Google Gemini
pip install cogency[mistral] # Mistral AI
# Memory backends
pip install cogency[chromadb] # ChromaDB
pip install cogency[pgvector] # PostgreSQL with pgvector
pip install cogency[pinecone] # Pinecone
# Embedders
pip install cogency[sentence-transformers] # Local embeddings
pip install cogency[nomic] # Nomic
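Environment-aware provider detection can be pictured as a check for API-key variables. This is a rough sketch under assumptions: only `OPENAI_API_KEY` appears in this document, the other variable names follow each provider's usual convention, and `detect_providers` is a hypothetical helper rather than a Cogency function.

```python
import os

# Assumed variable names; only OPENAI_API_KEY is confirmed by the text above.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}


def detect_providers(env=None):
    """Return providers whose API-key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]


# Pass a dict for a deterministic demo; omit it to inspect the real environment.
print(detect_providers({"OPENAI_API_KEY": "sk-...", "MISTRAL_API_KEY": ""}))
```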
Output modes
Summary mode (returns final result):
result = await agent.run("What's 15 times 23?")
print(result) # "345"
Streaming mode (full transparency):
async for chunk in agent.stream("What's 15 times 23?"):
    print(chunk, end="", flush=True)
# 👤 Human: What's 15 times 23?
# 🧠 Reasoning: Need calculator for 15 * 23
# ⚡️ Action: calculator(expression="15 * 23")
# 👀 Observation: Result: 345
# 🤖 Agent: The answer is 345
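The two modes can be related to each other: a summary call can be thought of as draining the stream and keeping only the final answer. The sketch below assumes that relationship for illustration; `fake_stream` and `run_summary` are hypothetical stand-ins, and nothing here calls Cogency.

```python
import asyncio


async def fake_stream(prompt):
    """Stand-in for a streaming agent: yields trace lines, then the answer."""
    yield "Reasoning: Need calculator for 15 * 23\n"
    yield "Observation: Result: 345\n"
    yield "Agent: 345"


async def run_summary(prompt):
    """Summary mode built on the stream: keep only the final chunk."""
    last = ""
    async for chunk in fake_stream(prompt):
        last = chunk
    # Strip the "Agent: " label so only the answer text remains.
    return last.split("Agent: ", 1)[-1]


print(asyncio.run(run_summary("What's 15 times 23?")))
```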
Multi‑tenancy
await agent.run("Remember my favorite color is blue", user_id="user1")
await agent.run("What's my favorite color?", user_id="user1") # Returns "blue"
await agent.run("What's my favorite color?", user_id="user2") # No memory found
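The isolation shown above can be illustrated by keying stored facts on `user_id`. This is a minimal sketch, assuming a simple in-process store: `TenantMemory` is a hypothetical class, and the word-overlap matching is a deliberately naive stand-in for real retrieval.

```python
from collections import defaultdict


class TenantMemory:
    """Hypothetical per-user store: each user_id gets its own list of facts."""

    def __init__(self):
        self._facts = defaultdict(list)

    def memorize(self, content, user_id):
        self._facts[user_id].append(content)

    def recall(self, query, user_id):
        # Naive match: return facts sharing at least one word with the query.
        words = set(query.lower().split())
        return [fact for fact in self._facts.get(user_id, [])
                if words & set(fact.lower().split())]


memory = TenantMemory()
memory.memorize("my favorite color is blue", user_id="user1")
print(memory.recall("What's my favorite color?", user_id="user1"))
print(memory.recall("What's my favorite color?", user_id="user2"))
```

Since every read and write goes through a `user_id` key, one tenant's facts are never visible to another.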
Supported services
Extensibility
@tool
class MyTool(BaseTool):
    async def run(self, param: str):
        return {"result": f"Processed: {param}"}

class MyMemory(MemoryBackend):
    async def memorize(self, content: str): pass
    async def recall(self, query: str): pass
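To show what a filled-in backend might look like, here is a small in-memory sketch. It defines its own stand-in `MemoryBackend` abstract class so that it runs without Cogency installed; the real base class may differ, and `ListMemory` with its substring search is purely illustrative.

```python
import asyncio
from abc import ABC, abstractmethod


class MemoryBackend(ABC):
    """Local stand-in for the interface above; not imported from Cogency."""

    @abstractmethod
    async def memorize(self, content: str) -> None: ...

    @abstractmethod
    async def recall(self, query: str) -> list: ...


class ListMemory(MemoryBackend):
    """Keeps items in a list and recalls by case-insensitive substring."""

    def __init__(self):
        self._items = []

    async def memorize(self, content: str) -> None:
        self._items.append(content)

    async def recall(self, query: str) -> list:
        needle = query.lower()
        return [item for item in self._items if needle in item.lower()]


async def demo():
    memory = ListMemory()
    await memory.memorize("Tokyo trip is in May")
    await memory.memorize("Favorite color is blue")
    return await memory.recall("tokyo")


print(asyncio.run(demo()))
```

A vector-store backend would keep the same two methods and swap the list and substring check for embeddings and similarity search.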