Semlib is a Python library for building data processing pipelines powered by Large Language Models (LLMs). Rather than writing prompting and parsing logic by hand, you describe each step in natural language. Semlib handles the prompting, output parsing, concurrency, caching, and cost tracking for you. It brings familiar functional programming primitives such as map, reduce, sort, and filter to LLM-powered data processing.
Why adopt this approach? Deconstructing complex tasks into semantic steps offers several practical advantages:
map and reduce operations run concurrently, significantly reducing total execution time.
Install Semlib with pip:
pip install semlib
Here is a basic usage example:
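# Assumed import path; check the Semlib documentation for the exact names.
# Note that importing map shadows Python's built-in map.
from semlib import Bare, find, map, prompt, sort

# Top-level await needs an async context, e.g. the `python -m asyncio` REPL.

# Retrieve a list of U.S. presidents
presidents = await prompt(
    "Who were the 39th through 42nd presidents of the United States?",
    return_type=Bare(list[str]),
)

# Sort them by political leaning, most right-leaning first
await sort(presidents, by="right-leaning", reverse=True)
# -> ['Ronald Reagan', 'George H. W. Bush', 'Bill Clinton', 'Jimmy Carter']

# Locate a specific entry
await find(presidents, by="former actor")
# -> 'Ronald Reagan'

# Look up each president's age at inauguration (results follow the input order)
await map(
    presidents,
    "How old was {} when he took office?",
    return_type=Bare(int),
)
# -> [52, 69, 64, 46]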
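The map call above sends one prompt per item, which is what makes concurrent execution possible. The following is a minimal sketch of that fan-out idea using only the standard library; it is not Semlib's actual implementation. The names fake_llm and concurrent_map are hypothetical, and fake_llm only simulates a model call with a short sleep.

```python
import asyncio
import time


async def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: waits briefly, then echoes."""
    await asyncio.sleep(0.05)  # simulated network latency
    return f"answer to: {prompt}"


async def concurrent_map(items, template, max_concurrency=4):
    """Format one prompt per item and run the calls concurrently.

    Results come back in the same order as the input items.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item):
        async with semaphore:  # cap the number of in-flight requests
            return await fake_llm(template.format(item))

    return await asyncio.gather(*(run_one(item) for item in items))


async def main():
    names = ["Carter", "Reagan", "Bush", "Clinton"]
    start = time.perf_counter()
    answers = await concurrent_map(names, "How old was {} when he took office?")
    elapsed = time.perf_counter() - start
    return answers, elapsed


answers, elapsed = asyncio.run(main())
print(answers[0])
print(f"{elapsed:.2f}s for {len(answers)} calls")
```

Because the four simulated calls overlap, the total time should be close to one call's latency rather than the sum of all four. The semaphore is one common way to keep the number of simultaneous requests under a provider's rate limit.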
Feeding a massive dataset into a single LLM prompt rarely yields optimal results. Semlib provides a more reliable path: decompose the workload, process it step-by-step, and maintain granular control over the pipeline.
Customer Support – Analyze thousands of support tickets to automatically classify issues and extract key information.
Academic Research – Semantically sort through large collections of abstracts to find and recommend relevant papers.
Sentiment Analysis – Aggregate and synthesize feedback from product reviews at scale.
Content Processing – Filter and extract structured data from resumes, reports, or diverse document sets.
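As a rough illustration of the customer support use case, the sketch below walks a handful of tickets through filter, map, and reduce steps. It uses only the standard library: classify_ticket is a hypothetical keyword-based stub standing in for an LLM classification prompt, and the ticket texts are made up for the example.

```python
from collections import Counter
from functools import reduce

# Hypothetical keyword rules standing in for an LLM classification prompt.
KEYWORDS = {"refund": "billing", "charge": "billing", "crash": "bug", "error": "bug"}


def classify_ticket(text: str) -> str:
    """Stub classifier; a real pipeline would ask a model for the label."""
    lowered = text.lower()
    for keyword, label in KEYWORDS.items():
        if keyword in lowered:
            return label
    return "other"


def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: combine two partial tallies into one."""
    return left + right


tickets = [
    "I was charged twice, please refund me",
    "The app crashes when I open settings",
    "",  # empty ticket, dropped by the filter step
    "How do I change my username?",
    "Error 500 when uploading a file",
]

# filter: drop empty tickets
non_empty = [t for t in tickets if t.strip()]
# map: label each ticket independently
labels = [classify_ticket(t) for t in non_empty]
# reduce: fold per-ticket tallies into one summary
summary = reduce(merge_counts, (Counter([label]) for label in labels), Counter())

print(dict(summary))
# -> {'billing': 1, 'bug': 2, 'other': 1}
```

Each step touches one ticket or one pair of partial results at a time, so any step can be inspected, retried, or swapped for a model-backed version without changing the rest of the pipeline.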