Transformers is the foundational framework for state-of-the-art machine learning. It supports text, computer vision, audio, video, and multimodal models for both inference and training.
The library serves as a central hub for model definitions, establishing a unified standard across the machine learning ecosystem. Because transformers is cross-framework, a model defined here works seamlessly with various training frameworks (such as Axolotl, Unsloth, DeepSpeed, FSDP, and PyTorch-Lightning), inference engines (vLLM, SGLang, TGI), and modeling libraries (llama.cpp, mlx). All of these tools rely on transformers to interpret the underlying model architecture.
We are committed to supporting new state-of-the-art models as they emerge. By prioritizing clear, customizable, and efficient model definitions, we make these advanced tools accessible to a wider audience of developers and researchers.
The Hugging Face Hub currently hosts over 1 million Transformers model checkpoints. You can explore the Hub to find the right model for your specific needs and start building immediately.
Transformers supports Python 3.9+, PyTorch 2.1+, TensorFlow 2.6+, and Flax 0.4.1+.
To begin, create and activate a virtual environment using venv or uv (a high-performance Rust-based Python package manager).
# venv
python -m venv .my-env
source .my-env/bin/activate
# uv
uv venv .my-env
source .my-env/bin/activate
Install Transformers within your virtual environment:
# pip
pip install "transformers[torch]"
# uv
uv pip install "transformers[torch]"
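Once the install finishes, it can be useful to confirm that the environment meets the stated requirements. The following is a minimal sketch, not part of the library: `check_environment` is a hypothetical helper that uses only the standard library to test the Python version and whether the `transformers` and `torch` packages can be located, without importing them.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 9), packages=("transformers", "torch")):
    """Return a dict describing whether the basic requirements are met.

    `min_python` mirrors the Python 3.9+ requirement stated above; the
    default package names are the ones pulled in by "transformers[torch]".
    """
    report = {"python_ok": sys.version_info >= min_python}
    for name in packages:
        # find_spec locates a top-level package without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```

A `missing` entry usually means the command was run outside the activated virtual environment.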
If you wish to contribute to the library or access the latest experimental features, install it from source. Note that the development version may be unstable; please file an issue if you encounter bugs.
git clone https://github.com/huggingface/transformers.git
cd transformers
# pip
pip install ".[torch]"
# uv
uv pip install ".[torch]"
The Pipeline API is the fastest way to get started. Pipeline is a high-level inference class for text, audio, vision, and multimodal tasks, and you create one with the pipeline() function. It handles input preprocessing and returns structured results.
To generate text, instantiate a pipeline and specify a model. The library downloads and caches the model locally for future use. Once initialized, simply pass your text as a prompt.
from transformers import pipeline
# Use a distinct variable name so the pipeline() function is not shadowed.
generator = pipeline(task="text-generation", model="Qwen/Qwen2.5-1.5B")
generator("the secret to baking a really good cake is ")
Output:
[{'generated_text': 'the secret to baking a really good cake is 1) to use the right ingredients and 2) to follow the recipe exactly. the recipe for the cake is as follows: 1 cup of sugar, 1 cup of flour, 1 cup of milk, 1 cup of butter, 1 cup of eggs, 1 cup of chocolate chips. if you want to make 2 cakes, how much sugar do you need? To make 2 cakes, you will need 2 cups of sugar.'}]
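The download-and-cache step mentioned above stores files in the Hugging Face Hub cache. As a rough sketch of where that is, assuming the default layout used by recent `huggingface_hub` releases (`HF_HUB_CACHE` if set, otherwise `HF_HOME/hub`, otherwise `~/.cache/huggingface/hub`), the hypothetical helper below resolves the directory from environment variables. It illustrates the assumed lookup order and is not the library's own code; the exact variables may differ between versions.

```python
import os
from pathlib import Path


def resolve_hub_cache(env=None):
    """Guess the model cache directory from environment variables.

    Assumed lookup order: HF_HUB_CACHE, then HF_HOME/hub, then
    XDG_CACHE_HOME/huggingface/hub, then ~/.cache/huggingface/hub.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    base = env.get("XDG_CACHE_HOME") or "~/.cache"
    return Path(base).expanduser() / "huggingface" / "hub"


if __name__ == "__main__":
    print(resolve_hub_cache())
```

Passing a plain dictionary instead of the real environment makes it easy to see how each variable changes the result.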
The workflow for chat models is similar: construct a conversation history between the user and the system, then pass it to the pipeline.
import torch
from transformers import pipeline
chat = [
    {"role": "system", "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
    {"role": "user", "content": "Hey, can you tell me any fun things to do in New York?"},
]
chat_pipeline = pipeline(task="text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16, device_map="auto")
response = chat_pipeline(chat, max_new_tokens=512)
# The last message in the returned conversation is the model's reply.
print(response[0]["generated_text"][-1]["content"])
You can also chat with a model directly from your terminal:
transformers chat Qwen/Qwen2.5-0.5B-Instruct
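The final `print` line in the Python chat example relies on the shape of the pipeline's return value: for chat input, `generated_text` holds the whole conversation, with the model's reply appended as the last message. The sketch below illustrates that indexing on a hand-written stand-in for the response, so it runs without downloading a model. The reply text is made up, and `last_reply` is a hypothetical helper rather than a library function.

```python
def last_reply(response):
    """Return the content of the final message in a chat pipeline response."""
    conversation = response[0]["generated_text"]
    final_message = conversation[-1]
    if final_message["role"] != "assistant":
        raise ValueError("the last message is not an assistant reply")
    return final_message["content"]


# Hand-written stand-in with the same nesting as a chat pipeline result.
# The assistant text here is invented for illustration only.
sample_response = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a sassy, wise-cracking robot."},
            {"role": "user", "content": "Any fun things to do in New York?"},
            {"role": "assistant", "content": "Bleep bloop. Try the pizza."},
        ]
    }
]

print(last_reply(sample_response))  # prints "Bleep bloop. Try the pizza."
```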
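The same pattern works for other modalities. For automatic speech recognition, pass an audio file or URL:
from transformers import pipeline
transcriber = pipeline(task="automatic-speech-recognition", model="openai/whisper-large-v3")
transcriber("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")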
Output:
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
For image classification, pass an image file or URL:
from transformers import pipeline
classifier = pipeline(task="image-classification", model="facebook/dinov2-small-imagenet1k-1-layer")
classifier("https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png")
Output:
[{'label': 'macaw', 'score': 0.997848391532898},
{'label': 'sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita',
'score': 0.0016551691805943847},
{'label': 'lorikeet', 'score': 0.00018523589824326336},
{'label': 'African grey, African gray, Psittacus erithacus',
'score': 7.85409429227002e-05},
{'label': 'quail', 'score': 5.502637941390276e-05}]
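The classifier returns a list of label/score dictionaries, so ordinary Python is enough to post-process it. As a small sketch using the output shown above (scores rounded here for brevity; `top_label` and the 0.5 threshold are illustrative choices, not part of the library):

```python
def top_label(predictions, min_score=0.5):
    """Return the highest-scoring label, or None if it is below min_score."""
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else None


# Rounded copy of the image-classification output shown above.
predictions = [
    {"label": "macaw", "score": 0.9978},
    {"label": "sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita", "score": 0.0017},
    {"label": "lorikeet", "score": 0.0002},
    {"label": "African grey, African gray, Psittacus erithacus", "score": 0.00008},
    {"label": "quail", "score": 0.00006},
]

print(top_label(predictions))  # prints "macaw"
```

Returning `None` for low-confidence results keeps the caller from acting on a weak prediction.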
For visual question answering, pass an image together with a question:
from transformers import pipeline
vqa = pipeline(task="visual-question-answering", model="Salesforce/blip-vqa-base")
vqa(
    image="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-few-shot.jpg",
    question="What is in the image?",
)
Output:
[{'answer': 'statue of liberty'}]
Transformers offers four main benefits:
State-of-the-art models made accessible
Reduced compute costs and environmental impact
Framework flexibility for every lifecycle stage
Simple customization for specific needs
Transformers is not designed to be a modular, general-purpose toolbox for building neural networks. To help researchers iterate quickly, model files are intentionally written without heavy refactoring or excessive abstraction layers. This allows users to modify architectures directly without navigating complex file hierarchies.
The training API is specifically optimized for PyTorch models within the Transformers ecosystem. For more general machine learning loops, consider libraries like Accelerate.
Additionally, the provided example scripts are intended as templates. They may require modification to suit specific, real-world use cases and are not guaranteed to work out-of-the-box for every scenario.
Transformers is more than just a library; it is a thriving community of projects centered around the Hugging Face Hub. Our goal is to provide the foundation for developers, researchers, and students to bring their ideas to life.
To celebrate reaching 100,000 GitHub stars, we launched the "awesome-transformers" page. This curated list spotlights 100 exceptional projects built using the library. If you have built or use a project that belongs on that list, we encourage you to submit a pull request.
Most models can be tested instantly through their respective pages on the Hugging Face Hub.