Voice & Speech Tools

Index-TTS-LoRA: Fine-Tuning Voice Models for Natural Speech Synthesis
Learn how to extract audio tokens, fine-tune with LoRA, and generate natural speech. Includes training commands, inference steps, and WER benchmarks compared to the base model.

Qwen3-ASR-Studio: Real-Time Voice Recognition with PiP Mode
Qwen3-ASR-Studio converts speech to text. Upload files, record live through a waveform interface, add context hints, and use PiP mode for global voice input. All data stays stored locally.

VibeVoice: Long-Form Multi-Speaker TTS for Natural Dialogue Generation
VibeVoice generates podcast-style dialogues with up to four speakers and up to 90 minutes of audio. It uses a specialized tokenizer and a diffusion framework. Includes installation guides and demos.

Chatterbox TTS API: Open Source Text-to-Speech for Developers
Integrate high-quality voice synthesis into your applications with the Chatterbox TTS API. This open-source solution offers flexible voice parameters and simple RESTful integration. Get started with Node.js in minutes.

How to Install and Use Vosk Offline Speech Recognition
Set up Vosk offline speech recognition on Android, iOS, Python, or servers. Includes installation steps and code examples for mobile, desktop, and cloud.

KVoiceWalk: Clone Any Voice for Kokoro TTS Using Random Walks
Clone specific voice styles for Kokoro TTS using random walk algorithms and hybrid scoring. Minimize overfitting, boost similarity from 71% to 93%, and produce natural-sounding speech.

Turn eBooks & PDFs into Audio with Abogen – Fast TTS Tool
Abogen converts ePub, PDF, and text files into high-quality audio in seconds. It uses the Kokoro-82M model for natural voices, generates subtitles, and runs on Windows, Mac, Linux, and Docker.

sherpa-onnx: Offline Speech Recognition, TTS, and VAD Without the Cloud
Deploy speech recognition, text-to-speech, and speaker identification entirely offline. sherpa-onnx supports 11 programming languages and multiple platforms, including mobile and embedded devices, with pre-trained models ready for deployment.

ChatTTS: A Text-to-Speech Model Optimized for Dialogue
ChatTTS generates natural, expressive speech with precise control over laughter, pauses, and tone. Explore code examples, hardware requirements, and speaker embedding techniques.