Audio Language Models

Audio Language Models
MiMo-Audio: 100M-Hour Pretrained Model for Few-Shot Speech Tasks
MiMo-Audio is pretrained on over 100 million hours of data to deliver few-shot speech synthesis, editing, and style transfer. See how this 7B model performs.
Video Tools
ClipSketch AI: Frame-Accurate Video Tagging & AI Storyboard Generation
ClipSketch AI enables creators to tag video frames from Bilibili and Xiaohongshu, converting them into hand-drawn storyboards via Google Gemini. Streamline your workflow by generating social captions and covers in one place.
Video Tools
Wan2.2-Animate: Local Setup Guide for Image-to-Video and Character Consistency
Learn how to deploy Wan2.2-Animate on your local machine. This guide covers repository cloning, dependency installation, and launching the web interface to transform images into videos with high character consistency.
3D Tools
Mars3D Vue Examples: 381 Interactive 3D Map Demos and Live Code Editing
A Vue3-based demonstration suite for the Mars3D engine. Edit and run 381 map examples live in your browser. Learn the platform through isolated, modifiable code samples.
AI Tools
Fay: Build and Deploy Your Own Talking Digital Human for Free
Fay is an MIT-licensed open-source framework for creating digital humans with voice interaction. Control 3D avatars or Live2D characters and run offline on Python 3.12.
AI Agent Tools
TypeAgent: Build AI Agents With Structured Memory and Human-in-the-Loop
Microsoft's TypeAgent uses Structured RAG and the AMP architecture to build AI agents that maintain context and collaborate with human users. Setup guide included.