VoiceBox is a desktop app for voice cloning and speech synthesis. Feed it a few seconds of audio and it clones the voice using models like Qwen3-TTS. You then arrange the generated speech on a timeline: drag clips around, layer different voices, and edit the output like you would in a DAW. It includes Whisper for transcription, so you can edit based on the actual text. Runs locally. No cloud processing. No subscription.
Summary
VoiceBox is a desktop app for voice cloning and speech synthesis. It allows users to edit and arrange generated speech locally without subscriptions or cloud processing.
Key Points
- VoiceBox clones voices from a few seconds of audio input.
- It uses models like Qwen3-TTS for voice synthesis.
- Users can arrange speech on a timeline like in a DAW.
- The app supports layering different voices and editing the generated output.
- Whisper integration provides transcription, so users can edit based on the actual text.
- VoiceBox operates locally without cloud processing.
- There are no subscription fees for using the app.
Repurpose Ideas
- Blog post: How to use VoiceBox for audio projects
- Video tutorial: Voice cloning with VoiceBox
- Social media post: Benefits of local voice synthesis apps