INSTAGRAM

Voicebox is a desktop app for voice cloning and speech synthesis. Feed it a few seconds of audio and it clones the voice using models like Qwen3-TTS. Then you arrange the generated speech on a timeline — drag clips around, layer different voices, edit the output like you would in a DAW. Includes Whisper for transcription so you can edit based on the actual text. Runs locally. No cloud processing, no subscription. https://github.com/jamiepine/voicebox #github #opensource

0:24 Mar 27, 2026 34,571 2,811
@github.awesome
71 words 90% confidence
VoiceBox is a desktop app for voice cloning and speech synthesis. Feed it a few seconds of audio, and it clones the voice using models like Qwen3-TTS. Then you arrange the generated speech on a timeline, drag clips around, layer different voices, edit the output like you would in a DAW. Includes Whisper for transcription so you can edit based on the actual text. Runs locally. No cloud processing. No subscription.

VoiceBox is a desktop app for voice cloning and speech synthesis. It allows users to edit and arrange generated speech locally without subscriptions or cloud processing.

  1. VoiceBox clones voices from a few seconds of audio input.
  2. It uses models like Qwen3-TTS for voice synthesis.
  3. Users can arrange speech on a timeline like in a DAW.
  4. The app allows layering of different voices and editing.
  5. Whisper integration provides transcription for text editing.
  6. VoiceBox operates locally without cloud processing.
  7. There are no subscription fees for using the app.
  • Blog post: How to use VoiceBox for audio projects
  • Video tutorial: Voice cloning with VoiceBox
  • Social media post: Benefits of local voice synthesis apps
