Multilingual text-to-speech (31 languages) with stereo 48 kHz output and zero-shot voice cloning, powered by OpenMOSS-Team/MOSS-TTS-Local-Transformer-v1.5. Upload a short reference clip to clone a voice, or leave the reference empty to use a default voice.