This makes me wonder, what's the equivalent to ollama for whisper/SOTA OS tts mo...

lhl · on Nov 2, 2023

For SRT, here are some front-ends: https://www.reddit.com/r/OpenAI/comments/163hzhe/recommended...

Also I saw this thing called WhisperScript that looks pretty slick: https://github.com/openai/whisper/discussions/1028

That being said, WhisperX isn't that hard to set up. My step-by-step from a couple months ago: https://llm-tracker.info/books/logbook/page/transcription-te...

simonw · on Nov 2, 2023

I've been using MacWhisper, a macOS app for running Whisper transcription jobs, for a few months, and I really like it.

https://goodsnooze.gumroad.com/l/macwhisper

moffkalast · on Nov 2, 2023

McWhisper sounds like a diet burger. Does it come with fries?

ccoreilly · on Nov 2, 2023

Whisper is an STT model. You can use WhisperX to transcribe audio locally via the CLI, or whisper-turbo.com, which runs in the browser.
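
For anyone who wants to script the local CLI route, here is a rough sketch in Python. It only assembles and runs a `whisperx` command; the helper names (`build_whisperx_command`, `transcribe`) and the defaults are made up for illustration, and the flags (`--model`, `--language`, `--output_dir`, `--output_format`) are the ones I believe the CLI inherits from the original `whisper` tool, so check `whisperx --help` on your install before relying on them.

```python
import shutil
import subprocess
from pathlib import Path


def build_whisperx_command(audio_path, model="large-v2", language=None,
                           output_dir="transcripts", output_format="srt"):
    """Assemble the argument list for a WhisperX run (hypothetical helper).

    Nothing is executed here; the flag names are assumptions to verify
    against `whisperx --help`.
    """
    cmd = [
        "whisperx", str(audio_path),
        "--model", model,
        "--output_dir", str(output_dir),
        "--output_format", output_format,
    ]
    if language:
        # Passing the language skips auto-detection.
        cmd += ["--language", language]
    return cmd


def transcribe(audio_path, **options):
    """Run WhisperX if it is on PATH and return the output directory."""
    if shutil.which("whisperx") is None:
        raise RuntimeError("whisperx not found on PATH; install it first")
    subprocess.run(build_whisperx_command(audio_path, **options), check=True)
    return Path(options.get("output_dir", "transcripts"))
```

Calling `transcribe("talk.mp3", language="en")` should leave a subtitle file in `transcripts/`, assuming the flags above match your installed version.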

For TTS, Coqui has the best UX and models for a lot of languages, although quality is not on par with commercial TTS providers.

jcuenod · on Nov 2, 2023

I've just been looking for SOTA TTS. I found coqui.ai and elevenlabs.io (and a bunch of others). They're good (and better than older TTS), but I am not fooled by any of them. Do you have recommendations?

selfhoster11 · on Nov 4, 2023

Gemelo was the other one listed. I doubt you'll get anything sounding more natural than ElevenLabs with the following settings:

* Model: Multilingual v2

* All options and sliders to boost similarity: set to max/yes

* Stability slider: experimentally set to a value where the model sounds natural enough without destabilising sound output
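
If you would rather drive those settings through the API than the web UI, the request might look roughly like the sketch below. This is written from memory of the ElevenLabs text-to-speech endpoint, so treat the URL shape, the `xi-api-key` header, and the field names (`model_id`, `similarity_boost`, `use_speaker_boost`, `stability`) as assumptions to check against the current API docs. `build_tts_request` is a hypothetical helper, and the 0.5 stability default is just a starting point for experimenting, not a recommended value.

```python
import json
import urllib.request

# Assumed endpoint shape; confirm against the ElevenLabs API reference.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(text, voice_id, api_key, stability=0.5):
    """Build (but do not send) a TTS request mirroring the settings above."""
    if not 0.0 <= stability <= 1.0:
        raise ValueError("stability must be between 0.0 and 1.0")
    payload = {
        "text": text,
        "model_id": "eleven_multilingual_v2",  # Model: Multilingual v2
        "voice_settings": {
            "similarity_boost": 1.0,      # similarity slider at max
            "use_speaker_boost": True,    # similarity option set to yes
            "stability": stability,       # tune by ear
        },
    }
    return urllib.request.Request(
        f"{API_BASE}/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(req)` and writing the response bytes to a file; lower `stability` a little at a time until the voice sounds natural without the output falling apart.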