A demo of a #spiel sample app. The voices used, in order, are eSpeakNG's "Andy" variant, MBROLA US2, and Piper's Amy. You can observe the different features, like word tracking and quality. #speech #tts #gnome #linux
Honestly, since the fast variants of the voices are a thing, I think I could really switch to the Sonata Neural Voices in NVDA full time. Now remember folks, these are AI voices. Scary, untrustworthy AI voices that will smear your reputation all over fedi for using them! See, they even react to exclamation marks! Isn't that scary? :) Nah, the worst that'll happen, mainly with the HFC male and female voices, is that big numbers get garbled together. But every other voice does fine. I use Amy for work, and HFC for reading, because those are among the most lively voices I've ever heard. And amazingly enough, we can make our own new voices. According to the GitHub repo's README, some people are building more professional voices. And there are already versions of old TTS engines from the past that have been brought back to some semblance of life with this tech.
OpenAI debuts Voice Engine, which lets users generate a synthetic copy of a voice from a 15-second sample. It is available to around 100 partners, including HeyGe. In other words, it's not available to the public just yet.
Maybe we have an open source competitor for ElevenLabs? Check out their demo, in which they switch between the original and the synthesized voice. I can't tell the difference, lol. Apparently they're going to fully open source the codebase and model weights. #TTS #AI #ML https://jasonppy.github.io/VoiceCraft_web/
To clear up the hashtags a little bit:
Think of the components of a voice assistant / smart speaker.
You need #stt (speech-to-text) or #asr (automatic speech recognition) on the "input" side of a user request and #tts (text-to-speech) on the "output" side.
To throw in another technology - #nlp (natural language processing) is used in the "middle" to really understand what the user request is all about.
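To make the three stages concrete, here is a rough sketch of that chain in Python. Everything in it is a stand-in I made up for illustration: the function names (`speech_to_text`, `understand`, `respond`, `text_to_speech`) are hypothetical, the "audio" is just UTF-8 bytes, and the NLP step is plain keyword matching. A real assistant would plug actual engines into each stage.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """What the NLP stage thinks the user wants (hypothetical structure)."""
    name: str
    slots: dict = field(default_factory=dict)


def speech_to_text(audio: bytes) -> str:
    """STT/ASR stand-in: pretend the audio bytes are already UTF-8 text."""
    return audio.decode("utf-8")


def understand(text: str) -> Intent:
    """NLP stand-in: naive keyword matching instead of a real model."""
    words = text.lower().rstrip("?.!").split()
    if "weather" in words:
        return Intent("get_weather")
    if "time" in words:
        return Intent("get_time")
    return Intent("unknown", {"text": text})


def respond(intent: Intent) -> str:
    """Pick a canned reply for the recognized intent."""
    replies = {
        "get_weather": "I cannot check the weather in this demo.",
        "get_time": "I cannot check the time in this demo.",
    }
    return replies.get(intent.name, "Sorry, I did not understand that.")


def text_to_speech(text: str) -> bytes:
    """TTS stand-in: return the reply as bytes instead of synthesized audio."""
    return text.encode("utf-8")


def handle_request(audio: bytes) -> bytes:
    """Input side (STT) -> middle (NLP) -> output side (TTS)."""
    text = speech_to_text(audio)
    intent = understand(text)
    reply = respond(intent)
    return text_to_speech(reply)
```

Calling `handle_request(b"What is the weather?")` walks a request through all three stages, which is the whole point: each hashtag names one box in that chain.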
We need a D-Bus interface to get a system-wide Text To Speech provider, and Flatpak apps should be able to register themselves as TTS providers.
In GNOME Settings there should be an option to disable the current TTS provider, open its settings, or switch to another one, similar to how Android manages multiple keyboards, which you can install from the Play Store.
The same goes for Speech To Text. You should be able to install your favorite STT provider, with your preferred voice, from the store.
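A minimal sketch of the behaviour I have in mind, in plain Python with no actual D-Bus involved. The `ProviderRegistry` class, its method names, and the provider ids are all hypothetical; this only models the register / switch / disable logic, not any existing GNOME or Flatpak API.

```python
class ProviderRegistry:
    """In-memory model of a system-wide TTS provider registry (hypothetical)."""

    def __init__(self):
        self._providers = {}   # provider id -> callable that "speaks" text
        self._active = None    # id of the currently selected provider

    def register(self, provider_id, speak):
        """An app announces itself as a TTS provider."""
        self._providers[provider_id] = speak
        if self._active is None:
            # The first provider to register becomes the default.
            self._active = provider_id

    def switch_to(self, provider_id):
        """What a 'switch to another one' settings option would do."""
        if provider_id not in self._providers:
            raise KeyError(provider_id)
        self._active = provider_id

    def disable(self):
        """What a 'disable the current provider' settings option would do."""
        self._active = None

    def speak(self, text):
        """Route a speech request to whichever provider is active."""
        if self._active is None:
            raise RuntimeError("no active TTS provider")
        return self._providers[self._active](text)


registry = ProviderRegistry()
# Made-up provider ids; each "provider" just tags the text it was given.
registry.register("org.example.ProviderA", lambda text: f"[A] {text}")
registry.register("org.example.ProviderB", lambda text: f"[B] {text}")
registry.switch_to("org.example.ProviderB")
```

A real version would expose these operations over the bus so sandboxed apps could call them; the STT side could follow the same shape.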
My smartphone's ROM has no speech synthesis system (#TTS). Geovelo prompts me to download... the #Google one (Speech Recognition & Synthesis)
NO THANKS. I don't trust it anymore.
Do you know of a free/libre TTS system for Android (for LineageOS, for example) that supports 🇨🇵🇬🇧 or even 🇪🇸🇵🇹?
Another interesting #TTS #AI system. I need to look closer into it in order to see if it's a voice cloning approach or something else: https://github.com/yl4579/StyleTTS2
Here's a #BookReview I wrote of Tobias Dengel and Karl Weber's "The Sound of the Future", which claims that #voice #technology like #ASR, #TTS and #synthetic #speech are transformative, and that businesses should start to invest heavily in them.
While the book covers a lot of ground, it leaves many more critical questions unanswered in its unabashed techno-optimism.
I have no idea how such a shitty product can exist. Same price buys you a Chromebook; there are plenty of software libre distraction-free writing apps out there (try opening a terminal and typing "vim"?).
Or you could chicken out and buy a Kindle Fire Max 11 with keyboard case for the same price.
Both of these let you type for more than a day on a charge: the only benefit of the Freewrite Alpha is an 80-hour battery, which is pointless with USB-C charging everywhere.
@cstross some #TechIlliterate even recommended this shit to me instead of a #Laptop when I was in school.
I told them unless it comes with the same #TTS voice as #StevenHawking has, I don't want them to ever be allowed to make any technical decision or suggestion in their life!
Those things are like #TexasInstruments #calculators: an absolute #ripoff, given even the shittiest #Netbook with the abundant #Intel #Z3735F #SoC running #OS1337 is more versatile.
And I literally just started that distro.
Wait, there was a new version of AIVoice, AIVoice 2, that I didn't know about, and it works on Mac now. Useful for creating audio for listening comprehension? Well…