#TTS - kbin.social

Bearfaced, 10 months ago to homeassistant

I set Home Assistant voice assist as my default on my phone, and I'm super impressed with the speed! There's a way to go with the sentence recognition (which I could improve myself or wait for someone smarter to do it). I'm really excited for this project; I have no doubt it'll end up in hardware form and be a fully configurable, privacy-conscious voice assistant. #homeassistant #homeautomation #voice #voiceassistant #tts #android

itnewsbot, 10 months ago to ArtificialIntelligence

Text-to-Speech Model Can Do Music, Background Noises, and Sound Effects - Bark is a universal text-to-audio model that can not only create realistic speech,... - https://hackaday.com/2023/07/24/text-to-speech-model-can-do-music-background-noises-and-sound-effects/ #artificialintelligence #softwarehacks #texttospeech #generative #llm #tts #ai

scruss, 10 months ago to RaspberryPi

for the few that care: DECtalk now builds on 32-bit Raspberry Pi OS (Debian, armhf). It previously only ran on 64-bit OSs.

dectalk/dectalk: Modern builds for the 90s/00s DECtalk text-to-speech application. — https://github.com/dectalk/dectalk

#DECtalk #tts #TextToSpeech #SpeechSynthesis #RaspberryPi #aeiou

hywan, 10 months ago to unity

Do you know how to create a Unity plugin?

Coqui's awesome Text-to-Speech project needs you: https://github.com/coqui-ai/TTS/issues/2589. Imagine being able to create any speech from simple text, in multiple languages, with any voice (including voice cloning), based on open source technologies and state-of-the-art algorithms. You can make it real.

#unity #tts #coqui #plugin #DotNet #python

zersiax, 10 months ago to random

Random observation, but why do so very many #TTS voices make absolutely no distinction between a period and an exclamation mark when the screen reader leaves it up to the TTS voice to interpret them? Question marks, no problem. Exclamation marks... nope. Just like a period. They are different, folks!

scruss, 11 months ago to random

Okay AliExpress, third time's a charm. I've ordered another XFS5152CE #TTS speech synthesizer module. Will I get a third SYN6988 board instead?

https://www.aliexpress.com/item/1005001329981692.html

jasonnab, 11 months ago to ai

I'd like to use a voice generation tool that isn't deemed harmful or stealing someone else's voice. Is there any sort of "ethical" or community Creative Commons-esque backed model and/or program I can use to turn text into speech, on Linux? Mozilla Voice stuff maybe?
espeak or gspeak just does not cut it, unfortunately, with the default voices... Maybe there are better models for that program?

#ai #tts #espeak #linux #help #creativecommons #voice

devinprater, 11 months ago to random

Just a thought. Listening to TTS is like driving. You go fast, and you're generally looking for an ending, a place to be, stuff like that.

Braille is like walking, or running if you can read it very quickly. Some people, like me, read Braille to enjoy the scenery, or to get at all the details. Some people, though, rely on Braille. They cannot drive, so they probably have a lot more stamina or can jog and run rather than walk. The same goes for Braille and audio.

This probably isn't a great analogy, but it's something I thought of.

#braille #TTS

devinprater, 11 months ago (edited 11 months ago) to accessibility

Some blind Android users really want the Eloquence TTS engine back. It will die when 64-bit phones become the norm. They went as far as seriously debating whether they could ask phone carriers to step in. It's sad, both because Google could easily have licensed Eloquence, put it in a 32-bit ARM container, and there you go. It's sad that Apple is the only big corporation that spent five minutes and thought "Oh hey we have a license for this now, let's containerize this and ship it for VoiceOver." It's sad that Google doesn't inspire confidence from the blind community at large in Google's ability to uphold an accessible OS and a competitive screen reader. And it's definitely sad that another TTS engine hasn't come along that is any better than Eloquence, which is from the '90s.

#accessibility #apple #google #tts #blind

scruss, 11 months ago to random

All Watched Over by Machines of Loving Grace, "read" by a SYN6988 TTS chip driven by MicroPython:

https://soundcloud.com/user8899915/all-watched-over-by-machines-of-loving-grace

#TTS #SpeechSynthesis #MicroPython

jasonnab, 11 months ago to linux

For anyone using a #screenReader on #linux, do you have any recommendations or otherwise? I want to test some of my website designs against screen readers, but I'm unable to get #gnome #orca working properly on #ArchLinux. It doesn't seem to ever read anything but what's in my bash console; I can't get it to read from my web browser or otherwise.

#accessibility #help #question #tts

mkiol, 11 months ago to random

If you have to do Speech-to-Text and Text-to-Speech tasks and don't want to send your data to the Internet, I recommend trying Speech Note (Linux desktop app).

It is easy to use, works offline and supports 57 languages!

Speech Note works thanks to powerful #STT and #TTS engines underneath: #DeepSpeech #Coqui #Vosk #Whisper #Piper #eSpeak #MBROLA #RHVoice

You can download #SpeechNote from #Flathub: https://flathub.org/apps/net.mkiol.SpeechNote

Video demo: https://youtu.be/EhUPvaHvssw

x0, 1 year ago to accessibility

So, I've finally bitten the bullet, reverted my jailbreak, and am upgrading to iOS 16. What 3rd-party speech synthesis providers are available for me to use without needing a Mac to install them from source? I know eSpeak-ng is one, but are there any others? #TTS #blind #accessibility #a11y #apple #iPhone #iOS16

ZBennoui, 1 year ago to ai

Over the past year, I've been experimenting with neural text to speech in various forms. I have done hours of experimentation and research, training models and getting varying results along the way. Some of you may have heard of Piper, an open source synthesizer and add-on for NVDA that can be trained by anyone. It is currently in active development, and I have been there from the beginning, testing and evaluating the various versions. For years, I have had a goal to create a high-quality voice that is truly usable by a screen reader user, and yesterday I managed to achieve this. I'm really excited to share Alba, a female Scottish English voice. I'm considering this a beta phase, and I'm looking for feedback to make improvements as needed. Please note that you will most likely get an error upon installation; however, the voice should still show up in NVDA, and I'm working on fixing this as soon as possible.
Link to Piper: https://github.com/rhasspy/piper/tree/v0.1.0
Link to addon: https://github.com/mush42/piper-nvda?ref=building.open-home.io
Link to Alba: https://drive.google.com/file/d/1wZHuIll6aEEFd4OdLBCVcxF7bd3PbQTB/view?usp=share_link

#TTS #AI #ScreenReader #Piper

seedy, 1 year ago to random

Coming up on my YouTube channel on Monday, May 15th, 2023 at 5 PM BST, on its 23rd anniversary, an ACB Radio Mainmenu segment on singing speech synthesizers, featuring some DECTalk song renditions and funny skits, hosted by @JonathanMosen and starring @BorrisInABox. See the premiere when it goes live here: https://youtu.be/UZJS6bmxOJk #DECTalk #TextToSpeech #TTS #2000s #RetroTech #nostalgia #ACB #ACBRadio #mainmenu

KathyReid, 1 year ago to random

ICYMI: Do you work with #voice or #speech #data?

You might be a #linguist, or an #ML #engineer, doing things like data specifications, filtering or pre-processing or training #ASR, #STT or #TTS models, or you might work in #fairness or #bias evaluation.

If so, I’d love your help to understand current #dataset #documentation practices, and what we can do to make them better as part of my #PhD #research 🤓 ⌨️ 🎤

The #survey takes 10-20 minutes to complete, and you can opt in to win one of 3 gift cards valued at $AUD 50 each.

Research Protocol 2021/427 approved by #ANU Human Research Ethics Committee

Boosts appreciated 💕

https://anu.au1.qualtrics.com/jfe/form/SV_cSFODa5osYtm96e

flohgro, 1 year ago to random

SmartHome is when your vacuum decides when it’s time to leave.

It just started vacuuming, ignoring that I’m still at home. Turns out that HomeKit thought that I left when I only moved between rooms 🤦🏼‍♂️😂

#SmartHome #HomeKit

dis, 1 year ago

@hl @flohgro with #tts mine is super sarcastic. We call it #snarkhome.
"(Name) left me alone in the office, so I turned out the light. .... (Name) left me alone in the dark."
"The office light got lonely so I Did What Had To Be Done."
"Once upon a time someone left the kitchen light on. I turned it off. That is it. That is the story. I turned off a light."

#homeassistant #texttospeech

kushal, 1 year ago to programming

What are the good options for #Python text to speech libraries? #TTS #Swedish

KathyReid, 1 year ago to random

Do you work with #voice or #speech #data? You might contribute data, write data specifications for collection, perform filtering or pre-processing, train #ASR or #TTS models, or design or perform evaluations on #ML speech models.

If so, I’d love your help to understand current #dataset #documentation practices, and what we can do to make them better as part of my #PhD #research

The #survey takes 10-20 minutes to complete, and you can opt in to win one of 3 gift cards valued at $AUD 50 each.

Research Protocol 2021/427 approved by #ANU Human Research Ethics Committee

https://anu.au1.qualtrics.com/jfe/form/SV_cSFODa5osYtm96e

chris_hayes, 1 year ago to random

I trained ElevenLabs AI on my voice.

The result is quite good (though do you really know what your voice sounds like?).

Audiobook narrators still have a unique skill that AI can't quite replicant, I mean replicate. However, with 2 minutes of audio, it can read a book out loud in my voice far better than I can manage.
#ai #tts

devinprater, 1 year ago to accessibility

So, the Neural Microsoft voices are very good for reading books. And what did Microsoft remove from Edge? The ability to read EPUB books. Lol, ah well. Samsung TTS is almost as good.

#a11y #accessibility #Microsoft #tts

Edent, 2 years ago to BBC

TTSF (Text To Shipping Forecast)

The BBC Shipping Forecast is one of those strange bits of national tradition which, somehow, bridges the gap between infrastructure and folklore.

You can listen to the latest forecast on the BBC - read by professional newscasters.

But what if we wanted a robot to read it? If our speaker is sick, bored, or too expensive - how would we automate the audio

https://shkspr.mobi/blog/2021/10/ttsf-text-to-shipping-forecast/

#/etc/ #bbc #robot #tts

Edent, 2 years ago to ai

Synthetic Poetry

I've been experimenting with Amazon's Polly service. It's their fancy text-to-sort-of-human-style-speech system. Think "Alexa" but with a variety of voices, genders, and accents.

Here's "Brian" - their English, male, received pronunciation voice - reading John Betjeman's poem "Slough":

The pronunciation of all the words is incredibly lifelike. If you heard it on the radio, it mi

https://shkspr.mobi/blog/2021/07/synthetic-poetry/

#/etc/ #AI #Amazon #tts #turing
