[Tiptoi] Text-to-speech

Thu Jan 22 19:15:16 CET 2015

On Thu, Jan 22, 2015 at 9:51 AM, Joachim Breitner
<mail at joachim-breitner.de> wrote:
> I chose the text because this way, the files can be shared between
> multiple yaml files, or if the same text is used for different labels.
> After all, they are independent of the actual project.

Ok, that's a valid point...

> I might put them into a subdirectory tts-cache/ instead of just
> prefixing them with tts-.
>
> I might shorten the name if it becomes too long, using a hash of the
> rest.
>
> ...implemented both of these.

again: very quick ;-)

> I thought about this. The advantage of the current design is that you
> can switch from tts to real files without modifying your scripts at all.
> It also means that the syntax of the scripts stays the same.

I agree, this is a very good point!

> Ok, it’s mainly because it was less work for me this way :-)

;-)

>> Another idea was to not directly call pico2wave but to call an external
>> script that expects a defined set of parameters. This way it might be
>> easier to accomplish this on other OSes and the user can also use some
>> other tools.
>
> Hmm. I’ll wait until there are more than a few combinations to support.
> I’d rather have tttool do the right thing out of the box, so if there
> are different tools to support, I’ll wait for people to tell me how to
> call them and implement it in the TextToSpeech-module.

If you prefer tttool to be the one binary that contains everything, I
can understand that. But because tttool is a Haskell program, very few
people are able to adjust the code if they want behaviour different
from what is implemented (you are one of two people I know who can
program in Haskell!). So providing a tttool-tts.sh that is called by
tttool would make adjustments much easier for most people!

Instead of pico2wave you can also use espeak and/or mbrola to produce voices:

espeak -v mb-de5 -s 120 "Ravensburger" --pho | mbrola -e /usr/share/mbrola/de5/de5 - x.wav

or

espeak -v mb-de5 -s 120 "Ravensburger" --stdout >x.wav
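
A wrapper along the lines of tttool-tts.sh could then pick between these
tools. The following is only a minimal sketch: the calling convention
(text and output file as arguments), the environment variables TTS_ENGINE
and TTS_VOICE, and the default voices are all assumptions on my side;
nothing of this exists in tttool.

```shell
#!/bin/sh
# tttool-tts.sh (hypothetical): render TEXT into the WAV file OUTFILE.
# Assumed call from tttool:  tttool-tts.sh TEXT OUTFILE
# TTS_ENGINE and TTS_VOICE are made-up environment variables for this sketch.

tts_render() {
    text=$1
    out=$2
    case "${TTS_ENGINE:-pico2wave}" in
        pico2wave)
            # pico2wave writes the WAV file itself (-w) for the language -l
            pico2wave -l "${TTS_VOICE:-de-DE}" -w "$out" "$text"
            ;;
        espeak)
            # espeak writes the WAV data to stdout; redirect it into the file
            espeak -v "${TTS_VOICE:-mb-de5}" -s 120 --stdout "$text" > "$out"
            ;;
        *)
            echo "unknown TTS engine: $TTS_ENGINE" >&2
            return 1
            ;;
    esac
}

# Only render when called with both arguments.
if [ "$#" -ge 2 ]; then
    tts_render "$1" "$2"
fi
```

With something like this, switching to espeak would be a matter of
running tttool with TTS_ENGINE=espeak set, and supporting another tool
means adding one more case branch instead of touching the Haskell code.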

Uli