[Tiptoi] Text-to-speech

Joachim Breitner mail at joachim-breitner.de
Thu Jan 22 09:51:20 CET 2015


Hi,

On Thursday, 22.01.2015, at 01:00 +0100, Ulrich Sibiller wrote:
> Now to the main topic of this thread, the text-to-speech functionality:
> 
> uli@traube:~/work/tiptoi/tip-toi-reveng$ ./tttool assemble text2speech.yaml
> Speaking "Hello!".
> Speaking "We are in mode one.".
> Speaking "We are in mode two.".
> uli@traube:~/work/tiptoi/tip-toi-reveng$ ls -lart *.ogg
> -rw-rw-r-- 1 uli uli 6987 Jan 22 00:43 tts-Hello!.ogg
> -rw-rw-r-- 1 uli uli 9542 Jan 22 00:43 tts-We are in mode one..ogg
> -rw-rw-r-- 1 uli uli 9200 Jan 22 00:43 tts-We are in mode two..ogg
> 
> I suggest to not use the spoken text as the filename but the label
> that is used within the yaml. A text can be really long, a label
> normally isn't....

I chose the text because this way the files can be shared between
multiple yaml files, or reused if the same text is used for different
labels. After all, they are independent of the actual project.

I might put them into a subdirectory tts-cache/ instead of just
prefixing them with tts-.

I might shorten the name if it becomes too long, using a hash of the
rest.

...implemented both of these.
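
A minimal sketch of the naming scheme described above, not tttool's
actual code (tttool is written in Haskell). The function name, the
40-character limit and the choice of SHA-1 are assumptions made for
illustration; only the tts-cache/ directory and the idea of hashing the
rest of an overlong text come from this mail.

```python
import hashlib
from pathlib import Path

CACHE_DIR = Path("tts-cache")  # subdirectory name taken from the mail
MAX_STEM = 40                  # assumed limit, not tttool's actual value


def tts_cache_path(text: str, max_stem: int = MAX_STEM) -> Path:
    """Map a spoken text to a cache file that depends only on the text."""
    # Replace path separators so the text cannot leave the cache directory.
    stem = text.replace("/", "_").replace("\\", "_")
    if len(stem) > max_stem:
        # Keep a readable prefix and replace the rest by a short hash of it.
        rest = text[max_stem:]
        digest = hashlib.sha1(rest.encode("utf-8")).hexdigest()[:8]
        stem = f"{stem[:max_stem]}-{digest}"
    return CACHE_DIR / f"{stem}.ogg"
```

Because the name is derived from the text alone, two yaml files (or two
labels) that speak the same sentence end up with the same cached file.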

> It might be easier to skip the label thing altogether and do it like this:
> 
> ----------------------------------------
> scripts:
>   8066:
>   - $mode==1? P("We are in mode one")
>   - $mode==2? P(mode2:"We are in mode two")
>   - $mode==3? P(mode3)
>   - $mode==4? P(mode4)
> 
> speak:
>    mode4: "We are in mode 4"
> ----------------------------------------
> 
> The example uses four different ways:
> 1. define a text to be spoken (and generated) by using double quotes
> 2. like 1, but prepend the filename that will be used
> 3. use file "mode3.ogg"
> 4. current way

I thought about this. The advantage of the current design is that you
can switch from tts to real files without modifying your scripts at all.
It also means that the syntax of the scripts stays the same.

Ok, it’s mainly because it was less work for me this way :-)
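
To illustrate why scripts need not change when moving from tts to real
files, here is a rough sketch of label resolution. It is not how tttool
is implemented: the function name, the media directory argument and the
rule that a recorded file takes precedence over a speak entry are all
assumptions.

```python
from pathlib import Path
from typing import Mapping, Tuple


def resolve_sample(label: str, speak: Mapping[str, str],
                   media_dir: Path) -> Tuple[str, str]:
    """Decide where the audio for a script label such as P(mode4) comes from."""
    recorded = media_dir / f"{label}.ogg"
    if recorded.exists():
        # Assumed precedence: a real recording wins over generated speech.
        return ("file", str(recorded))
    if label in speak:
        # No recording present: fall back to the text in the speak section.
        return ("tts", speak[label])
    raise KeyError(f"no audio file and no speak entry for label {label!r}")
```

The script line P(mode4) stays the same in both cases; only the presence
of mode4.ogg or of a speak entry decides which branch is taken.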

> Another idea was to not directly call pico2wav but to call an external
> script that expects a defined set of parameters. This way it might be
> easier to accomplish this on other OSes and the user can also use some
> other tools.

Hmm. I’ll wait until there are more than a few combinations to support.
I’d rather have tttool do the right thing out of the box, so if there
are different tools to support, I’ll wait for people to tell me how to
call them and implement it in the TextToSpeech-module.
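
As a sketch of what supporting several tools inside one module could
look like, the table below maps a backend name to the command line it
needs. The table, its names and the function are hypothetical and not
part of tttool; the pico2wave options -l and -w and the espeak options
-v and -w are the ones those programs document.

```python
from typing import Callable, Dict, List

# Each backend turns (language, text, wav_path) into an argument list.
# The table and its names are illustrative only.
BACKENDS: Dict[str, Callable[[str, str, str], List[str]]] = {
    "pico2wave": lambda lang, text, wav: ["pico2wave", "-l", lang, "-w", wav, text],
    "espeak": lambda lang, text, wav: ["espeak", "-v", lang, "-w", wav, text],
}


def tts_command(backend: str, lang: str, text: str, wav: str) -> List[str]:
    """Build the argument list for one backend without running it."""
    try:
        build = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unsupported text-to-speech backend: {backend}")
    return build(lang, text, wav)
```

The resulting list could be handed to subprocess.run(cmd, check=True);
converting the wav output to ogg would be a separate step.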

Greetings,
Joachim


-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org
