[Tiptoi] Text-to-speech

Ulrich Sibiller ulrich.sibiller at gmail.com
Thu Jan 22 01:00:29 CET 2015

On Thu, Jan 22, 2015 at 12:12 AM, Ulrich Sibiller
<ulrich.sibiller at gmail.com> wrote:
> Can you provide some hints how to clean up the Haskell environment and
> start from scratch?

I finally succeeded (sorry, I only have the command-line history, the
output has scrolled away; I added some comments between the commands):

sudo apt-get remove cabal-install
sudo apt-get remove ghc
rm -rf /home/uli/.cabal
sudo apt-get update
sudo apt-get install cabal-install
ghc-pkg check
rm .travis.yml
cabal update
cabal install --bindir=.

-> "will possibly break some packages - use force-reinstall"

cabal install --bindir=. --force-reinstalls

-> package time failed

cabal install --bindir=. --force-reinstalls time

-> failed again, this time I could see the reason: --bindir must be an
absolute path
cabal install --bindir=`pwd` --force-reinstalls time

cabal install --bindir=`pwd`
cabal install --bindir=`pwd` --force-reinstalls

-> now it worked

Now to the main topic of this thread, the text-to-speech functionality:

uli at traube:~/work/tiptoi/tip-toi-reveng$ ./tttool assemble text2speech.yaml
Speaking "Hello!".
Speaking "We are in mode one.".
Speaking "We are in mode two.".
uli at traube:~/work/tiptoi/tip-toi-reveng$ ls -lart *.ogg
-rw-rw-r-- 1 uli uli 6987 Jan 22 00:43 tts-Hello!.ogg
-rw-rw-r-- 1 uli uli 9542 Jan 22 00:43 tts-We are in mode one..ogg
-rw-rw-r-- 1 uli uli 9200 Jan 22 00:43 tts-We are in mode two..ogg

I suggest using the label from the YAML as the filename rather than the
spoken text. A text can be really long; a label normally isn't.
Also, I think it would be better to put the samples in a media
subdirectory.
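
To make the naming suggestion concrete, here is a minimal Python sketch.
It is not tttool code (tttool is written in Haskell); the helper name
sample_path and the directory name "media" are assumptions chosen for
illustration only.

```python
from pathlib import PurePosixPath

# Labels and texts as they appear in the YAML quoted in this message.
speak = {
    "hello": "Hello!",
    "mode_one": "We are in mode one.",
    "mode_two": "We are in mode two.",
}

def sample_path(label, media_dir="media"):
    """Hypothetical helper: name the generated sample after its label."""
    return str(PurePosixPath(media_dir) / f"{label}.ogg")

for label, text in speak.items():
    # Shows which file each spoken text would end up in.
    print(f"{text!r} -> {sample_path(label)}")
```

With this scheme the filenames stay short and free of spaces and
punctuation, no matter how long the spoken text is.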

Again you were very fast (great!) and implemented that before I could
add some ideas I had to the issue. But better late than never:

Instead of
  - $mode==1? P(mode_one)
  - $mode==2? P(mode_two)

# But instead of manually creating files hello.ogg, mode_one.ogg and
# mode_two.ogg, you can specify text to be spoken for them.
  hello:   "Hello!"
  mode_one: "We are in mode one."
  mode_two: "We are in mode two."

It might be easier to skip the label thing altogether and do it like this:

  - $mode==1? P("We are in mode one")
  - $mode==2? P(mode2:"We are in mode two")
  - $mode==3? P(mode3)
  - $mode==4? P(mode4)

   mode4: "We are in mode 4"

The example uses four different ways:
1. define a text to be spoken (and generated) by using double quotes
2. like 1, but prepend the filename that will be used
3. use file "mode3.ogg"
4. current way

In case 1, tttool should be able to create a unique filename by itself
(derived from the string).
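
The proposed forms and the derived filename could be handled roughly as
in the following Python sketch. Everything here is an assumption for
illustration: the names parse_play_arg and derived_name, the "tts_"
prefix, and the slug-plus-hash scheme are not part of tttool. Note that
cases 3 and 4 look identical in the script, so the sketch only reports
"a name without text" and leaves the lookup (YAML label or existing
file) to the caller.

```python
import hashlib
import re

# Matches the argument of the proposed P(...) forms:
#   "text"        -> spoken text, filename derived automatically
#   name:"text"   -> spoken text, stored under the given name
#   name          -> reference to a label or an existing name.ogg
ARG = re.compile(
    r'^\s*(?:(?P<name>\w+)\s*:\s*)?"(?P<text>[^"]*)"\s*$'
    r'|^\s*(?P<ref>\w+)\s*$'
)

def derived_name(text, max_len=20):
    """Build a short, filesystem-safe name from the spoken text.

    The hash suffix keeps names distinct when two texts share a slug;
    collisions are very unlikely but not strictly impossible.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", text.lower())[:max_len].strip("_")
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()[:8]
    return f"tts_{slug}_{digest}"

def parse_play_arg(arg):
    """Return (name, text); text is None when arg is only a name."""
    m = ARG.match(arg)
    if m is None:
        raise ValueError(f"unrecognised P() argument: {arg!r}")
    if m.group("ref"):
        return m.group("ref"), None
    text = m.group("text")
    return m.group("name") or derived_name(text), text
```

For example, parse_play_arg('mode2:"We are in mode two"') yields the
name "mode2" together with the text, while parse_play_arg("mode3")
yields the name "mode3" and no text.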

Another idea: instead of calling pico2wave directly, tttool could call
an external script that expects a defined set of parameters. That might
make this easier to get working on other OSes, and it would let users
plug in other tools.
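
Such a script could look roughly like the Python sketch below. The
parameter set (language, text, output file) is my assumption of what a
"defined set of parameters" might be, and the function names are
hypothetical. It assumes pico2wave and oggenc (from vorbis-tools) are
installed; on another OS, only build_commands would need to change.

```python
import subprocess

def build_commands(lang, text, ogg_path):
    """Return the command lines that turn `text` into an Ogg Vorbis file.

    Assumed contract: lang is a pico2wave language code such as "en-GB",
    text is the sentence to speak, ogg_path is the file to create.
    """
    wav_path = ogg_path + ".wav"  # intermediate file; pico2wave wants .wav
    return [
        ["pico2wave", "-l", lang, "-w", wav_path, text],
        ["oggenc", "-Q", "-o", ogg_path, wav_path],
    ]

def synthesize(lang, text, ogg_path):
    """Run the commands in order, stopping at the first failure."""
    for cmd in build_commands(lang, text, ogg_path):
        subprocess.run(cmd, check=True)
    # The intermediate .wav file is left in place in this sketch.
```

A wrapper script would pass its three command-line arguments to
synthesize, so tttool would only ever need to know that one calling
convention.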


More information about the tiptoi mailing list