[Tiptoi] Text-to-speech

Thu Jan 22 01:00:29 CET 2015

On Thu, Jan 22, 2015 at 12:12 AM, Ulrich Sibiller
<ulrich.sibiller at gmail.com> wrote:
> Can you provide some hints how to clean up the Haskell environment and
> start from scratch?

I have finally been successful (sorry, I only have the command line
history, the output has scrolled away, so I added some comments on the
output):

sudo apt-get remove cabal-install
sudo apt-get remove ghc
rm -rf /home/uli/.cabal
sudo apt-get update
sudo apt-get install cabal-install
ghc-pkg check
rm .travis.yml
cabal update
cabal install --bindir=.

-> "will possibly break some packages - use force-reinstall"

cabal install --bindir=. --force-reinstalls

-> package time failed

cabal install --bindir=. --force-reinstalls time

-> failed again, this time I could see the reason: --bindir must be an
absolute path

cabal install --bindir=`pwd` --force-reinstalls time

cabal install --bindir=`pwd`
cabal install --bindir=`pwd` --force-reinstalls

-> now it worked

Now to the main topic of this thread, the text-to-speech functionality:

uli@traube:~/work/tiptoi/tip-toi-reveng$ ./tttool assemble text2speech.yaml
Speaking "Hello!".
Speaking "We are in mode one.".
Speaking "We are in mode two.".
uli@traube:~/work/tiptoi/tip-toi-reveng$ ls -lart *.ogg
-rw-rw-r-- 1 uli uli 6987 Jan 22 00:43 tts-Hello!.ogg
-rw-rw-r-- 1 uli uli 9542 Jan 22 00:43 tts-We are in mode one..ogg
-rw-rw-r-- 1 uli uli 9200 Jan 22 00:43 tts-We are in mode two..ogg

I suggest not using the spoken text as the filename, but the label
that is used within the yaml. A text can be really long, a label
normally isn't.
Also, I think putting the samples in a media subdirectory would be better.

Again you were very fast (great!) and implemented that before I could
add some ideas I had to the issue. But better late than never:

Instead of
----------------------------------------
scripts:
  8066:
  - $mode==1? P(mode_one)
  - $mode==2? P(mode_two)

# But instead of manually creating files hello.ogg, mode_one.ogg and
# mode_two.ogg, you can specify text to be spoken for them.
speak:
  hello:   "Hello!"
  mode_one: "We are in mode one."
  mode_two: "We are in mode two."
----------------------------------------

It might be easier to skip the label thing altogether and do it like this:

----------------------------------------
scripts:
  8066:
  - $mode==1? P("We are in mode one")
  - $mode==2? P(mode2:"We are in mode two")
  - $mode==3? P(mode3)
  - $mode==4? P(mode4)

speak:
   mode4: "We are in mode 4"
----------------------------------------

The example uses four different ways:
1. define a text to be spoken (and generated) by using double quotes
2. like 1, but prepend the filename that will be used
3. use the existing file "mode3.ogg"
4. the current way (label defined under "speak:")
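
As a rough illustration of how these forms could be told apart, here is
a small shell sketch. The function name classify_play_arg is made up for
this mail and this is not how tttool parses its input. Ways 3 and 4 both
look like a bare label, so they can only be distinguished later by
checking whether a "speak:" entry exists for that label.

```shell
# Hypothetical helper: classify the argument given to P(...).
classify_play_arg() {
  case $1 in
    \"*\")   echo "text" ;;        # way 1: only a quoted text
    *:\"*\") echo "label+text" ;;  # way 2: label, colon, quoted text
    *)       echo "label" ;;       # ways 3 and 4: bare label
  esac
}

classify_play_arg 'mode2:"We are in mode two"'
```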

In case 1, tttool should be able to create a unique filename by itself
(derived from the string).
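
One possible way to derive such a name, as a minimal shell sketch: keep
a short readable prefix of the text and append a checksum of the whole
text so that two different texts do not end up in the same file. The
function name tts_filename, the 20 character limit and the use of cksum
are my assumptions for illustration, tttool does not work like this
today.

```shell
# Hypothetical helper: derive a filename from the text to be spoken.
tts_filename() {
  text=$1
  # Readable part: lowercase, every run of other characters becomes "_",
  # cut to at most 20 characters.
  slug=$(printf '%s' "$text" | tr '[:upper:]' '[:lower:]' \
         | tr -cs 'a-z0-9' '_' | cut -c1-20)
  # Unique part: CRC checksum over the complete text.
  sum=$(printf '%s' "$text" | cksum | cut -d' ' -f1)
  printf 'tts-%s-%s.ogg\n' "$slug" "$sum"
}

tts_filename "We are in mode one."
```

The same text always gives the same name, so an already generated sample
could be reused on the next run.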

Another idea was to not call pico2wave directly but to call an external
script that expects a defined set of parameters. This way it might be
easier to get this working on other OSes, and the user can also use
other tools.
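
Such a script could look roughly like the following sketch. The name
tts_render and the parameter order (language, output file, text) are
only a proposal, nothing tttool defines. It assumes that pico2wave and
oggenc are installed; on another OS only the two command lines in the
middle would have to be replaced.

```shell
#!/bin/sh
# Hypothetical wrapper interface: tts_render LANG OUTFILE TEXT
tts_render() {
  if [ "$#" -ne 3 ]; then
    echo "usage: tts_render LANG OUTFILE TEXT" >&2
    return 2
  fi
  lang=$1
  out=$2
  text=$3
  # Temporary WAV file next to the requested output file.
  wav="${out%.ogg}.tmp.wav"
  # Text to WAV (assumed tool: pico2wave), then WAV to Ogg (assumed: oggenc).
  pico2wave -l "$lang" -w "$wav" "$text" || return 1
  oggenc -Q -o "$out" "$wav" || { rm -f "$wav"; return 1; }
  rm -f "$wav"
}
```

Saved as a standalone script, the last line would be tts_render "$@",
and tttool would only need to know the path of the script.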

Uli