Synthesizer

The synthesizer bundled in this repo is a toy synthesizer of my voice telling the time in English. I built it with Festival, following freely available documentation and guides such as these lecture slides from Andrew Maas and this tutorial by Alan Black at CMU. Speech Zone is also an invaluable resource if you are just starting to learn about speech synthesis.

There are likely three things you might want to do to adapt this starter, so I will provide a short guide for each.

1. Use your voice for an English talking clock

Here are the steps to use your voice instead of mine for the talking clock:

  1. Record audio for the 24 sentences in model/eng_clock/etc/txt.done.data. I suggest using CSTR’s Speech Recorder; see the Speech Recorder Documentation for instructions on how to use it. The recordings must be named in the same format as the utterance IDs in that file (i.e. time0001, time0002, etc.). I recommend a 48 kHz sample rate.
  2. Move the audio files to model/eng_clock/recording
  3. Change directory: cd model/eng_clock
  4. Get the wavs (move them to the right directory and downsample): ./bin/get_wavs recording/*.wav
  5. Prune silence with ./bin/prune_silence, then look at the files in model/eng_clock/wav and check whether any were pruned too much. You can fix these manually if needed.
  6. Make label files: ./bin/make_labs prompt-wav/*.wav
  7. Build utterance structure: $FESTIVALDIR/bin/festival -b festvox/build_ldom.scm '(build_utts "etc/txt.done.data")'
  8. Extract pitchmarks: ./bin/make_pm_wave wav/*.wav
  9. Fix pitchmarks: ./bin/make_pm_fix pm/*.pm
  10. Power normalize: ./bin/simple_powernormalize wav/*.wav
  11. Get MCEP vectors: ./bin/make_mcep wav/*.wav
  12. Build synthesizer: $FESTIVALDIR/bin/festival -b festvox/build_ldom.scm '(build_clunits "etc/txt.done.data")'
  13. Commit your changes, then either rebuild your Docker container and run it locally or push to Heroku. Check the start guide for more info.
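
If you would rather not type steps 3 to 12 by hand each time, they can be wrapped in a small shell script. What follows is only a sketch, not something shipped in this repo. It assumes FESTIVALDIR is set, that you run it from the repo root, and that your recordings sit in model/eng_clock/recording (the directory the get_wavs command in step 4 reads from). The names run, missing_recordings, build_voice, DRY_RUN and VOICE_DIR are hypothetical helpers introduced here. The check in missing_recordings also assumes each prompt in txt.done.data is on its own line in the form ( time0001 "text" ).

```shell
#!/usr/bin/env bash
# Sketch only: steps 3-12 from the list above, wrapped in one function.
# DRY_RUN and VOICE_DIR are hypothetical variables, not part of festvox.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Print every prompt ID in a txt.done.data file that has no matching
# <id>.wav in the given directory.
# Assumes one prompt per line, in the form: ( time0001 "text" )
missing_recordings() {
  local prompts=$1 dir=$2 id
  sed -n 's/^[[:space:]]*([[:space:]]*\([^[:space:]"]*\).*/\1/p' "$prompts" |
    while read -r id; do
      [ -f "$dir/$id.wav" ] || printf '%s\n' "$id"
    done
}

# The body runs in a subshell, so the cd and set -e do not leak
# into the calling shell.
build_voice() (
  set -e  # stop at the first failing step
  festival="${FESTIVALDIR:?set FESTIVALDIR to your Festival install}/bin/festival"

  run cd "${VOICE_DIR:-model/eng_clock}"                                    # step 3
  run ./bin/get_wavs recording/*.wav                                        # step 4
  run ./bin/prune_silence                                                   # step 5
  run ./bin/make_labs prompt-wav/*.wav                                      # step 6
  run "$festival" -b festvox/build_ldom.scm '(build_utts "etc/txt.done.data")'     # step 7
  run ./bin/make_pm_wave wav/*.wav                                          # step 8
  run ./bin/make_pm_fix pm/*.pm                                             # step 9
  run ./bin/simple_powernormalize wav/*.wav                                 # step 10
  run ./bin/make_mcep wav/*.wav                                             # step 11
  run "$festival" -b festvox/build_ldom.scm '(build_clunits "etc/txt.done.data")'  # step 12
)
```

After sourcing the file, DRY_RUN=1 build_voice prints the ten commands without running anything, and build_voice on its own runs them in order. Running missing_recordings model/eng_clock/etc/txt.done.data model/eng_clock/recording first lists any prompt that still lacks a recording. Because the script runs straight through, you lose the manual check on pruned files from step 5, so it is probably worth running the steps by hand at least once.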

2. Build a talking clock for a different language

…coming…

3. Build a different synthesizer that ISN’T a talking clock

…coming…