blog dds: 2021-01-02 — Reviving the 1973 Unix text to voice translator

The early Research Edition Unix versions featured a program that would turn a stream of ASCII text into utterances that could be played by a voice synthesizer. The source code of this program was lost for years. Here’s the story of how I brought it back to life.

Finding the lost code

The (early 1973) Third Research Edition of Unix documented a program that would receive as input ASCII text and convert it into phonemes that could then be played by a Votrax voice synthesizer made by the Vocal Interface Division of Federal Screw Works. The program was written by M. D. McIlroy, who documented its operation in a detailed technical report.

Although the program appeared in the Unix manual pages up to the 1975 Sixth Research Edition, its source code was missing from the archives that had survived. Even its author lacked a copy.

Fortunately, in 2011, Jonathan Gevaryahu found most of the program’s source code in unallocated space of a Sixth Research Edition disk dump. (This means that the code was once stored on disk, but was later deleted, and the blocks where it resided were never allocated to other uses.) Even better, he could reconstruct the single missing block from the program’s compiled version, which was also available. Based on these findings, I added the speak source code and the speech rules to the GitHub repository of Unix history that I maintain.

Reviving the code

To see how the program worked, I tried to compile and run it. As the program was written in an ancient dialect of C and was also unlikely to be portable, I first tried to make it work on a Sixth Edition Unix running on a SIMH PDP-11 emulator. This attempt quickly failed, because the console wasn’t reliable enough to allow me to transfer the code via copy-paste.

I then ran the PDP-11 2.11 BSD Unix on the same emulator, which offers rudimentary internet connection capabilities. After configuring a .rhosts file to allow remote copying (to obtain remote access, you simply add your remote host and user name), I was able to move the code to that machine.

However, compiling the code wasn’t immediately possible. To make it compile I

changed the old =+, =^ to the modern +=, ^= operators,
added forward declarations for functions returning pointers,
inserted an assignment operator in initialized constants (int tflag 0; became int tflag = 0; I didn’t even know this form ever existed),
changed calls from seek to lseek, and
added a proper exit code to exit.
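
The changes in the list above can be illustrated with a short fragment of modern C. This is only a sketch, not code taken from speak: apart from tflag, all the names (skip_blanks, is_blank_line, fold, read_at, finish) are hypothetical, and the comments show how each line would have been spelled in the old dialect.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Old dialect: "int tflag 0;" with no assignment operator. */
int tflag = 0;

/* Forward declaration; an undeclared function was assumed to return int. */
char *skip_blanks(char *s);

/* Uses skip_blanks before its definition, hence the declaration above. */
int is_blank_line(char *s)
{
    return *skip_blanks(s) == '\0';
}

/* Return a pointer to the first non-space character of s. */
char *skip_blanks(char *s)
{
    while (*s == ' ')
        s++;
    return s;
}

/* Fold a string into a number using the modern compound operators. */
unsigned fold(const char *s)
{
    unsigned h = 0;

    while (*s != '\0') {
        h += (unsigned char)*s;   /* old dialect: h =+ ...; */
        h ^= h << 1;              /* old dialect: h =^ h << 1; */
        s++;
    }
    return h;
}

/* Read up to n bytes at a given offset; lseek replaces the old seek call. */
ssize_t read_at(int fd, off_t offset, void *buf, size_t n)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, n);
}

/* Terminate with an explicit status. */
void finish(void)
{
    exit(EXIT_SUCCESS);           /* old dialect: exit(); */
}
```

A modern compiler rejects every old spelling shown in the comments, which is why these edits had to come before any attempt at debugging.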

At that stage the program could compile, but was crashing when I tried to run it. Given that 2.11 BSD lacked gdb and was generally slow and difficult to use, I decided to port the program to modern Unix/Linux. I also added more declarations, including full function prototypes to find other problems. (In early versions of C you didn’t need to declare a function before using it.) I then methodically removed all compiler warnings, which allowed me to pinpoint a variable that was declared as a pointer but used as an integer. By correcting its declaration I fixed the initial crash.

Now I had a program that compiled and ran, but it was still crashing in some cases and wasn’t producing correct output. Fixing this required further changes.

I replaced writing to a string with a write to a char array.
I corrected the size of a structure that was assumed to be 4.
I documented some functions to be able to follow the program logic.
I fixed the assumption that integers occupied two bytes.
I replaced integers initialized as pairs of characters (e.g. 'u1') with a macro that initialized the value in an endian-neutral manner.
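
The last two fixes in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code from the repository: the macro name CHARPAIR and the helpers get_word and put_word are hypothetical, and the byte order (first character in the low-order byte) is assumed to follow the PDP-11 convention for two-character constants.

```c
#include <stdint.h>

/*
 * Build the 16-bit value that a two-character constant such as 'u1'
 * denoted on the PDP-11: first character in the low-order byte,
 * second character in the high-order byte (assumed convention).
 * The result comes from arithmetic only, so it does not depend on
 * the byte order of the host.
 */
#define CHARPAIR(a, b) \
    ((uint16_t)((unsigned char)(a) | ((unsigned)(unsigned char)(b) << 8)))

/* Decode a 16-bit word stored low byte first, whatever the host is. */
uint16_t get_word(const unsigned char *p)
{
    return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}

/* Encode a 16-bit word into a byte buffer in the same fixed order. */
void put_word(unsigned char *p, uint16_t w)
{
    p[0] = (unsigned char)(w & 0xff);
    p[1] = (unsigned char)(w >> 8);
}
```

Using uint16_t instead of int states the two-byte assumption explicitly, so data laid out for a 16-bit machine keeps its meaning on a host with wider integers.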

After these changes the program was able to compile the rules file and produce Votrax phoneme codes.

Getting voice output

Votrax voice synthesizers and their descendant chips (which appear to use similar phoneme codes) are no longer marketed, so I needed a workaround to listen to the generated voice. My first attempt was to use samples from the votrax-speak GitHub repository. Converting the Votrax phoneme codes into their mnemonic names, and passing the corresponding sample files as arguments to SoX, allowed me to create a sound file consisting of the phonemes played one after the other. However, the generated sound file was almost unintelligible. As I read later, a great advantage of Votrax synthesizers was how they merged the phonemes together into continuous speech, which was not the case with my approach.

My second attempt involved using the phoneme output functionality of the espeak-ng program. For this I created a map between the Votrax phoneme codes and the corresponding espeak phonemes, which I then coded into a sed script that would feed espeak with the output of Unix speak. Through this method I was finally able to produce somewhat intelligible speech with a pipeline, such as the following.

echo Hello world |
speak speak.m |
LC_ALL=C ./votrax-espeak.sed |
espeak
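
The mapping step in this pipeline is a sed script. The same table-lookup idea can be sketched in C, which may make the translation easier to follow. Everything here is hypothetical: the numeric codes are placeholders rather than real Votrax values, the mnemonics are only illustrative, and wrapping the result in [[ ]] assumes that the receiving program accepts phoneme input in that notation.

```c
#include <stddef.h>
#include <string.h>

/* One table entry: a synthesizer code and the mnemonic to emit for it. */
struct phoneme_map {
    unsigned char code;     /* placeholder value, NOT a real Votrax code */
    const char *mnemonic;   /* illustrative phoneme name */
};

static const struct phoneme_map table[] = {
    { 0x01, "h" },
    { 0x02, "E" },
    { 0x03, "l" },
    { 0x04, "oU" },
};

/* Return the mnemonic for a code, or NULL when the code is unmapped. */
const char *lookup(unsigned char code)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].code == code)
            return table[i].mnemonic;
    return NULL;
}

/* Append s to out if it fits, keeping out NUL-terminated. */
static int append(char *out, size_t cap, size_t *used, const char *s)
{
    size_t len = strlen(s);

    if (*used + len + 1 > cap)
        return -1;
    memcpy(out + *used, s, len + 1);
    *used += len;
    return 0;
}

/*
 * Translate n codes into out (capacity cap), wrapped in [[ ]].
 * Unmapped codes are skipped. Returns 0 on success, -1 if out is too small.
 */
int translate(const unsigned char *codes, size_t n, char *out, size_t cap)
{
    size_t used = 0, i;

    if (append(out, cap, &used, "[[") != 0)
        return -1;
    for (i = 0; i < n; i++) {
        const char *m = lookup(codes[i]);

        if (m != NULL && append(out, cap, &used, m) != 0)
            return -1;
    }
    return append(out, cap, &used, "]]");
}
```

With the placeholder table, the codes 1, 2, 3, 4 translate to the string [[hEloU]]. A real table would need one entry for each phoneme code the synthesizer defines.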

Code availability

The revived source code is available in this GitHub repository.
