From text to speech: The MITalk system
way. Most recently, the entire system has been converted to run under UNIX,
written mainly in PASCAL with some routines in C. This is a highly flexible system,
and introduces a new overall control program which allows various subsets of
the system to be effectively utilized. There is also an ability to monitor the system
at several levels of detail, thus providing the user with substantial insight into the
overall workings of the system. This version of the system, which is the basis for
current distribution from MIT, is described in detail later in this chapter.
14.3 Performance system
The structure of the development system, even in its contemporary UNIX implementation,
is not suitable for compact, real-time, and economical utilization in
practical contexts. For such uses, less flexibility is required, and special purpose
hardware is necessary. For example, the lexicons and rule bases can be stored in
high-density memory without the necessity for utilizing electromechanical disks.
A general purpose microprocessor can be utilized to provide overall system control
and to perform the linguistic analysis and prosodic synthesis, up to the point where
the phonemic synthesis is converted to the output speech waveform. Finally, a signal
processing chip can perform all of the phonemic synthesis to waveform conversion
in real-time, thus meeting the overall requirements of a practical, high-performance
system. Current commercial systems, many of which are based on
the licensing of the MITalk system, readily provide this capability. It is important
to emphasize that there are no significant hardware limitations to the real-time and
economic usage of the entire span of MITalk algorithms. In the past, concerns
were expressed about the size of the lexicon and the real-time signal processing
requirements, but these requirements pose no difficulties for modern technology.
In the future, one can conceive of the entire MITalk system implemented on a
single integrated-circuit wafer, or in a small set of chips. In this way, ASCII
characters can be converted to output speech waveforms in many different environments,
including highly compact terminals. While a wafer-scale system must
be viewed as highly aggressive technology in the mid-1980s, there is no inherent
difficulty in achieving such a system. There is no question that highly complex
and capable text-to-speech systems will be available in such compact formats in
the near future.
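The division of labor described in this section, with a general purpose processor handling control and analysis and a signal processing chip handling waveform conversion, can be pictured as a two-stage pipeline. The following is only a minimal sketch under stated assumptions: the names Frame, host_stage, signal_stage, and text_to_samples are hypothetical, and both stages are placeholders that do not reproduce any MITalk algorithm.

```python
from typing import Iterator, List, NamedTuple


class Frame(NamedTuple):
    """Hypothetical unit of data handed from the host stage to the signal stage."""
    symbol: str
    duration_samples: int


def host_stage(text: str, samples_per_symbol: int = 4) -> Iterator[Frame]:
    """Stand-in for control and analysis on the general purpose processor.

    Placeholder logic: emits one frame per non-space character.
    This is NOT the MITalk linguistic analysis, only an illustration.
    """
    for ch in text:
        if not ch.isspace():
            yield Frame(ch.lower(), samples_per_symbol)


def signal_stage(frames: Iterator[Frame]) -> Iterator[int]:
    """Stand-in for the signal processing chip.

    Placeholder logic: repeats the character code as the sample value.
    No real synthesis is performed here.
    """
    for frame in frames:
        for _ in range(frame.duration_samples):
            yield ord(frame.symbol)


def text_to_samples(text: str) -> List[int]:
    """Chain the two stages: ASCII characters in, sample values out."""
    return list(signal_stage(host_stage(text)))


if __name__ == "__main__":
    samples = text_to_samples("Hi there")
    print(len(samples), "samples produced")
```

Because the stages are generators, the second stage consumes frames as the first produces them, which loosely mirrors a host processor feeding a dedicated signal processor.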
14.4 UNIX implementation
As mentioned above, the present version of the development system consists of a
set of PASCAL and C programs which run in a UNIX operating system environment.
There is one program per speech processing module described in previous
chapters. In addition, there is a coordinator program which serves as the user-