From text to speech: The MITalk system

way. Most recently, the entire system has been converted to run under UNIX,
written mainly in PASCAL with some routines in C. This is a highly flexible
system, and introduces a new overall control program which allows various
subsets of the system to be effectively utilized. There is also an ability to
monitor the system at several levels of detail, thus providing the user with
substantial insight into the overall workings of the system. This version of
the system, which is the basis for current distribution from MIT, is described
in detail later in this chapter.

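The idea of a control program that runs a chosen subset of modules and reports on them at a selectable level of detail can be illustrated with a minimal sketch. It is written in Python for brevity and is not the MITalk coordinator: the stage names, the toy stage functions, and the `run` function with its `first`, `last`, and `monitor` parameters are all hypothetical, introduced only to show the pattern.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins for speech processing modules. The real modules
# are separate PASCAL and C programs; these toy functions only transform text.
def format_text(data: str) -> str:
    return " ".join(data.split())        # collapse runs of whitespace

def to_upper(data: str) -> str:
    return data.upper()                  # placeholder "analysis" step

def mark_boundaries(data: str) -> str:
    return data.replace(" ", " | ")      # placeholder boundary marking

Stage = Tuple[str, Callable[[str], str]]

# Assumed fixed ordering of stages, each feeding the next.
PIPELINE: List[Stage] = [
    ("format", format_text),
    ("upper", to_upper),
    ("boundaries", mark_boundaries),
]

def run(text: str,
        first: Optional[str] = None,
        last: Optional[str] = None,
        monitor: int = 0) -> Tuple[str, List[str]]:
    """Run the stages from `first` through `last` (inclusive).

    monitor = 0 records nothing, 1 records stage names,
    2 also records each stage's intermediate output.
    """
    names = [name for name, _ in PIPELINE]
    start = names.index(first) if first else 0
    stop = names.index(last) + 1 if last else len(PIPELINE)
    trace: List[str] = []
    data = text
    for name, stage in PIPELINE[start:stop]:
        data = stage(data)
        if monitor >= 1:
            trace.append(name)
        if monitor >= 2:
            trace.append(f"  -> {data!r}")
    return data, trace
```

Calling `run("hello   world", last="upper", monitor=1)` stops after the second stage and returns the names of the stages that ran, which mirrors, in a very reduced form, the ability to exercise part of the system while observing its behavior.
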
14.3 Performance system

The structure of the development system, even in its contemporary UNIX
implementation, is not suitable for compact, real-time, and economical
utilization in practical contexts. For such uses, less flexibility is required,
and special purpose hardware is necessary. For example, the lexicons and rule
bases can be stored in high-density memory without the necessity for utilizing
electromechanical disks. A general purpose microprocessor can be utilized to
provide overall system control and to provide the linguistic analysis and
prosodic synthesis up to the level of the phonemic synthesis conversion to the
output speech waveform. Finally, a signal processing chip can perform all of
the phonemic synthesis to waveform conversion in real-time, thus meeting the
overall requirements of a practical, high-performance system. Current
commercial systems, many of which are based on the licensing of the MITalk
system, readily provide this capability. It is important to emphasize that
there are no significant hardware limitations to the real-time and economic
usage of the entire span of MITalk algorithms. In the past, concerns were
expressed about the size of the lexicon and the real-time signal processing
requirements, but these requirements pose no difficulties for modern
technology.

In the future, one can conceive of the entire MITalk system implemented on a
single integrated-circuit wafer, or in a small set of chips. In this way, ASCII
characters can be converted to output speech waveforms in many different
environments, including highly compact terminals. While a wafer-scale system
must be viewed as highly aggressive technology in the mid-1980s, there is no
inherent difficulty in achieving such a system. There is no question that
highly complex and capable text-to-speech systems will be available in such
compact formats in the near future.

14.4 UNIX implementation

As mentioned above, the present version of the development system consists of a
set of PASCAL and C programs which run in a UNIX operating system environment.
There is one program per speech processing module described in previous
chapters. In addition, there is a coordinator program which serves as the user-