The phonological component

strength of various boundaries in the vicinity of a desired break. If necessary, such pauses may be inserted between a content word and a function word, even if no phrase boundary has been detected by syntactic analysis routines.
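
As a rough illustration of choosing a pause location by boundary strength, the following minimal sketch picks the strongest boundary within a few words of a desired break. The `Boundary` class, the numeric strength scale, the `window` parameter, and the tie-breaking rule are all assumptions made for this example; they are not taken from the MITalk modules.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    position: int  # index of the word after which the boundary falls
    strength: int  # hypothetical score: larger means a stronger boundary


def choose_pause(boundaries, desired, window=3):
    """Return the position of the strongest boundary near `desired`.

    Only boundaries within `window` words of the desired break are
    considered. Ties in strength go to the boundary closest to the
    desired break. Returns None when no boundary is close enough.
    """
    nearby = [b for b in boundaries if abs(b.position - desired) <= window]
    if not nearby:
        return None
    best = max(nearby, key=lambda b: (b.strength, -abs(b.position - desired)))
    return best.position


# Hypothetical strengths: 1 = content/function word juncture,
# 2 = phrase boundary, 3 = clause boundary.
example = [Boundary(2, 1), Boundary(5, 3), Boundary(7, 2), Boundary(12, 3)]
print(choose_pause(example, desired=6))
```

With these made-up values, a weak content/function word juncture would be selected only when no stronger boundary lies inside the window, which mirrors the fallback behaviour described above.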

8.7 Evaluation of the analysis modules

The following text was input to the analysis modules of MITalk:

“This recording is a demonstration of speech synthesis by rule and automatic text-to-speech conversion.

Applications for synthetic speech output fall into four broad categories: those applications that require (1) a single word response (e.g. Speak and Spell), (2) a limited set of messages with a rigid syntactic framework (e.g. telephone number information), (3) a large vocabulary with general English syntax (e.g. teaching machine lessons), and (4) fully general English text to speech (e.g. for a reading machine for the blind).

Prerecorded messages work well for single word response applications, but an increasing knowledge of the acoustic-phonetic characteristics of speech, of phonology, and of syntax is required for satisfactory synthesis of general English. In order to generate a particular utterance, one must specify a phonemic representation for each word, a stress pattern for each word, certain aspects of the syntactic structure of the sentence, such as the locations of phrase and clause boundaries, and the locations of any words that are to receive semantic focus.

This information could be typed into a computer terminal, as was done in this case, or the information might be generated automatically from a deep-structure representation of the concept to be expressed. The speech that you have just heard was produced in June 1979 by the synthesis-by-rule portions of the MITalk text-to-speech system that is being developed at MIT”.

The output from PHONO2 is given below. Erroneous segments are underlined and the corrections are given as subscripts. A null subscript () means that the segment should be deleted.
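
The marking convention just described (replace a segment by its subscript, delete it when the subscript is null) can be sketched in a few lines. This is only an illustration: the pair representation, the function names, and the sample data are hypothetical and are not drawn from the PHONO2 output.

```python
def apply_corrections(marked):
    """Apply correction marks to a list of (segment, correction) pairs.

    correction is None  -> the segment is correct and is kept
    correction is ""    -> null subscript: the segment is deleted
    otherwise           -> the segment is replaced by the correction
    """
    corrected = []
    for segment, correction in marked:
        if correction is None:
            corrected.append(segment)
        elif correction == "":
            continue
        else:
            corrected.append(correction)
    return corrected


def count_errors(marked):
    """Count the segments that carry a correction mark of either kind."""
    return sum(1 for _, correction in marked if correction is not None)


# Hypothetical marked-up sequence, invented for illustration only.
sample = [("SS", None), ("IH", "EH"), (")C", ""), ("TT", None)]
print(apply_corrections(sample))
print(count_errors(sample))
```

Keeping the marks alongside the raw segments, rather than editing the output directly, makes it possible to recover both the original output and the corrected form, and to tally errors per module.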

F: DH "IH SS C: RR IH,y KK 'OXR DD * IH NG )C,,! F: IH ZZ F:
AX C: DD "EH MM AX NN * SS TT RR 'EY SH AX NN F: AX VV C:
SS PP 'IY CH SH C: SS 'IH NN TH AX SS AX SS F: BB AY C: RR 'UW IX
)C F: AE NN C: "AO DX AX MM 'AE DX IH KK C: TT 'EH KK SS TT F:
TT AX C: SS PP 'IY CH SH C: KK AX NN VV 'ER ZH AX NN . C:
"AE PP LL IH,, KK *z? 'EY SH AX NN * ZZ F: FF OXR C:
SS IH NN TH 'EH DX IH KK AXPyz )Cy C: SS PP 'IY CH SH C:

1 Too many extra “)C” pauses added.

2 Morph boundary between root and bound morph has detrimental effect.