The phonological component
strength of various boundaries in the vicinity of a desired break. If necessary, such
pauses may be inserted between a content and function word, even if no phrase
boundary has been detected by syntactic analysis routines.
8.7 Evaluation of the analysis modules
The following text was input to the analysis modules of MITalk:
“This recording is a demonstration of speech synthesis by rule
and automatic text-to-speech conversion.
Applications for synthetic speech output fall into four broad
categories: those applications that require (1) a single word response
(e.g. Speak and Spell), (2) a limited set of messages with a rigid
syntactic framework (e.g. telephone number information), (3) a large
vocabulary with general English syntax (e.g. teaching machine
lessons), and (4) fully general English text to speech (e.g. for a reading
machine for the blind).
Prerecorded messages work well for single word response
applications, but an increasing knowledge of the acoustic-phonetic
characteristics of speech, of phonology, and of syntax is required for
satisfactory synthesis of general English. In order to generate a
particular utterance, one must specify a phonemic representation for each
word, a stress pattern for each word, certain aspects of the syntactic
structure of the sentence, such as the locations of phrase and clause
boundaries, and the locations of any words that are to receive semantic
focus.
This information could be typed into a computer terminal, as was
done in this case, or the information might be generated automatically
from a deep-structure representation of the concept to be expressed.
The speech that you have just heard was produced in June 1979 by the
synthesis-by-rule portions of the MITalk text-to-speech system that is
being developed at MIT”.
The output from PHONO2 is given below. Erroneous segments are
underlined and the corrections are given as subscripts. A null subscript () means that
the segment should be deleted.
F: DH "IH SS C: RR IH,y KK 'OXR DD * IH NG )C,,! F: IH ZZ F:
AX C: DD "EH MM AX NN * SS TT RR 'EY SH AX NN F: AX VV C:
SS PP IY CH SH C: SS 'IH NN TH AX SS AX SS F: BB AY C: RR 'UW IX
)C F: AE NN C: "AO DX AX MM AE DX IH KK C: TT 'EH KK SS TT F:
TT AX C: SS PP YIY CH SH C: KK AX NN VV 'ER ZH AX NN . C:
"AE PP LL IH,, KK *z? 'EY SH AX NN * ZZ F: FF OXR C:
SS IH NN TH 'EH DX IH KK AXPyz )Cy C: SS PP IY CH SH C:
1 Too many extra “)C” pauses added.
2 Morph boundary between root and bound morph has detrimental effect.
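
The word-level structure of the PHONO2 output above can be explored with a short script. The following Python sketch is illustrative only and is not part of MITalk. It assumes that "F:" opens a function word and "C:" opens a content word (a reading suggested by the content/function distinction drawn earlier, not stated in this section), and the name split_words is hypothetical. The sample string is a fragment copied from the output above.

```python
from typing import List, Tuple

# Assumed reading of the markers (not stated in this section):
# "F:" opens a function word, "C:" opens a content word.
WORD_MARKERS = {"F:": "function", "C:": "content"}


def split_words(transcription: str) -> List[Tuple[str, List[str]]]:
    """Group whitespace-separated symbols into (word_class, symbols) pairs."""
    words: List[Tuple[str, List[str]]] = []
    for token in transcription.split():
        if token in WORD_MARKERS:
            # A marker starts a new word of the given class.
            words.append((WORD_MARKERS[token], []))
        elif words:
            # Every other symbol (phoneme, stress-marked phoneme, "*",
            # ")C", ".") is attached to the most recently opened word.
            words[-1][1].append(token)
        # Symbols that precede the first marker are ignored.
    return words


# Fragment of the output above ("a demonstration of").
sample = "F: AX C: DD \"EH MM AX NN * SS TT RR 'EY SH AX NN F: AX VV"

for word_class, symbols in split_words(sample):
    print(word_class, " ".join(symbols))
```

Boundary symbols such as "*" and ")C" are kept inside the word they follow, so a later pass could count them per word when tallying errors of the kind listed in the notes above.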