From text to speech: The MITalk system
ized dental stop TQ (i.e. has a glottal release rather than a t-burst) if the next word starts with a stressed sonorant (unless there is a clause boundary between the words, in which case the TT is released into a pause). Examples: “that one”, “Mat ran”.

4. A voiceless plosive is not released if the next phonetic segment is another voiceless plosive within the same clause.

5. A glottal stop is inserted before a word-initial stressed vowel if the preceding segment is syllabic (and not a determiner), or if the preceding segment is a voiced nonplosive and there is an intervening phrase boundary. Example: “Liz eats”.

6. The word “the” is pronounced DH IY if the next word starts with a vowel.

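As a rough illustration, rules 4 and 6 above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the MITalk implementation: the function names, the `_unreleased` suffix, the `|` clause-boundary marker, the vowel inventory, and the symbols PP, KK, and AX are hypothetical choices made for the example. Stress marks are ignored.

```python
# Illustrative symbol sets. These inventories are assumptions for the sketch,
# not a list taken from the text.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AX", "AY", "EH", "EY",
          "IH", "IY", "OW", "OY", "UH", "UW"}
VOICELESS_PLOSIVES = {"PP", "TT", "KK"}
CLAUSE_BOUNDARY = "|"  # hypothetical marker placed between clauses


def apply_rule_6(words):
    """Rule 6: pronounce "the" as DH IY when the next word starts with a vowel.

    `words` is a list of (spelling, [phoneme symbols]) pairs.
    """
    out = []
    for i, (spelling, phones) in enumerate(words):
        phones = list(phones)
        next_phones = words[i + 1][1] if i + 1 < len(words) else []
        if spelling.lower() == "the" and next_phones and next_phones[0] in VOWELS:
            phones = ["DH", "IY"]
        out.append((spelling, phones))
    return out


def apply_rule_4(segments):
    """Rule 4: a voiceless plosive is unreleased before another voiceless plosive.

    `segments` is a flat list of symbols. A CLAUSE_BOUNDARY marker between two
    plosives blocks the rule, because the next item is then not a plosive.
    """
    out = []
    for i, seg in enumerate(segments):
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if seg in VOICELESS_PLOSIVES and nxt in VOICELESS_PLOSIVES:
            out.append(seg + "_unreleased")  # hypothetical allophone label
        else:
            out.append(seg)
    return out
```

Keeping each rule as a separate pass over the symbol string mirrors the way the rules are stated, one condition per rule, and makes it easy to test them individually.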
8.5.1 An example

If the six rules of segmental phonology are applied to the sentence shown in Figure 8-1, three allophonic changes are made. The sixth rule replaces the schwa by IY in the word “the”. The first rule replaces the phoneme LL by a postvocalic allophone in the word “old”. Finally, the second rule replaces the TT in “sat” by an alveolar flap DX. The string of symbols in the lower portion of Figure 8-1 is thus a broad phonetic transcription of the utterance to be synthesized. As the output of the phonological component, it serves as the input to the prosodic component PROSOD that is described in Chapter 9.

8.6 Pauses

Pauses are often used in speech production to mark major syntactic boundaries. Both pauses and prepausal lengthening are important to guide the listener’s perception of the underlying syntactic structure of a sentence (Klatt, 1976b). A system of rules has been worked out for determining the locations of pauses in the synthesis, and the duration of each kind of pause.

Pauses of 800 msec, sufficient for a real speaker to take a breath, are introduced after any sentence of more than five words. A longer pause of 1200 msec appears at the end of paragraphs. Brief sentence-internal pauses (400 msec) are triggered by punctuation marks contained in the text, or are inserted by PHONOL1 at detectable clause boundaries.

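The pause durations just described can be summarized in a small sketch. The function and constant names are hypothetical. The text does not say what happens after sentences of five words or fewer, nor whether the paragraph pause replaces or adds to the sentence pause; the sketch assumes no pause in the first case and replacement in the second.

```python
SENTENCE_PAUSE_MS = 800     # after a sentence of more than five words
PARAGRAPH_PAUSE_MS = 1200   # at the end of a paragraph
INTERNAL_PAUSE_MS = 400     # at punctuation or a detected clause boundary
MIN_WORDS_EXCLUSIVE = 5     # "more than five words"


def final_pause_ms(word_count, ends_paragraph=False):
    """Pause inserted after a sentence, in milliseconds."""
    if ends_paragraph:
        # Assumption: the longer pause replaces the sentence pause.
        return PARAGRAPH_PAUSE_MS
    if word_count > MIN_WORDS_EXCLUSIVE:
        return SENTENCE_PAUSE_MS
    # Assumption: short sentences get no breath pause.
    return 0


def internal_pause_ms(has_punctuation, at_clause_boundary):
    """Brief sentence-internal pause, in milliseconds."""
    if has_punctuation or at_clause_boundary:
        return INTERNAL_PAUSE_MS
    return 0
```

For example, `final_pause_ms(8)` gives the 800 msec breath pause, while `final_pause_ms(8, ends_paragraph=True)` gives the 1200 msec paragraph pause.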
It is desirable to insert another kind of pause in certain sentence-internal positions of very long sentences because of the talker’s limited lung volume. An algorithm has been developed for locating such pauses that is based on the number of syllables on either side of the potential sentence-internal breath pause, and the