From text to speech: The MITalk system
The normal pattern is considered to be one of alternating accented and
unaccented syllables. An accented syllable is a stressed syllable of a content
word; unaccented syllables are all others. If two accented syllables are
adjacent, their rise values are reduced by 40 percent. Two accented syllables
separated by two, three, or four unaccented syllables have their rise values
increased by 15 percent, 20 percent, and 30 percent, respectively. Additional
unaccented syllables cause no further effect. The peak height on an accented
syllable preceded by two or three unaccented syllables is decreased by
15 percent and 25 percent, respectively. However, the peak height on an
accented syllable followed by two or three unaccented syllables is increased by
10 percent and 15 percent, respectively. If three accented syllables appear in
succession, the fundamental frequency of the second is allowed to fall from the
peak of the first, and rise into the peak of the third, i.e., its fall and rise
are interchanged in time. A word not covered by a node, and preceded by three
or more unaccented syllables, is assigned a rise value equal to the difference
between its peak value and 95 Hz.
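
The percentage adjustments above can be read as lookup tables of scaling factors. The following Python fragment is only a sketch of that reading, not the MITalk code: the function and table names are hypothetical, and leaving counts the text does not mention (for example, four or more unaccented syllables next to a peak) unscaled is an assumption.

```python
# Rise scaling, keyed by the number of unaccented syllables that separate
# two accented syllables. One unaccented syllable is the "normal" alternating
# pattern, so it is left unscaled.
RISE_FACTOR = {0: 0.60, 1: 1.00, 2: 1.15, 3: 1.20, 4: 1.30}

# Peak-height scaling, keyed by the number of unaccented syllables that
# precede or follow the accented syllable. Only the counts 2 and 3 are
# given in the text.
PEAK_FACTOR_PRECEDED = {2: 0.85, 3: 0.75}
PEAK_FACTOR_FOLLOWED = {2: 1.10, 3: 1.15}


def rise_factor(unaccented_between: int) -> float:
    """Scaling applied to the rise values of two accented syllables."""
    if unaccented_between < 0:
        raise ValueError("syllable count cannot be negative")
    # More than four unaccented syllables cause no further effect.
    return RISE_FACTOR[min(unaccented_between, 4)]


def peak_factor_preceded(unaccented_before: int) -> float:
    """Peak scaling from preceding unaccented syllables (assumed 1.0 if unlisted)."""
    return PEAK_FACTOR_PRECEDED.get(unaccented_before, 1.0)


def peak_factor_followed(unaccented_after: int) -> float:
    """Peak scaling from following unaccented syllables (assumed 1.0 if unlisted)."""
    return PEAK_FACTOR_FOLLOWED.get(unaccented_after, 1.0)
```

For example, rise_factor(0) gives 0.60 for two adjacent accented syllables, and rise_factor(7) gives the same 1.30 as rise_factor(4).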
Words in terminal positions are given special rise and fall values. In a
statement (Tune A), the last syllable is given a fall value such that F0 reaches
75 Hz. In a yes/no question (Tune B), a rise is assigned after the last accented
syllable's fall (none if it is the last syllable), which gives a final F0 value
20 percent higher than any previous peak.
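
As a worked illustration of the two terminal tunes, the sketch below computes the fall and rise values in Hz. The function names and arguments are hypothetical, and reading "any previous peak" as the highest previous peak is an interpretation rather than something the text states outright.

```python
TUNE_A_FLOOR_HZ = 75.0      # final F0 of a statement
TUNE_B_PEAK_RATIO = 1.20    # final F0 is 20 percent above the previous peaks


def tune_a_fall(f0_before_fall_hz: float) -> float:
    """Fall value for the last syllable so that F0 ends at 75 Hz."""
    return f0_before_fall_hz - TUNE_A_FLOOR_HZ


def tune_b_rise(f0_after_last_fall_hz: float, previous_peaks_hz: list[float]) -> float:
    """Final rise for a yes/no question.

    Assumes the target is 20 percent above the highest previous peak.
    """
    target_hz = TUNE_B_PEAK_RATIO * max(previous_peaks_hz)
    return target_hz - f0_after_last_fall_hz
```

With F0 at 110 Hz before the final fall, tune_a_fall returns 35 Hz. With previous peaks of 120 Hz and 130 Hz and F0 at 100 Hz after the last fall, tune_b_rise targets 156 Hz, a rise of about 56 Hz.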
The highest continuation rise (16 Hz) is assigned to the last syllable of a
word, if it is followed by a nonterminal punctuation mark or a conjunction, and
if there has been no punctuation or conjunction since the last content word. A
continuation rise of 8 Hz is assigned to the last syllable of the last word in a
nonfinal phrase, if there have been more than five words since the last word to
which a continuation rise was assigned.
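
The two continuation-rise conditions can be written as a small decision function. This is a sketch under stated assumptions: the boolean flags and the word counter are hypothetical inputs that some earlier analysis stage would have to supply, and testing the 16 Hz rule before the 8 Hz rule is an assumption about precedence.

```python
def continuation_rise_hz(
    followed_by_break: bool,
    break_since_last_content_word: bool,
    ends_nonfinal_phrase: bool,
    words_since_last_rise: int,
) -> float:
    """Continuation rise (Hz) for the last syllable of a word.

    followed_by_break: the word is followed by a nonterminal punctuation
        mark or a conjunction.
    break_since_last_content_word: a punctuation mark or conjunction has
        occurred since the last content word.
    ends_nonfinal_phrase: the word is the last word of a nonfinal phrase.
    words_since_last_rise: words since the last continuation rise.
    """
    if followed_by_break and not break_since_last_content_word:
        return 16.0
    if ends_nonfinal_phrase and words_since_last_rise > 5:
        return 8.0
    return 0.0  # no continuation rise
```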
If two accented syllables are separated by unaccented syllables, the F0 contour
connecting them is either straight or falling. If the difference between the
endpoints of the two accented syllables is positive, the previous fall and next
rise are adjusted by the same amount (half the difference), so that the F0
contour does not change on intermediate unaccented syllables.
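
One way to read the positive-difference case is sketched below. The sign convention (difference = F0 at the start of the next rise minus F0 at the end of the previous fall) and the direction of each adjustment (a smaller previous fall, a larger next rise) are assumptions chosen so that the two endpoints meet. The text itself only says that both values are adjusted by half the difference. The function name is hypothetical.

```python
def flatten_gap(
    prev_fall_hz: float,
    next_rise_hz: float,
    prev_end_hz: float,
    next_start_hz: float,
) -> tuple[float, float, float, float]:
    """Return adjusted (prev_fall, next_rise, prev_end, next_start).

    If the next accented syllable would start above the point where the
    previous one ended, both endpoints are moved to their midpoint so the
    contour stays level across the unaccented syllables in between.
    """
    difference = next_start_hz - prev_end_hz
    if difference <= 0:
        # Level or falling contour: this rule does not apply.
        return prev_fall_hz, next_rise_hz, prev_end_hz, next_start_hz
    half = difference / 2.0
    return (
        prev_fall_hz - half,   # previous syllable falls less
        next_rise_hz + half,   # next syllable rises more, from a lower start
        prev_end_hz + half,
        next_start_hz - half,
    )
```

For a previous fall of 30 Hz ending at 100 Hz and a next rise of 20 Hz starting at 110 Hz, both endpoints move to 105 Hz and both the fall and the rise become 25 Hz.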
If the difference in endpoints is negative, the fall is spread over the
intermediate unaccented syllables in one of two ways. If the unaccented
syllables occur within a phrase, the falling rate is linear: each successive
unaccented syllable gets an equal share of the fall. For unaccented syllables
which are not in the same phrase, a more exponential falling pattern is
assigned, with the earlier unaccented syllables receiving more of the fall.
Unaccented syllables terminating either a Tune A or Tune B clause fall or rise
in equal amounts to the final value.
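
The two ways of spreading a fall can be illustrated as follows. This is a minimal sketch: the function names are hypothetical, and the geometric weighting with a ratio of 0.5 is an assumed stand-in for the "more exponential" pattern, whose actual shape is not given in the text.

```python
def linear_shares(total_fall_hz: float, n_syllables: int) -> list[float]:
    """Within a phrase: every unaccented syllable gets an equal share."""
    if n_syllables < 1:
        raise ValueError("need at least one unaccented syllable")
    return [total_fall_hz / n_syllables] * n_syllables


def front_loaded_shares(
    total_fall_hz: float, n_syllables: int, ratio: float = 0.5
) -> list[float]:
    """Across a phrase boundary: earlier syllables take more of the fall.

    Each share is `ratio` times the previous one (an assumed decay), and
    the shares are scaled so that they sum to the total fall.
    """
    if n_syllables < 1:
        raise ValueError("need at least one unaccented syllable")
    weights = [ratio ** i for i in range(n_syllables)]
    scale = total_fall_hz / sum(weights)
    return [w * scale for w in weights]
```

A 12 Hz fall over three syllables within a phrase gives shares of 4, 4, and 4 Hz. With the assumed ratio of 0.5, a 14 Hz fall over three syllables across a phrase boundary gives 8, 4, and 2 Hz.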