From text to speech: The MITalk system
The normal pattern is considered to be one of alternating accented and un-
accented syllables. An accented syllable is a stressed syllable of a content word;
unaccented syllables are all others. If two accented syllables are adjacent, their
rise values are reduced by 40 percent. Two accented syllables separated by two,
three, or four unaccented syllables have their rise values increased by 15 percent,
20 percent, and 30 percent, respectively. Additional unaccented syllables cause no
further effect. The peak height on an accented syllable preceded by two or three
unaccented syllables is decreased by 15 percent and 25 percent, respectively.
However, an accented syllable followed by two or three unaccented syllables is in-
creased by 10 percent and 15 percent, respectively. If three accented syllables ap-
pear in succession, the fundamental frequency of the second is allowed to fall from
the peak of the first, and rise into the peak of the third, i.e., its fall and rise are
interchanged in time. A word not covered by a node, and preceded by three or
more unaccented syllables, is assigned a rise value equal to the difference between
its peak value and 95 Hz.
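
The percentage adjustments above can be read as lookup rules. The following is a minimal sketch, not the MITalk implementation: the function names are hypothetical, and two points are assumptions because the text does not state them. A single intervening unaccented syllable (the normal alternating pattern) is taken to leave the rise unchanged, and peak adjustments for counts other than two or three are taken to be absent.

```python
def rise_scale(unaccented_between: int) -> float:
    """Multiplier applied to the rise values of two accented syllables,
    given how many unaccented syllables separate them."""
    if unaccented_between == 0:
        return 0.60  # adjacent accents: rise reduced by 40 percent
    if unaccented_between == 1:
        return 1.00  # assumption: normal alternating pattern, unchanged
    if unaccented_between == 2:
        return 1.15
    if unaccented_between == 3:
        return 1.20
    return 1.30      # four or more: further syllables have no extra effect


# Peak-height multipliers; only the counts named in the text are listed.
PEAK_SCALE_PRECEDED = {2: 0.85, 3: 0.75}  # unaccented syllables before the accent
PEAK_SCALE_FOLLOWED = {2: 1.10, 3: 1.15}  # unaccented syllables after the accent


def peak_scale_preceded(n_unaccented: int) -> float:
    # Assumption: counts not named in the text leave the peak unchanged.
    return PEAK_SCALE_PRECEDED.get(n_unaccented, 1.0)


def peak_scale_followed(n_unaccented: int) -> float:
    # Assumption: counts not named in the text leave the peak unchanged.
    return PEAK_SCALE_FOLLOWED.get(n_unaccented, 1.0)
```

How the "preceded" and "followed" peak factors combine when both apply is not specified in the text, so the sketch keeps them separate.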
Words in terminal positions are given special rise and fall values. In a
statement (Tune A), the last syllable is given a fall value such that F0 reaches
75 Hz. In a yes/no question (Tune B), a rise is assigned after the last accented
syllable's fall (none if it is the last syllable), which gives a final F0 value
20 percent higher than any previous peak.
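
As a worked illustration of the two terminal rules, the sketch below computes the Tune A fall and the Tune B final F0 value. The function and parameter names are hypothetical; only the 75 Hz target and the 20 percent figure come from the text.

```python
def statement_final_fall(f0_before_fall_hz: float, target_hz: float = 75.0) -> float:
    """Tune A: size of the fall on the last syllable so that F0 ends at 75 Hz."""
    return f0_before_fall_hz - target_hz


def question_final_f0(previous_peaks_hz: list[float]) -> float:
    """Tune B: final F0 value, 20 percent above the highest previous peak."""
    return 1.20 * max(previous_peaks_hz)
```

For example, a statement whose last syllable starts its fall at 120 Hz needs a fall of 45 Hz, and a question whose highest earlier peak is 100 Hz ends at about 120 Hz.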
The highest continuation rise (16 Hz) is assigned to the last syllable of a
word, if it is followed by a nonterminal punctuation mark or a conjunction, and if
there has been no punctuation or conjunction since the last content word. A con-
tinuation rise of 8 Hz is assigned to the last syllable of the last word in a nonfinal
phrase, if there have been more than five words since the last word to which a con-
tinuation rise was assigned.
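
The two continuation-rise conditions can be sketched as a single decision function. This is an illustrative reading only: the parameter names are hypothetical, and checking the 16 Hz rule before the 8 Hz rule is an assumption, since the text does not say which applies when both conditions hold.

```python
def continuation_rise_hz(
    followed_by_break: bool,
    break_since_last_content_word: bool,
    ends_nonfinal_phrase: bool,
    words_since_last_rise: int,
) -> float:
    """Continuation rise (in Hz) for the last syllable of a word.

    followed_by_break: the word is followed by a nonterminal punctuation
        mark or a conjunction.
    break_since_last_content_word: punctuation or a conjunction has occurred
        since the last content word.
    ends_nonfinal_phrase: the word is the last word of a nonfinal phrase.
    words_since_last_rise: words since the last continuation rise.
    """
    if followed_by_break and not break_since_last_content_word:
        return 16.0  # highest continuation rise
    if ends_nonfinal_phrase and words_since_last_rise > 5:
        return 8.0   # weaker rise at a nonfinal phrase boundary
    return 0.0       # no continuation rise
```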
If two accented syllables are separated by unaccented syllables, the F0
contour connecting them is either straight or falling. If the difference between the
endpoints of the two accented syllables is positive, the previous fall and next rise
are adjusted by the same amount (half the difference), so that the F0 contour does
not change on intermediate unaccented syllables.
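
One possible reading of the positive-difference rule is sketched below. It assumes that "difference" means the start of the next rise minus the end of the previous fall, and that the adjustment shrinks the fall and enlarges the rise so the two endpoints meet at their midpoint. The text does not spell out either point, and all names are hypothetical.

```python
def flatten_between_accents(
    fall_end_hz: float,
    rise_start_hz: float,
    prev_fall_hz: float,
    next_rise_hz: float,
) -> tuple[float, float]:
    """Return adjusted (previous fall, next rise) so that F0 stays level
    across the unaccented syllables between two accents."""
    diff = rise_start_hz - fall_end_hz
    if diff <= 0:
        # Not the positive case: leave both values unchanged.
        return prev_fall_hz, next_rise_hz
    half = diff / 2.0
    # A smaller fall ends higher; a larger rise starts lower.
    return prev_fall_hz - half, next_rise_hz + half
```

With a fall ending at 100 Hz, a rise starting at 110 Hz, a 30 Hz fall, and a 20 Hz rise, both values become 25 Hz, and both endpoints land at 105 Hz.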
In the case in which the difference in endpoints is negative, that fall is spread
over the intermediate unaccented syllables in two ways. If the unaccented syll-
ables occur within a phrase, the falling rate is linear. Each successive unaccented
syllable gets an equal share of the fall. For unaccented syllables which are not in
the same phrase, a more exponential falling pattern is assigned, with the earlier
unaccented syllables receiving more of the fall. Unaccented syllables terminating
either a Tune A or Tune B clause fall or rise in equal amounts to the final value.
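
The two ways of spreading a fall over intermediate unaccented syllables can be sketched as follows. The linear case follows the text directly. For the "more exponential" case the text gives no formula, so the geometric weighting and the `decay` value of 0.5 are assumptions chosen only to show earlier syllables taking a larger share.

```python
def distribute_fall(
    total_fall_hz: float,
    n_syllables: int,
    same_phrase: bool,
    decay: float = 0.5,  # hypothetical ratio; not given in the text
) -> list[float]:
    """Split a total F0 fall into per-syllable shares."""
    if n_syllables <= 0:
        return []
    if same_phrase:
        # Linear fall: every unaccented syllable gets an equal share.
        weights = [1.0] * n_syllables
    else:
        # Assumed geometric weights: earlier syllables receive more.
        weights = [decay ** i for i in range(n_syllables)]
    total_weight = sum(weights)
    return [total_fall_hz * w / total_weight for w in weights]
```

Under these assumptions, a 12 Hz fall over three syllables within a phrase gives 4 Hz each, while a 14 Hz fall over three syllables across a phrase boundary gives 8, 4, and 2 Hz.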