From text to speech: The MITalk system
less predictable. The amount of F0 movement on each word depends upon its rank
in the order of parts of speech of content words (see Table 10-1) and also upon the
number of syllables in the word. Words of higher rank contain larger F0 excursions.
Function words and unstressed syllables of content words are given a slight
(5 Hz) excursion to produce a more natural-sounding contour.
10.4.6 Prosodic indicators
A set of “prosodic indicators” is passed from the High Level System to the Low
Level System. An accent number gives the relative importance of a word. This
number ranges from “0” for one-syllable articles to “11+n” for a sentential adverb
containing n syllables. An integer representing the position of a word in a phrase
and the importance of that phrase is also assigned. Higher absolute values are
given to words at boundaries marked by punctuation and to words at the boundaries
of large or major phrases. Another value assigned to each word is a number
indicating the amount of continuation rise. Most words are assigned the value “0”,
but those words ending a nonfinal phrase are usually given a value which reflects
the importance of the syntactic boundary which the word immediately precedes. A
level number applies to words in noun phrases not containing conjunctions. This
number signifies either that the F0 level is to rise or that it should drop
on that word. Other words are given level “0”, which indicates a mid-phrase word.
Additionally, the tune value is defined on each word, and is nonzero on the word
ending a clause. The number of phrases is also a necessary input value to the next
level.
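
As a rough illustration, the indicators listed above could be held in a per-word record like the one sketched below. The class and field names are hypothetical and do not come from MITalk. Only two accent values are stated in the text (“0” for one-syllable articles and “11+n” for a sentential adverb of n syllables), so only those are encoded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordProsody:
    """Per-word prosodic indicators (all field names are hypothetical)."""
    text: str
    accent: int = 0             # relative importance of the word
    phrase_position: int = 0    # larger absolute value at major boundaries
    continuation_rise: int = 0  # "0" for most words
    level: int = 0              # "0" marks a mid-phrase word
    tune: int = 0               # nonzero only on the word ending a clause


@dataclass
class HighLevelOutput:
    """What the High Level System hands to the Low Level System."""
    words: List[WordProsody]
    num_phrases: int            # also required by the next level


ARTICLE_ACCENT = 0  # stated in the text for one-syllable articles


def sentential_adverb_accent(n_syllables: int) -> int:
    """Accent number "11+n" for a sentential adverb of n syllables."""
    return 11 + n_syllables


# Usage: a two-word fragment. The continuation_rise value is a placeholder,
# not a value taken from the text.
fragment = HighLevelOutput(
    words=[
        WordProsody("fortunately",
                    accent=sentential_adverb_accent(4),
                    continuation_rise=1),
        WordProsody("the", accent=ARTICLE_ACCENT),
    ],
    num_phrases=2,
)
```

Defaulting every field to zero mirrors the statement that most words carry the value “0” for continuation rise and level.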
10.4.7 The Low Level System
This level reflects the effects of phonemics, lexical stress, and the number of
syllables of the words in the utterance. The number of syllables is used in determining
the height of the peak on lexically stressed syllables. Although the first and
highest peak in a sentence is constrained to a maximum of about 190 Hz, longer
sentences, i.e., sentences with more syllables, begin with higher peaks. This initial
height allows more freedom of excursion for following peaks. Higher peaks are
also placed on two lexically stressed syllables if they are separated by unstressed
syllables, the height of the peaks being dependent upon the number of intervening
unstressed syllables.
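
The two dependencies on syllable count described above can be sketched as follows. This is only an illustration under stated assumptions. The 190 Hz ceiling is from the text, but the linear growth, the `base_hz` and `hz_per_syllable` parameters, and both function names are hypothetical, since this section gives no formula.

```python
from typing import List

MAX_FIRST_PEAK_HZ = 190.0  # from the text: "a maximum of about 190 Hz"


def first_peak_hz(n_syllables: int,
                  base_hz: float = 150.0,
                  hz_per_syllable: float = 2.0) -> float:
    """Height of the first peak; longer sentences start higher, up to the cap.

    base_hz and hz_per_syllable are placeholder values, not MITalk constants.
    """
    return min(MAX_FIRST_PEAK_HZ, base_hz + hz_per_syllable * n_syllables)


def intervening_unstressed(stress: List[bool]) -> List[int]:
    """Count unstressed syllables between consecutive stressed syllables.

    stress[i] is True when syllable i carries lexical stress. The counts are
    the quantity on which the text says the peak heights depend.
    """
    stressed = [i for i, s in enumerate(stress) if s]
    return [b - a - 1 for a, b in zip(stressed, stressed[1:])]
```

How each count translates into added peak height is not specified here, so the sketch stops at computing the counts.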
The F0 pattern is also affected by the phonemics. For example, unvoiced
consonants at the beginning of a stressed syllable cause the contour to fall,
rather than rise, into the contour of the stressed vowel. (The rise is added to the
peak of the vowel.) See Figure 10-1 for an example of this contour.
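
The phonemic rule in this paragraph amounts to a small decision on the syllable onset, sketched below. The ARPAbet-style symbols and the function name are assumptions, because MITalk's own phoneme inventory is not shown in this section. Treating every other stressed onset as a rise is inferred from the phrase “rather than rise”.

```python
from typing import Optional

# Unvoiced English consonants in ARPAbet-style notation (an assumption;
# the symbols MITalk actually uses are not given here).
UNVOICED_ONSETS = {"P", "T", "K", "F", "TH", "S", "SH", "CH", "HH"}


def onset_direction(onset: Optional[str], stressed: bool) -> Optional[str]:
    """Direction of the contour leading into the vowel of a syllable.

    Returns "fall" for a stressed syllable with an unvoiced onset, "rise"
    for any other stressed syllable, and None for unstressed syllables,
    which this rule does not cover.
    """
    if not stressed:
        return None
    return "fall" if onset in UNVOICED_ONSETS else "rise"
```

For instance, `onset_direction("P", True)` yields `"fall"`, while a voiced onset such as `"B"` yields `"rise"`.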
The algorithm first sets the peaks on the lexically stressed syllables. Falls and