
From text to speech: The MITalk system

stant with little decrease in output sound quality. These higher frequency
resonators help to shape the overall spectrum, but otherwise contribute little to in-
telligibility. The particular values chosen for the fourth and fifth formant fre-
quencies (Table 12-1) produce an energy concentration around 3 to 3.5 kHz and a
rapid falloff in spectral energy above about 4 kHz, which is a pattern typical of
many talkers.

12.2.4 Formant bandwidths

Formant bandwidths are a function of energy losses due to heat conduction, vis-
cosity, cavity-wall motions, radiation of sound from the lips, and the real part
of the glottal source impedance. Bandwidths are difficult to deduce from analyses
of natural speech because of irregularities in the glottal source spectrum.
Bandwidths have been estimated by other techniques, such as using a sinusoidal
swept-tone sound source (Fujimura and Lindqvist, 1971). Results indicate that
bandwidths vary by a factor of two or more as a function of the particular phonetic
segment being spoken. Typical values for formant bandwidths are also given in
Chapter 11. Bandwidth variation is small enough so that all formant bandwidths
might be held constant in some applications, in which case only F1, F2, and F3
would be varied to simulate the vocal tract transfer functions for nonnasalized
vowels and sonorant consonants.
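
To make the role of bandwidth concrete, the following minimal sketch computes the gain of a single formant resonator for two bandwidth settings. It assumes the standard second-order digital resonator form y[n] = a*x[n] + b*y[n-1] + c*y[n-2] and a 10 kHz sampling rate; neither is stated in this section. The function names, the 500 Hz formant, and the 50 and 100 Hz bandwidths are illustrative choices only.

```python
import cmath
import math

SAMPLE_RATE_HZ = 10000  # assumed sampling rate, not taken from this section


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (a, b, c) for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    period = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * period)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * period)
         * math.cos(2.0 * math.pi * freq_hz * period))
    a = 1.0 - b - c  # normalizes the gain to unity at 0 Hz
    return a, b, c


def gain_db(coeffs, probe_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Magnitude response of the resonator in dB at probe_hz."""
    a, b, c = coeffs
    z_inv = cmath.exp(-2j * math.pi * probe_hz / sample_rate_hz)
    response = a / (1.0 - b * z_inv - c * z_inv * z_inv)
    return 20.0 * math.log10(abs(response))


if __name__ == "__main__":
    # Illustrative values: one formant at 500 Hz with two bandwidths.
    for bandwidth_hz in (50.0, 100.0):
        coeffs = resonator_coefficients(500.0, bandwidth_hz)
        print(bandwidth_hz, round(gain_db(coeffs, 500.0), 2))
```

Under these assumptions, halving the bandwidth from 100 to 50 Hz raises the peak at 500 Hz by roughly 6 dB while the gain at 0 Hz stays at unity. A factor-of-two bandwidth variation therefore mainly changes the height of a formant peak and leaves its frequency unchanged.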

12.2.5 Nasals and nasalization of vowels

It is not possible to approximate nasal murmurs and the nasalization of vowels that
are adjacent to nasals with a cascade system of five resonators alone. More than
five formants are often present in these sounds and formant amplitudes do not con-
form to the relationships inherent in a cascade configuration because of the
presence of transfer function zeros (Fujimura, 1961, 1962). Typical transfer func-
tions for a nasal murmur and for a nasalized [IH] are shown in Figure 12-10. These
spectra were obtained from the recorded syllable “dim”.

Nasalization introduces additional poles and zeros into the transfer function of
the vocal-nasal tract due to the presence of a side-branch resonator. In Figure
12-10, the nasal murmur and the nasalized [IH] have an extra pole pair and zero pair
near F1. The oral cavity forms the side-branch resonator in the case of a nasal
murmur, while the nose should be considered a side-branch resonator in a nasal-
ized vowel (because the amount of sound radiated through the nostrils is insig-
nificant compared to the effect of the lowered velum on the formant structure of
the sound output from the lips).
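
The effect of an extra pole pair and zero pair near F1 can be sketched numerically. The example below places one resonator (pole pair) and one antiresonator (zero pair) in cascade and evaluates their combined gain. It is a sketch under stated assumptions, not a description of the system's actual implementation: the second-order resonator form, the 10 kHz sampling rate, the function names, and the 270 Hz, 450 Hz, and 100 Hz values are all chosen here for illustration. The antiresonator is modeled as the exact inverse of a resonator with the same frequency and bandwidth parameters.

```python
import cmath
import math

SAMPLE_RATE_HZ = 10000  # assumed sampling rate, not taken from this section


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (a, b, c) for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    period = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * period)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * period)
         * math.cos(2.0 * math.pi * freq_hz * period))
    a = 1.0 - b - c  # unity gain at 0 Hz
    return a, b, c


def pole_zero_pair_gain_db(pole_hz, zero_hz, bandwidth_hz, probe_hz,
                           sample_rate_hz=SAMPLE_RATE_HZ):
    """Gain in dB of a resonator cascaded with an antiresonator."""
    a_p, b_p, c_p = resonator_coefficients(pole_hz, bandwidth_hz, sample_rate_hz)
    a_z, b_z, c_z = resonator_coefficients(zero_hz, bandwidth_hz, sample_rate_hz)
    z_inv = cmath.exp(-2j * math.pi * probe_hz / sample_rate_hz)
    # Pole pair: ordinary resonator response.
    pole_response = a_p / (1.0 - b_p * z_inv - c_p * z_inv * z_inv)
    # Zero pair: inverse of a resonator tuned to zero_hz.
    zero_response = (1.0 - b_z * z_inv - c_z * z_inv * z_inv) / a_z
    return 20.0 * math.log10(abs(pole_response * zero_response))


if __name__ == "__main__":
    # Hypothetical settings: pole fixed at 270 Hz, zero either on top of it
    # (no nasalization) or moved up to 450 Hz (nasalized).
    for zero_hz in (270.0, 450.0):
        for probe_hz in (270.0, 450.0):
            gain = pole_zero_pair_gain_db(270.0, zero_hz, 100.0, probe_hz)
            print(zero_hz, probe_hz, round(gain, 2))
```

When the pole and zero share the same frequency and bandwidth, they cancel and the pair is transparent at every frequency. Moving the zero away from the pole leaves a spectral peak near the pole and a dip near the zero, which is the kind of low-frequency perturbation a cascade of formant resonators alone cannot produce.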

Nasalization of adjacent vowels is an important element in the synthesis of
nasal consonants. Perceptually, the most important change associated with
