From text to speech: The MITalk system

stant with little decrease in output sound quality. These higher frequency
resonators help to shape the overall spectrum, but otherwise contribute little to
intelligibility. The particular values chosen for the fourth and fifth formant
frequencies (Table 12-1) produce an energy concentration around 3 to 3.5 kHz and a
rapid falloff in spectral energy above about 4 kHz, which is a pattern typical of
many talkers.

12.2.4 Formant bandwidths

Formant bandwidths are a function of energy losses due to heat conduction,
viscosity, cavity-wall motions, and radiation of sound from the lips and the real part
of the glottal source impedance. Bandwidths are difficult to deduce from analyses
of natural speech because of irregularities in the glottal source spectrum.
Bandwidths have been estimated by other techniques, such as using a sinusoidal
swept-tone sound source (Fujimura and Lindqvist, 1971). Results indicate that
bandwidths vary by a factor of two or more as a function of the particular phonetic
segment being spoken. Typical values for formant bandwidths are also given in
Chapter 11. Bandwidth variation is small enough so that all formant bandwidths
might be held constant in some applications, in which case only F1, F2, and F3
would be varied to simulate the vocal tract transfer functions for nonnasalized
vowels and sonorant consonants.

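The constant-bandwidth simplification described above can be illustrated with a
short sketch. It assumes the standard two-pole digital resonator difference
equation y[n] = A x[n] + B y[n-1] + C y[n-2] with unity gain at 0 Hz; the
coefficient formulas, the 10 kHz sampling rate, the bandwidth values, and all
function names are assumptions made for illustration and are not taken from
Table 12-1 or Chapter 11.

```python
import math

# Assumed values for illustration only (not from Table 12-1 or Chapter 11).
SAMPLE_RATE_HZ = 10000
FIXED_BANDWIDTHS_HZ = [60.0, 90.0, 150.0]  # held constant for F1, F2, F3


def resonator_coeffs(freq_hz, bw_hz, sample_rate_hz):
    """Return (A, B, C) for y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    return a, b, c


def resonate(signal, freq_hz, bw_hz, sample_rate_hz):
    """Filter a list of samples through one two-pole resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, sample_rate_hz)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def cascade(signal, formants_hz, bandwidths_hz, sample_rate_hz):
    """Pass the signal through the resonators one after another."""
    for freq_hz, bw_hz in zip(formants_hz, bandwidths_hz):
        signal = resonate(signal, freq_hz, bw_hz, sample_rate_hz)
    return signal


def vowel_impulse_response(f1, f2, f3, n_samples=256):
    """Only F1, F2, F3 vary; the bandwidths stay at their fixed values."""
    impulse = [1.0] + [0.0] * (n_samples - 1)
    return cascade(impulse, [f1, f2, f3], FIXED_BANDWIDTHS_HZ, SAMPLE_RATE_HZ)
```

Calling vowel_impulse_response(500.0, 1500.0, 2500.0) with different formant
frequencies changes the vowel quality of the response while the bandwidth list
is left untouched, which is the simplification the text describes.
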
12.2.5 Nasals and nasalization of vowels

It is not possible to approximate nasal murmurs and the nasalization of vowels that
are adjacent to nasals with a cascade system of five resonators alone. More than
five formants are often present in these sounds and formant amplitudes do not
conform to the relationships inherent in a cascade configuration because of the
presence of transfer function zeros (Fujimura, 1961, 1962). Typical transfer
functions for a nasal murmur and for a nasalized IH are shown in Figure 12-10. These
spectra were obtained from the recorded syllable “dim”.

Nasalization introduces additional poles and zeros into the transfer function of
the vocal-nasal tract due to the presence of a side-branch resonator. In Figure
12-10, the nasal murmur and the nasalized IH have an extra pole pair and zero pair
near F1. The oral cavity forms the side-branch resonator in the case of a nasal
murmur, while the nose should be considered a side-branch resonator in a
nasalized vowel (because the amount of sound radiated through the nostrils is
insignificant compared to the effect of the lowered velum on the formant structure of
the sound output from the lips).

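The extra pole pair and zero pair near F1 can be sketched as a resonator
followed by an antiresonator. This is a minimal sketch, assuming the usual
two-pole resonator form and an antiresonator built as its inverse; the function
names, the 10 kHz sampling rate, and the 100 Hz default bandwidth are
hypothetical and are not read from Figure 12-10.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, sample_rate_hz):
    """Return (A, B, C) for y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    return 1.0 - b - c, b, c


def nasal_pole(signal, freq_hz, bw_hz, sample_rate_hz):
    """Two-pole resonator: adds one pole pair to the transfer function."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, sample_rate_hz)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def nasal_zero(signal, freq_hz, bw_hz, sample_rate_hz):
    """Antiresonator (inverse of the resonator): adds one zero pair."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, sample_rate_hz)
    a_z, b_z, c_z = 1.0 / a, -b / a, -c / a
    x1 = x2 = 0.0
    out = []
    for x in signal:
        out.append(a_z * x + b_z * x1 + c_z * x2)
        x2, x1 = x1, x
    return out


def nasal_branch(signal, pole_hz, zero_hz, bw_hz=100.0, sample_rate_hz=10000):
    """Apply the pole pair, then the zero pair, to a list of samples."""
    shaped = nasal_pole(signal, pole_hz, bw_hz, sample_rate_hz)
    return nasal_zero(shaped, zero_hz, bw_hz, sample_rate_hz)
```

When pole_hz and zero_hz are equal, the pair cancels and the signal passes
through unchanged, so a section like this could sit in series with the formant
cascade and be moved apart only during nasal murmurs and nasalized vowels.
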
Nasalization of adjacent vowels is an important element in the synthesis of
nasal consonants. Perceptually, the most important change associated with