From text to speech: The MITalk system

nasalization of a vowel is the reduction in amplitude of the first formant, brought
on by the presence of a nearby low-frequency pole pair and zero pair. The first
formant frequency also tends to shift slightly toward about 500 Hz.

Nasal murmurs and vowel nasalization are approximated by the insertion of
an additional resonator RNP and antiresonator RNZ into the cascade vocal tract
model. The nasal pole frequency FNP and zero frequency FNZ should be set to a
fixed value of about 250 Hz, but the frequency of the nasal zero must be increased
during the production of nasals and nasalization. Strategies for controlling FNZ
are given in Chapter 11. The RNP-RNZ pair is effectively removed from the
cascade circuit during the synthesis of nonnasalized speech sounds if FNP=FNZ.

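The cancellation of the RNP-RNZ pair when FNP=FNZ can be checked numerically. The following is a minimal sketch, not the MITalk code: it assumes the standard second-order digital resonator y[n] = A x[n] + B y[n-1] + C y[n-2] and an antiresonator built as its exact inverse, a 10 kHz sampling rate, and a shared 100 Hz bandwidth for both sections. None of these values or function names comes from this section; they are illustrative only.

```python
import math

FS = 10_000  # assumed sampling rate in Hz (not stated in this section)


def resonator_coeffs(f, bw, fs=FS):
    """Return (A, B, C) for y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw * t)
    b = 2.0 * math.exp(-math.pi * bw * t) * math.cos(2.0 * math.pi * f * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    return a, b, c


def run_resonator(x, f, bw, fs=FS):
    """Filter the sequence x through a digital resonator (pole pair)."""
    a, b, c = resonator_coeffs(f, bw, fs)
    y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = a * xn + b * y1 + c * y2
        out.append(yn)
        y2, y1 = y1, yn
    return out


def run_antiresonator(x, f, bw, fs=FS):
    """Filter x through an antiresonator (zero pair), the inverse of the
    resonator with the same frequency and bandwidth."""
    a, b, c = resonator_coeffs(f, bw, fs)
    a_z, b_z, c_z = 1.0 / a, -b / a, -c / a
    x1 = x2 = 0.0
    out = []
    for xn in x:
        out.append(a_z * xn + b_z * x1 + c_z * x2)
        x2, x1 = x1, xn
    return out


def nasal_pair(x, fnp, fnz, bw=100.0):
    """Pass x through RNP then RNZ; bw is a hypothetical shared bandwidth."""
    return run_antiresonator(run_resonator(x, fnp, bw), fnz, bw)


# With FNP == FNZ the pair leaves the signal unchanged (up to rounding).
impulse = [1.0] + [0.0] * 1999
unchanged = nasal_pair(impulse, 250.0, 250.0)
# Raising FNZ, as during a nasal, makes the pair reshape the spectrum.
nasalized = nasal_pair(impulse, 250.0, 450.0)
```

Note that exact cancellation in this sketch also depends on the two sections sharing the same bandwidth; with unequal bandwidths the pair would still alter the spectrum slightly even when FNP=FNZ.
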
12.2.6 Parallel vocal tract model for frication sources

During frication excitation, the vocal tract transfer function contains both poles
and zeros. The pole frequencies are temporally continuous with formant locations
of adjacent phonetic segments because, by definition, the poles are the natural
resonant frequencies of the entire vocal tract configuration, no matter where the
source is located. Thus, the use of vocalic formant frequency parameters to control
the locations of frication maxima is theoretically well-motivated (and helpful in
preventing the fricative noises from “dissociating” from the rest of the speech
signal).

The zeros in the transfer function for fricatives are the frequencies for which
the impedance (looking back toward the larynx from the position of the frication
source) is infinite, since the series-connected pressure source of turbulence noise
cannot produce any output volume velocity under these conditions. The effect of
transfer-function zeros is two-fold; they introduce notches in the spectrum and they
modify the amplitudes of the formants. The perceptual importance of spectral
notches is not great because masking effects of adjacent harmonics limit the
detectability of a spectral notch (Gauffin and Sundberg, 1974). We have found that a
satisfactory approximation to the vocal tract transfer function for frication
excitation can be achieved with a parallel set of digital formant resonators having
amplitude controls, and no antiresonators.
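The parallel arrangement can be illustrated with a short sketch. This is an assumption-laden illustration rather than the synthesizer's actual signal path: it assumes the same second-order resonator form as is conventional for formant synthesizers, a 10 kHz sampling rate, Gaussian noise as the frication source, and a plain sum of the branch outputs. The function names, formant values, and the `bypass_amp` parameter are hypothetical.

```python
import math
import random

FS = 10_000  # assumed sampling rate in Hz (not stated in this section)


def resonator(x, f, bw, fs=FS):
    """Second-order digital resonator: y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw * t)
    b = 2.0 * math.exp(-math.pi * bw * t) * math.cos(2.0 * math.pi * f * t)
    a = 1.0 - b - c
    y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = a * xn + b * y1 + c * y2
        out.append(yn)
        y2, y1 = y1, yn
    return out


def parallel_frication(source, formants, bypass_amp=0.0):
    """Sum amplitude-scaled resonator outputs, with no antiresonators.

    formants is a list of (frequency_hz, bandwidth_hz, amplitude) tuples;
    bypass_amp scales an unfiltered copy of the source.
    """
    out = [bypass_amp * s for s in source]
    for f, bw, amp in formants:
        for i, s in enumerate(resonator(source, f, bw)):
            out[i] += amp * s
    return out


def magnitude_at(signal, freq, fs=FS):
    """Magnitude of the discrete-time Fourier transform at one frequency."""
    w = 2.0 * math.pi * freq / fs
    re = sum(s * math.cos(w * n) for n, s in enumerate(signal))
    im = sum(s * math.sin(w * n) for n, s in enumerate(signal))
    return math.hypot(re, im)


# Hypothetical settings: excite only a high formant, leave a lower one off.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
frication = parallel_frication(
    noise, [(2500.0, 200.0, 0.0), (3500.0, 300.0, 1.0)]
)
```

Because each branch has its own gain, the relative formant levels that a zero would otherwise impose are set directly through the amplitude values instead of being produced by an antiresonator.
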

Formant amplitudes are set to provide frication excitation for selected
formants, usually those associated with the cavity in front of the constriction
(Stevens, 1972). The presence of any transfer function zeros is accounted for by
appropriate settings of the formant amplitude controls. Relatively simple rules for
determination of the formant amplitude settings (and bypass path amplitude
values) as a function of place of articulation can be derived from a quantal theory
of speech production (Stevens, 1972). The theory states that only formants as-