From text to speech: The MITalk system
[Figure 12-11, panels (a)-(d): transfer function |T(f)| (dB) versus frequency (kHz), 0 to 5 kHz. Curve labels: (a) uniform tube; (b) BW2 = 50, 100, 200; (c) F1 = 250, 500.]
Figure 12-11: Effect of parameter changes on the vocal tract transfer function
where formant frequencies are set to 500, 1500, 2500, 3500, and
4500 Hz and formant bandwidths are set to be equal at 100 Hz. This
corresponds to a vocal tract having a uniform cross-sectional area, a
closed glottis, open lips (and a nonrealistic set of bandwidth values),
as shown in part (a) of Figure 12-11.
2. The amplitude of a formant peak is inversely proportional to its
bandwidth. If a formant bandwidth is doubled, that formant peak is
reduced in amplitude by 6 dB. If the bandwidth is halved, the peak is
increased by 6 dB, as shown in part (b) of Figure 12-11.
3. The amplitude of a formant peak is proportional to formant frequency.
If a formant frequency is doubled, that formant peak is increased by
6 dB, as shown in part (c) of Figure 12-11. (This is true of T(f), but
not of the resulting speech output spectrum, since the glottal source
spectrum falls off at about -12 dB/octave of frequency increase and
the radiation characteristic imposes a +6 dB/octave spectral tilt,
resulting in a net change in formant amplitude of +6 - 12 + 6 = 0 dB.)
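
The bandwidth and frequency rules above can be checked numerically. The following is a minimal sketch, not the MITalk implementation: it assumes each formant is a two-pole digital resonator with unity gain at 0 Hz, cascaded at a 10 kHz sampling rate. Neither the resonator form nor the sampling rate is stated in this passage, and the names resonator_gain, transfer_db, f2_peak_db, and own_peak_db are hypothetical. The frequency rule is evaluated on the first resonator alone, to isolate it from the skirts of the other formants.

```python
import cmath
import math

FS = 10_000.0  # assumed sampling rate in Hz (not stated in this passage)


def resonator_gain(f, fc, bw, fs=FS):
    """Complex gain at f Hz of a two-pole digital resonator (unity gain at 0 Hz)."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw * t)
    b = 2.0 * math.exp(-math.pi * bw * t) * math.cos(2.0 * math.pi * fc * t)
    a = 1.0 - b - c  # normalizes the gain at f = 0 to exactly 1
    z1 = cmath.exp(-2j * math.pi * f * t)  # z^-1 evaluated on the unit circle
    return a / (1.0 - b * z1 - c * z1 * z1)


def transfer_db(f, formants, bandwidths):
    """|T(f)| in dB for a cascade of resonators, one per formant."""
    gain = complex(1.0)
    for fc, bw in zip(formants, bandwidths):
        gain *= resonator_gain(f, fc, bw)
    return 20.0 * math.log10(abs(gain))


# Uniform-tube settings from the text: formants in Hz, equal 100 Hz bandwidths.
FORMANTS = [500.0, 1500.0, 2500.0, 3500.0, 4500.0]
BANDWIDTHS = [100.0] * 5

# Gain of the full cascade at each formant frequency.
uniform_peaks = [transfer_db(f, FORMANTS, BANDWIDTHS) for f in FORMANTS]


def f2_peak_db(bw2):
    """Cascade gain at 1500 Hz when only the second formant bandwidth changes."""
    bws = [100.0, bw2, 100.0, 100.0, 100.0]
    return transfer_db(1500.0, FORMANTS, bws)


# Rule 2: halve and double BW2, compare with the 100 Hz baseline.
halved_bw_delta = f2_peak_db(50.0) - f2_peak_db(100.0)
doubled_bw_delta = f2_peak_db(200.0) - f2_peak_db(100.0)


def own_peak_db(fc, bw=100.0):
    """Gain in dB of a single resonator evaluated at its own center frequency."""
    return 20.0 * math.log10(abs(resonator_gain(fc, fc, bw)))


# Rule 3: double F1 from 250 Hz to 500 Hz (single resonator only).
doubled_f1_delta = own_peak_db(500.0) - own_peak_db(250.0)

print("uniform-tube peak levels (dB):", [round(p, 2) for p in uniform_peaks])
print("BW2 halved, change at F2 (dB):", round(halved_bw_delta, 2))
print("BW2 doubled, change at F2 (dB):", round(doubled_bw_delta, 2))
print("F1 doubled, change at F1 (dB):", round(doubled_f1_delta, 2))
```

Under these assumptions the five uniform-tube peaks come out at the same level, and the three changes land close to, though not exactly at, +6 dB, -6 dB, and +6 dB, which is consistent with the rules being approximations.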
4. Changes to a formant frequency also affect the amplitudes of higher
formant peaks by a factor proportional to frequency squared. For ex-