From text to speech: The MITalk system

tal pulse, nor does it contain spectral zeros of the kind that often appear in natural
voicing, but neither of these differences is judged to be very important
perceptually.

The antiresonator RGZ is used to modify the detailed shape of the spectrum
of the voicing source for particular individuals with greater precision than would
be possible using only a single low-pass filter. The values chosen for FGZ and
BGZ in Table 12-1 are such as to tilt the general voicing spectrum up somewhat to
match the vocal characteristics of speaker DHK. The waveform and spectral
envelope of normal voicing that are produced by sending an impulse train through
RGP and RGZ are shown in Figure 12-7.
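
The signal path described above can be sketched in a few lines of Python. This
is a minimal illustration, not the MITalk code: it assumes the usual
second-order digital resonator y[n] = A x[n] + B y[n-1] + C y[n-2], with the
antiresonator taken as its exact inverse, and it assumes a 10 kHz sampling
rate. The function names are hypothetical, and the numeric settings for RGP
and RGZ are placeholders, since the actual values are those listed in
Table 12-1.

```python
import math

def resonator_coeffs(f_hz, bw_hz, fs_hz=10000.0):
    """Return (A, B, C) for y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    t = 1.0 / fs_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * f_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to unity
    return a, b, c

def antiresonator_coeffs(f_hz, bw_hz, fs_hz=10000.0):
    """Return (A', B', C') for y[n] = A'*x[n] + B'*x[n-1] + C'*x[n-2]."""
    a, b, c = resonator_coeffs(f_hz, bw_hz, fs_hz)
    return 1.0 / a, -b / a, -c / a

def resonate(x, coeffs):
    """Run samples x through a resonator (recursive, two poles)."""
    a, b, c = coeffs
    y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = a * xn + b * y1 + c * y2
        out.append(yn)
        y2, y1 = y1, yn
    return out

def antiresonate(x, coeffs):
    """Run samples x through an antiresonator (non-recursive, two zeros)."""
    a, b, c = coeffs
    x1 = x2 = 0.0
    out = []
    for xn in x:
        out.append(a * xn + b * x1 + c * x2)
        x2, x1 = x1, xn
    return out

def impulse_train(n_samples, period):
    """Unit impulses every `period` samples, zeros elsewhere."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

# Placeholder settings (assumptions, not taken from Table 12-1).
RGP = resonator_coeffs(0.0, 100.0)
RGZ = antiresonator_coeffs(1500.0, 6000.0)

# 100 Hz fundamental at the assumed 10 kHz sampling rate.
normal_voicing = antiresonate(resonate(impulse_train(300, 100), RGP), RGZ)
```

Because the antiresonator is defined as the inverse of a resonator with the
same frequency and bandwidth, cascading the two with identical settings
returns the input unchanged, which gives a convenient sanity check.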

12.1.13 Quasi-sinusoidal voicing

The amplitude control parameter AVS determines the amount of smoothed voicing
generated during voiced fricatives, voiced aspirates, and the voicebars present in
intervocalic voiced plosives. An appropriate wave shape for quasi-sinusoidal
voicing is obtained by passing an impulse through the low-pass digital
resonators RGP and RGS. The frequency control of RGS is set to zero to produce a
low-pass filter, and the bandwidth BGS=200 Hz determines the cutoff frequency
beyond which harmonics are strongly attenuated.
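
The effect of setting the frequency control to zero can be checked
numerically. The sketch below evaluates the gain of a second-order digital
resonator whose center frequency is zero, which reduces to a two-pole low-pass
filter. It assumes the standard resonator difference equation and a 10 kHz
sampling rate; the function names and the 100 Hz bandwidth used for RGP are
assumptions for illustration, while BGS=200 comes from the text.

```python
import cmath
import math

def lowpass_resonator_gain_db(freq_hz, bw_hz, fs_hz=10000.0):
    """Gain in dB at freq_hz of a resonator whose center frequency is zero."""
    t = 1.0 / fs_hz
    r = math.exp(-math.pi * bw_hz * t)
    b = 2.0 * r        # cos(0) = 1 when the frequency control is zero
    c = -r * r
    a = 1.0 - b - c    # unity gain at 0 Hz
    z1 = cmath.exp(-2j * math.pi * freq_hz * t)  # one-sample delay
    h = a / (1.0 - b * z1 - c * z1 * z1)
    return 20.0 * math.log10(abs(h))

def cascade_gain_db(freq_hz, bgp_hz=100.0, bgs_hz=200.0):
    """Gain of RGP followed by RGS; bgp_hz is an assumed placeholder."""
    return (lowpass_resonator_gain_db(freq_hz, bgp_hz)
            + lowpass_resonator_gain_db(freq_hz, bgs_hz))

if __name__ == "__main__":
    # Gain at the first few harmonics of a 100 Hz fundamental.
    for harmonic in range(1, 6):
        f = 100.0 * harmonic
        print(f"{f:6.0f} Hz  {cascade_gain_db(f):7.1f} dB")
```

Well above the cutoff, each low-pass resonator falls off at roughly 12 dB per
octave, so the cascade leaves little energy beyond the lowest harmonics.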

The waveform and spectral envelope of quasi-sinusoidal voicing are shown in
Figure 12-7. After the effects of the vocal tract transfer function and radiation
characteristic are imposed on the source spectrum, the output waveform of
quasi-sinusoidal voicing contains significant energy only at the first and
second harmonics of the fundamental frequency. AVS ranges from about 60 dB in a
voicebar or strongly voiced fricative to 0 dB if no quasi-sinusoidal voicing is
present. Some degree of quasi-sinusoidal voicing can be added to the normal
voicing source (in combination with aspiration noise) to produce a breathy voice
quality (e.g. AH=AV-3, AVS=AV-6).

12.1.14 Frication source

A turbulent noise source is simulated in the synthesizer by a pseudo-random
number generator, a modulator, an amplitude control AF, and a -6 dB/octave
low-pass digital filter LPF, as shown in Figure 12-6. Theoretically, the
spectrum of the frication source should be approximately flat (Stevens, 1971),
and the amplitude distribution should be Gaussian. Signals produced by the
random number generator have a flat spectrum, but they have a uniform amplitude
distribution between limits determined by the value of the amplitude control
parameter AF. A pseudo-Gaussian amplitude distribution is obtained in the
synthesizer by summing 16 of the numbers produced by the random number
generator.
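
The summing step can be illustrated with a short Python sketch. It is not the
synthesizer's own generator: it uses Python's standard `random` module, and
the `limit` argument is a hypothetical stand-in for the range that AF
determines. The helper `moments` is included only to compare the result with
a Gaussian, whose kurtosis is 3 (a single uniform variable has kurtosis 1.8).

```python
import random

def pseudo_gaussian_sample(rng, limit=1.0, n_terms=16):
    """Sum n_terms uniform random numbers drawn from [-limit, limit]."""
    return sum(rng.uniform(-limit, limit) for _ in range(n_terms))

def pseudo_gaussian_noise(n_samples, limit=1.0, seed=0):
    """Generate n_samples of pseudo-Gaussian noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [pseudo_gaussian_sample(rng, limit) for _ in range(n_samples)]

def moments(samples):
    """Return (mean, variance, kurtosis) of a list of samples."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    kurt = sum((s - mean) ** 4 for s in samples) / (n * var * var)
    return mean, var, kurt

if __name__ == "__main__":
    mean, var, kurt = moments(pseudo_gaussian_noise(20000))
    print(f"mean={mean:.3f} variance={var:.3f} kurtosis={kurt:.3f}")
```

With limits of plus or minus 1, each uniform term has variance 1/3, so the sum
of 16 terms has a theoretical variance of 16/3 and a kurtosis of 2.925, already
close to the Gaussian value. The output remains bounded by 16 times the limit,
which is why the distribution is only pseudo-Gaussian.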
