From text to speech: The MITalk system
served formant amplitudes caused by factors such as glottal losses and ir-
regularities in the voicing source spectrum. Where two values are given, the vowel
is diphthongized or has a schwa-like offglide in the speech of talker DHK. Dura-
tions of steady states and transition portions of diphthongized vowels depend on
total vowel duration, and are different for each vowel.
The mechanism for synthesizing a diphthongized vowel is shown in Figure
11-7. Each of the constants shown in the figure is stored in tables for all diphthon-
gized vowels, including those having schwa offglides.
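Figure 11-7 is not reproduced here, so the following is only a minimal sketch of one plausible reading of the text: a formant holds its first target for an initial steady state, glides linearly toward the second target, and then holds that value. The function name, the fractional duration constants, the frame step, and the example frequencies are all hypothetical and are not taken from the MITalk tables.

```python
def diphthong_track(f_start, f_end, total_ms, steady_frac, trans_frac, step_ms=5):
    """Return one formant value per frame for a diphthongized vowel.

    Assumed scheme (not taken from Figure 11-7): initial steady state,
    linear transition, final hold. Both portion durations scale with the
    total vowel duration, as the text says they should.
    """
    if steady_frac < 0 or trans_frac < 0 or steady_frac + trans_frac > 1:
        raise ValueError("duration fractions must be non-negative and sum to at most 1")
    steady_ms = steady_frac * total_ms   # length of the initial steady state
    trans_ms = trans_frac * total_ms     # length of the glide
    track = []
    t = 0
    while t < total_ms:
        if t <= steady_ms:
            f = f_start                  # initial steady state
        elif t >= steady_ms + trans_ms:
            f = f_end                    # final hold on the second target
        else:
            # linear interpolation between the two targets
            f = f_start + (f_end - f_start) * (t - steady_ms) / trans_ms
        track.append(round(f))
        t += step_ms
    return track


# Hypothetical per-vowel constants: 25% steady state, 50% transition.
print(diphthong_track(600, 400, total_ms=200, steady_frac=0.25, trans_frac=0.5, step_ms=50))
```

Keeping the constants as per-vowel fractions is one simple way to make the portion durations follow the total vowel duration; the stored constants in the actual tables may be organized differently.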
11.3.2 Consonants
Additional control parameters must be varied for the synthesis of various classes of
consonants. Table 11-2 includes target values for variable control parameters that
are used to synthesize portions of English consonants (frication spectra of frica-
tives, burst spectra of plosives, nasal murmurs for nasals, and steady portions of
sonorants).
The sonorant consonants WW, YY, RR, and LL are similar to vowels and require
the same set of control parameters to be varied in order to differentiate
among them. Formant values given in Table 11-2 for the prevocalic sonorants RR
and LL depend somewhat on the following vowel. The source amplitude, AV, for
a prevocalic sonorant should be about 10 dB less than in the vowel. The sonorant
HH can be synthesized by taking formant frequency and bandwidth parameters
from the following vowel, increasing the first formant bandwidth to about 300 Hz,
and replacing voicing by aspiration.
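The two rules in this paragraph can be sketched as small parameter transformations. This is an illustration under stated assumptions, not the MITalk implementation: the dictionary layout, the function names, and the example vowel values are hypothetical, AH is assumed to be the name of the aspiration amplitude control (this section does not name it), and giving the aspiration the level that voicing had is likewise an assumption.

```python
def prevocalic_sonorant_av(vowel_av_db, drop_db=10):
    """Source amplitude AV for a prevocalic sonorant: about 10 dB below the vowel."""
    return max(0, vowel_av_db - drop_db)


def hh_frame(vowel_frame, b1_hz=300):
    """Derive HH control parameters from the following vowel's parameters."""
    frame = dict(vowel_frame)                   # copy formant frequencies and bandwidths
    frame["B1"] = max(vowel_frame["B1"], b1_hz) # widen the first formant to about 300 Hz
    frame["AH"] = vowel_frame["AV"]             # assumption: aspiration takes over voicing's level
    frame["AV"] = 0                             # voicing is replaced by aspiration
    return frame


# Hypothetical vowel parameters, not values from Table 11-2.
vowel = {"F1": 700, "F2": 1200, "F3": 2600, "B1": 60, "B2": 90, "B3": 150, "AV": 60}
print(prevocalic_sonorant_av(vowel["AV"]))
print(hh_frame(vowel))
```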
The fricatives characterized in Table 11-2 include both voiceless fricatives
(AF=60, AV=0, AVS=0) and voiced fricatives (AF=50, AV=47, AVS=47). For-
mants to be excited by the frication noise source are determined by the amplitude
controls A2, A3, A4, A5, A6, and AB. The amplitude of the parallel second for-
mant, A2, is zero for all of these consonants before front vowels, but the second
formant is a front cavity resonance for velars before nonfront vowels and A2 is
excited. The values given for F2 and F3 are not only valid during the fricative, but
also can serve as “loci” for the characterization of the consonant-vowel formant
transitions before front vowels. These are virtual loci in that formant frequency
values observed at the onset of glottal excitation are somewhere between the locus
and the vowel target frequency, with the amount of virtual transition depending
on formant-cavity affiliations.
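As a minimal sketch, the source settings quoted above and the notion of a virtual locus can be written out as follows. The source amplitudes are the ones given in the text. The interpolation factor k, the function names, and the example frequencies are hypothetical: the text says only that the onset value lies somewhere between the locus and the vowel target, and that the amount depends on formant-cavity affiliations.

```python
# Source amplitude settings (dB) as quoted in the text.
FRICATIVE_SOURCE = {
    False: {"AF": 60, "AV": 0, "AVS": 0},    # voiceless fricatives
    True: {"AF": 50, "AV": 47, "AVS": 47},   # voiced fricatives
}


def fricative_source(voiced):
    """Return a copy of the source amplitude controls for a fricative."""
    return dict(FRICATIVE_SOURCE[bool(voiced)])


def onset_frequency(locus_hz, target_hz, k):
    """Formant frequency at the onset of glottal excitation.

    k = 0 places the onset at the locus and k = 1 at the vowel target.
    The value of k is an illustrative assumption, not a MITalk constant.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie between 0 and 1")
    return locus_hz + k * (target_hz - locus_hz)


print(fricative_source(voiced=True))
# Hypothetical F2 locus of 1800 Hz and vowel target of 2200 Hz.
print(onset_frequency(1800, 2200, k=0.5))
```

A larger k corresponds to more of the transition being virtual, that is, completed before voicing begins.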
The specification of frication spectra in the table is accurate only before front
vowels in the speech of talker DHK. Before back and rounded vowels, systematic
changes are observed to the fricative spectra because of anticipatory coarticulation.