From text to speech: The MITalk system

served formant amplitudes caused by factors such as glottal losses and irregularities in the voicing source spectrum. Where two values are given, the vowel is diphthongized or has a schwa-like offglide in the speech of talker DHK. Durations of steady states and transition portions of diphthongized vowels depend on total vowel duration, and are different for each vowel.

The mechanism for synthesizing a diphthongized vowel is shown in Figure 11-7. Each of the constants shown in the figure is stored in tables for all diphthongized vowels, including those having schwa offglides.

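Figure 11-7 is not reproduced in this excerpt, so the following is only a minimal sketch of one way such a mechanism could work, not the figure's actual scheme. It assumes a piecewise-linear trajectory (hold at an onset target, glide linearly, hold at an offset target) whose segment lengths scale with total vowel duration. The `DiphthongSpec` fields, the function name, and the use of fractions of total duration are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiphthongSpec:
    """Hypothetical per-vowel table entry (not the layout of Figure 11-7)."""
    onset_hz: float             # formant target during the initial steady state
    offset_hz: float            # formant target during the final steady state
    onset_fraction: float       # share of total duration held at the onset target
    transition_fraction: float  # share of total duration spent gliding


def formant_at(spec: DiphthongSpec, t_ms: float, total_ms: float) -> float:
    """Return the piecewise-linear formant value at time t_ms within the vowel."""
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    # Clamp the query time to the vowel's extent.
    t = min(max(t_ms, 0.0), total_ms)
    glide_start = spec.onset_fraction * total_ms
    glide_len = spec.transition_fraction * total_ms
    if t <= glide_start:
        return spec.onset_hz
    if glide_len <= 0 or t >= glide_start + glide_len:
        return spec.offset_hz
    # Linear interpolation across the transition portion.
    progress = (t - glide_start) / glide_len
    return spec.onset_hz + progress * (spec.offset_hz - spec.onset_hz)
```

Because both segment boundaries are expressed relative to `total_ms`, lengthening the vowel stretches the steady states and the transition together. A real table could equally store absolute durations or per-vowel rules.
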
11.3.2 Consonants

Additional control parameters must be varied for the synthesis of various classes of consonants. Table 11-2 includes target values for variable control parameters that are used to synthesize portions of English consonants (frication spectra of fricatives, burst spectra of plosives, nasal murmurs for nasals, and steady portions of sonorants).

The sonorant consonants WW, YY, RR, and LL are similar to vowels and require the same set of control parameters to be varied in order to differentiate among them. Formant values given in Table 11-2 for the prevocalic sonorants RR and LL depend somewhat on the following vowel. The source amplitude, AV, for a prevocalic sonorant should be about 10 dB less than in the vowel. The sonorant HH can be synthesized by taking formant frequency and bandwidth parameters from the following vowel, increasing the first formant bandwidth to about 300 Hz, and replacing voicing by aspiration.

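The two rules in the paragraph above can be sketched as a pair of small helpers. This is an illustrative sketch only: it assumes a parameter frame is a plain dictionary keyed by parameter name, and it uses `B1` for the first formant bandwidth and `AH` for the aspiration amplitude, names that do not appear in this excerpt. The text gives no aspiration level, so the caller must supply one.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def prevocalic_sonorant_av(vowel_av_db: float, drop_db: float = 10.0) -> float:
    """AV for a prevocalic sonorant: about 10 dB below the following vowel."""
    return vowel_av_db - drop_db


def hh_frame(vowel_frame: dict, aspiration_db: float) -> dict:
    """Derive an HH frame from the following vowel's frame.

    Formant frequencies and bandwidths are copied from the vowel, the first
    formant bandwidth is widened to about 300 Hz, and voicing is replaced by
    aspiration. The keys "B1" and "AH" are assumed names.
    """
    frame = dict(vowel_frame)    # copy so the vowel's frame is left untouched
    frame["B1"] = 300            # widened first formant bandwidth, in Hz
    frame["AV"] = 0              # voicing source off
    frame["AVS"] = 0             # sinusoidal voicing off as well
    frame["AH"] = aspiration_db  # aspiration level chosen by the caller
    return frame
```

A 10 dB reduction corresponds to scaling the source amplitude by a factor of roughly 0.32, which `db_to_amplitude_ratio(-10)` computes directly.
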
The fricatives characterized in Table 11-2 include both voiceless fricatives (AF=60, AV=0, AVS=0) and voiced fricatives (AF=50, AV=47, AVS=47). Formants to be excited by the frication noise source are determined by the amplitude controls A2, A3, A4, A5, A6, and AB. The amplitude of the parallel second formant, A2, is zero for all of these consonants before front vowels, but the second formant is a front cavity resonance for velars before nonfront vowels and A2 is excited. The values given for F2 and F3 are not only valid during the fricative, but also can serve as “loci” for the characterization of the consonant-vowel formant transitions before front vowels. These are virtual loci in that formant frequency values observed at the onset of glottal excitation are somewhere between the locus and the vowel target frequency -- the amount of virtual transition being dependent on formant-cavity affiliations.

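As a rough illustration of the source settings and the virtual-locus idea described above, the sketch below stores the two sets of source amplitudes quoted in the text and linearly interpolates a formant's value at voicing onset. The interpolation fraction `k` is a hypothetical stand-in for the dependence on formant-cavity affiliations. The text does not say how that dependence is computed, and the linear form is an assumption.

```python
# Source amplitude settings quoted in the text (values in dB).
VOICELESS_FRICATIVE_SOURCE = {"AF": 60, "AV": 0, "AVS": 0}
VOICED_FRICATIVE_SOURCE = {"AF": 50, "AV": 47, "AVS": 47}


def fricative_source(voiced: bool) -> dict:
    """Return a fresh copy of the source amplitudes for a fricative."""
    settings = VOICED_FRICATIVE_SOURCE if voiced else VOICELESS_FRICATIVE_SOURCE
    return dict(settings)


def onset_frequency(locus_hz: float, target_hz: float, k: float) -> float:
    """Formant frequency at the onset of glottal excitation.

    k is a hypothetical fraction in [0, 1]: 0 places the onset at the
    consonant locus, 1 places it at the vowel target. Values in between
    model the part of the transition that is never observed.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie between 0 and 1")
    return locus_hz + k * (target_hz - locus_hz)
```

With made-up numbers, a locus of 1800 Hz, a vowel target of 2200 Hz, and `k = 0.5` give an onset value halfway between the two, so only the second half of the notional transition is actually synthesized with voicing.
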
The specification of frication spectra in the table is accurate only before front vowels in the speech of talker DHK. Before back and rounded vowels, systematic changes are observed to the fricative spectra because of anticipatory coarticulation.