From text to speech: The MITalk system

[Figure 11-7 diagram: Frequency plotted against Time, with the labeled constants VTAR3, DTAR3, DTART, TCDIPH, TDMID, and INHDUR]

Figure 11-7: Constants used to specify the inherent formant and durational
characteristics of a sonorant

In addition to differences in source amplitudes, voiced and voiceless fricatives
differ in that F1 is higher and B1 is larger when the glottis is open.

The affricate parameters in Table 11-2 refer to the fricative portion of the
affricate. Similarly, the plosive parameters in Table 11-2 refer to the brief burst of
frication noise generated at plosive release. Formant frequency values again serve
as loci for predicting formant positions at voicing onset.

The parameters that are used to generate a nasal murmur include the nasal
pole and zero frequencies FNP and FNZ. The nasal pole and zero are used
primarily to approximate vowel nasalization at nasal release by splitting F1 into a
pole-zero-pole complex. The details of nasal murmurs that have been described by
Fujimura (1962) are approximated by formant bandwidth adjustments rather than
by the theoretically correct method of pole-zero insertion. The reason is that the
higher frequency pole-zero details of nasal murmurs and vowel nasalization cannot
both be simulated without moving the frequency of the nasal pole and zero very
fast at release, which would generate an objectionable click in the output.
Vowel nasalization has been found to be perceptually more important, so it takes
priority. A nasalized vowel is generated by increasing F1 by about 100 Hz, and
by setting the frequency of the nasal zero to be the average of this new F1 value
and 270 Hz (the frequency of the fixed nasal pole).
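
The nasalization rule just described can be illustrated with a short sketch. This is not the MITalk code: the function name nasalize_vowel and the constant names are hypothetical, and the "about 100 Hz" shift is treated as exactly 100 Hz for the sake of the arithmetic. Only the 270 Hz nasal pole and the averaging rule come from the text.

```python
# Illustrative sketch only; names are hypothetical, not taken from MITalk.
NASAL_POLE_HZ = 270.0  # fixed nasal pole frequency given in the text
F1_SHIFT_HZ = 100.0    # the text says "about 100 Hz"; assumed exact here


def nasalize_vowel(f1_hz):
    """Return (nasalized F1, nasal zero frequency FNZ), both in Hz."""
    new_f1 = f1_hz + F1_SHIFT_HZ
    # The nasal zero is placed midway between the raised F1 and the nasal pole.
    fnz = (new_f1 + NASAL_POLE_HZ) / 2.0
    return new_f1, fnz


new_f1, fnz = nasalize_vowel(500.0)
print(new_f1, fnz)  # 600.0 435.0
```

For a vowel with F1 at 500 Hz, the raised F1 is 600 Hz and the nasal zero falls at 435 Hz, between the 270 Hz pole and the raised F1, which gives the pole-zero-pole arrangement described above.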

Not included in Tables 11-1 and 11-2 are steady-state target values for
unstressed allophones, postvocalic allophones, flaps, glottal stops, voicebars, and
