From text to speech: The MITalk system

12.1.16 Pitch-synchronous updating of voicing source amplitudes

The voicing source amplitude controls AV and AVS only have an effect on the
synthetic waveform when a glottal impulse is issued. The reason for adjusting
voicing amplitudes discontinuously at the onset of each glottal period is to prevent
the creation of pops and clicks due to waveform discontinuities introduced by the
sudden change in an amplitude control in the middle of a voicing period.
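
The pitch-synchronous update described above can be pictured as a sample-and-hold
on the amplitude control. The following is only a minimal sketch, not the
synthesizer's actual code: the function name latch_at_pulses, the per-sample
"requested" track, and the use of plain gain values instead of dB are all
assumptions made for illustration.

```python
def latch_at_pulses(requested, pulse_onsets):
    """Return the amplitude actually applied at each sample.

    requested    -- hypothetical per-sample track of the AV (or AVS) control
    pulse_onsets -- sample indices at which a glottal impulse is issued

    The control is read only at a glottal onset and then held, so a change
    requested in the middle of a voicing period has no effect until the
    next impulse.
    """
    onsets = set(pulse_onsets)
    held = 0.0  # assumed: no voicing amplitude before the first impulse
    effective = []
    for n, value in enumerate(requested):
        if n in onsets:
            held = value  # latch the new amplitude at the period boundary
        effective.append(held)
    return effective


# The control jumps from 1 to 5 at sample 3, but the next impulse is at sample 4.
requested = [1, 1, 1, 5, 5, 5, 5, 5]
effective = latch_at_pulses(requested, pulse_onsets=[0, 4])
```

In this usage, the jump requested at sample 3 is deferred to sample 4, which is
the kind of mid-period discontinuity the text says the synthesizer avoids.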

12.1.17 Generation of plosive bursts with a predictable spectrum

The noise amplitudes AF and AH are used to interpolate the intensity of the noise
sources linearly over the 5 msec (50 sample) interval. (Thus there is a 5 msec
delay in the attainment of a new amplitude value for a noise source.) Interpolation
permits a more gradual onset for a fricative or HH than would otherwise be possible.
There is, however, one exception to this internal control strategy. A plosive
burst involves a more rapid source onset than can be achieved by 5 msec linear
interpolation. Therefore, if AF increases by more than 50 dB from its value
specified in the previous 5 msec segment, AF is (automatically) changed
instantaneously to its new target value. The pseudo-random number generator is also
reset at the time of plosive burst onset so as to produce exactly the same source
waveform for each burst. The value to which it is set was chosen so as to produce
a burst spectrum that is as flat as possible.
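
The interpolation rule and its plosive exception can be sketched as follows. This
is an illustration under stated assumptions rather than the synthesizer's own
code: the class name NoiseSource, the seed value BURST_SEED, the uniform noise
distribution, and the choice to interpolate the dB control value itself (rather
than a linear amplitude derived from it) are all hypothetical. The text does not
give the actual reset value of the generator.

```python
import random

NWS = 50             # samples per 5 msec update interval (from the text)
BURST_JUMP_DB = 50   # AF increase that triggers the burst exception (from the text)
BURST_SEED = 1234    # hypothetical; the real reset value is not given in the text


class NoiseSource:
    """Frication-noise amplitude control with the plosive-burst exception."""

    def __init__(self):
        self.rng = random.Random()
        self.af_db = 0.0  # AF value reached at the end of the previous interval

    def next_interval(self, target_db, nws=NWS):
        """Return (per-sample AF track in dB, raw noise samples) for one interval."""
        if target_db - self.af_db > BURST_JUMP_DB:
            # Plosive burst: jump to the target at once and reset the
            # generator so that every burst uses the same source waveform.
            self.rng.seed(BURST_SEED)
            track = [float(target_db)] * nws
        else:
            # Normal case: reach the target linearly over the interval,
            # which delays attainment of the new value by one interval.
            step = (target_db - self.af_db) / nws
            track = [self.af_db + step * (i + 1) for i in range(nws)]
        self.af_db = float(target_db)
        noise = [self.rng.uniform(-1.0, 1.0) for _ in range(nws)]
        return track, noise


src = NoiseSource()
burst_track, burst_noise = src.next_interval(60)   # 0 -> 60 dB: treated as a burst
decay_track, _ = src.next_interval(0)              # 60 -> 0 dB: interpolated
_, second_burst_noise = src.next_interval(60)      # same noise as the first burst
```

Because the generator is reseeded at each burst onset, burst_noise and
second_burst_noise are identical sample for sample, while a rise of 50 dB or less
would take the interpolated path.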

12.1.18 Control of fundamental frequency

At times, it is desired to specify precisely the timing of the first glottal pulse
(voicing onset) relative to a plosive burst. For example, in the syllable pa, it might
be desired to produce a 5 msec burst of frication noise, 40 msec of aspiration noise,
and voicing onset exactly 45 msec from the onset of the burst. Usually, a glottal
pulse is issued in the synthesizer at a time specified by the reciprocal of the value
of the fundamental frequency control parameter extant when the last glottal pulse
was issued. However, if either AV or F0 is set to zero, no glottal pulse is issued
during this 5 msec time interval; in fact, no glottal pulses are issued until precisely
the moment that both the AV and F0 control parameters become nonzero. In the
case of the pa example above, both AV and F0 would normally be set to zero
during the closure interval, burst, and aspiration phase; and AV would be set to
about 60 dB and F0 to about 130 Hz at exactly 45 msec after the synthetic burst
onset.
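
The pulse-timing rule above can be sketched as a small scheduler. This is a
hedged illustration, not the synthesizer's implementation: the function name
glottal_pulse_times, the list-of-frames input, and the rounding of the period to
whole samples are assumptions, and the 10 kHz sample rate is inferred from the
statement that 5 msec corresponds to 50 samples.

```python
SAMPLE_RATE = 10_000  # Hz; inferred from "5 msec (50 sample)", not stated outright
NWS = 50              # samples per 5 msec update interval


def glottal_pulse_times(frames, sample_rate=SAMPLE_RATE, nws=NWS):
    """Return the sample indices at which glottal pulses are issued.

    frames -- one (AV in dB, F0 in Hz) pair per update interval
    """
    pulses = []
    next_pulse = None  # None means voicing is currently off
    for i, (av, f0) in enumerate(frames):
        start = i * nws
        if av == 0 or f0 == 0:
            next_pulse = None  # no pulse is issued during this interval
            continue
        if next_pulse is None:
            # Voicing begins at the moment both controls become nonzero.
            next_pulse = start
        while next_pulse < start + nws:
            pulses.append(next_pulse)
            # The next pulse follows after one period of the F0 value in
            # force when this pulse was issued.
            next_pulse += round(sample_rate / f0)
    return pulses


# The "pa" example: AV and F0 are zero for 45 msec after burst onset
# (nine 5 msec intervals), then AV = 60 dB and F0 = 130 Hz.
frames = [(0, 0)] * 9 + [(60, 130)] * 4
pulses = glottal_pulse_times(frames)
```

With these frames the first pulse falls at sample 450, that is, 45 msec after
the burst onset, and later pulses follow at 77-sample intervals (the 130 Hz
period rounded to whole samples).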

Since the update interval in the synthesizer is set to 5 msec, voice onset time
can be specified exactly in 5 msec steps. If greater precision is needed, it is
necessary to change the parameter update interval from 5 msec (NWS=50) to, for
example, 2 msec (NWS=20).
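
The relation between the update interval and NWS is simple arithmetic, sketched
below. The helper names nws_for_interval and quantize_vot are hypothetical, and
the 10 kHz sample rate is an assumption inferred from the NWS=50 and NWS=20
figures in the text.

```python
SAMPLE_RATE = 10_000  # Hz; assumed, consistent with 5 msec <-> NWS=50


def nws_for_interval(interval_msec, sample_rate=SAMPLE_RATE):
    """Number of waveform samples in one parameter update interval."""
    return round(sample_rate * interval_msec / 1000)


def quantize_vot(vot_msec, interval_msec):
    """Nearest voice onset time that can be specified with this update interval."""
    return round(vot_msec / interval_msec) * interval_msec


nws_5 = nws_for_interval(5)       # 50
nws_2 = nws_for_interval(2)       # 20
vot_coarse = quantize_vot(42, 5)  # 40: a 42 msec onset is not reachable in 5 msec steps
vot_fine = quantize_vot(42, 2)    # 42: reachable once the interval is 2 msec
```

A 45 msec voice onset time needs no change, since it is already a multiple of
the 5 msec interval.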