Introduction
95 percent of the input text (consisting of high-frequency, foreign, and
polysyllabic words) can be transcribed to phonetic notation. For rare or new
words, plus misspellings (e.g. “recieve”), letter-to-phonetic segment rules
are used.

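The fallback described here can be sketched as a two-stage lookup. This is only an illustration: the excerpt does not name the first-stage mechanism, so a plain dictionary stands in for it, and the lexicon entries, phone symbols, and one-letter-per-phone rules below are all hypothetical.

```python
# Hypothetical toy lexicon; entries and phone symbols are illustrative only.
LEXICON = {
    "receive": ["R", "IH", "S", "IY", "V"],
    "the": ["DH", "AH"],
}

# Deliberately naive letter rules standing in for a real letter-to-sound rule set.
LETTER_RULES = {"c": "K", "e": "EH", "i": "IH"}


def letter_to_sound(word):
    """Fallback: map each letter to one phone using the toy rule table."""
    return [LETTER_RULES.get(ch, ch.upper()) for ch in word.lower() if ch.isalpha()]


def transcribe(word):
    """Return (phones, source), trying the lexicon before the letter rules."""
    key = word.lower()
    if key in LEXICON:
        return LEXICON[key], "lexicon"
    return letter_to_sound(key), "rules"
```

A known word such as "receive" is answered from the lexicon, while the misspelling "recieve" falls through to the rules and still yields some transcription.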
1.3.1.3 Lexical stress The effects of suffixes and of compounding on lexical
stress are computed, permitting the use of both stress marks in the
transcription and changes in vowel color.

1.3.1.4 Phonological recoding Once the initial phonetic transcription is
obtained, some recoding is done based on the sentence-level context, including
consonant “flapping”, insertion of glottal stops, and selection of alternate
pronunciations of “the”.

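Two of these recodings can be sketched with simple context tests. The sketch assumes ARPAbet-style phone symbols with a trailing stress digit on vowels (0 for unstressed), which the excerpt does not specify, and the rules are simplified textbook versions rather than the system's actual rule set.

```python
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def is_vowel(phone):
    """True if the phone is a vowel, ignoring any trailing stress digit."""
    return phone.rstrip("012") in VOWELS


def pronounce_the(next_phone):
    """Pick the form of "the" from the first phone of the following word."""
    return ["DH", "IY0"] if is_vowel(next_phone) else ["DH", "AH0"]


def flap(phones):
    """Replace T or D with a flap (DX) between a vowel and an unstressed vowel."""
    out = list(phones)
    for i in range(1, len(phones) - 1):
        nxt = phones[i + 1]
        if (phones[i] in ("T", "D") and is_vowel(phones[i - 1])
                and is_vowel(nxt) and nxt.endswith("0")):
            out[i] = "DX"
    return out
```

Because both functions look only at neighboring phones, they apply equally within a word and across a word boundary once the sentence has been concatenated into one phone string.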
1.3.1.5 Parsing To aid the selection of prosodic correlates, a phrase-level
parsing is performed. Also, a part-of-speech determination for each word is
computed to provide input for the parser.

1.3.1.6 Semantic analysis Only those semantic effects due to particular
lexical items, such as negatives, are found, but these have important effects
on pitch.

1.3.2 Synthesis of speech

1.3.2.1 Timing Prepausal lengthening, pause duration, and polysyllabic
shortening are determined, plus the basic duration of each segment and the
effect of clusters.

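One common way to combine such effects is to scale a basic segment duration by a factor per rule. The sketch below assumes that multiplicative form, and every factor value in it is hypothetical; the excerpt gives neither the form nor the numbers.

```python
def segment_duration_ms(base_ms, prepausal=False, syllables_in_word=1,
                        in_cluster=False):
    """Scale a basic duration by illustrative (made-up) rule factors."""
    factor = 1.0
    if prepausal:
        factor *= 1.4   # lengthen segments just before a pause
    if syllables_in_word > 1:
        factor *= 0.8   # shorten segments in polysyllabic words
    if in_cluster:
        factor *= 0.7   # shorten consonants inside a cluster
    return base_ms * factor
```

With no rule active the basic duration is returned unchanged, so each rule can be examined in isolation.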
1.3.2.2 Fundamental frequency A declination line is found, plus pitch rises on
stressed syllables, continuation rises to signal that the utterance continues,
and a number of segmental effects. Contours appropriate to questions are also
found.

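The first two components can be sketched as a falling baseline with local rises added on top. The linear declination, the triangular rise shape, and all default values below are assumptions chosen for illustration, not figures from the text.

```python
def f0_contour(n_frames, stressed_frames, start_hz=130.0, end_hz=90.0,
               rise_hz=20.0, rise_width=5):
    """Return one F0 value per frame: declination line plus stress rises."""
    contour = []
    for i in range(n_frames):
        # Linear declination from start_hz to end_hz across the utterance.
        frac = i / (n_frames - 1) if n_frames > 1 else 0.0
        f0 = start_hz + (end_hz - start_hz) * frac
        # Triangular rise centered on each stressed frame.
        for s in stressed_frames:
            dist = abs(i - s)
            if dist < rise_width:
                f0 += rise_hz * (1 - dist / rise_width)
        contour.append(f0)
    return contour
```

Continuation rises and question contours could be added in the same way, as further terms summed onto the baseline.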
1.3.2.3 Phonetic targets Given the prosodic framework, phonetic target
parameters are determined for each phonetic segment, utilizing a “context
window” five segments wide. There are twenty such parameters that vary with
time.

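A five-segment context window can be pictured as a sliding view over the segment string. The sketch assumes the window is centered on the current segment and padded with a boundary symbol at the edges; the text states only the width.

```python
def context_windows(segments, width=5, pad="#"):
    """Return one width-long tuple per segment, centered on that segment."""
    half = width // 2
    padded = [pad] * half + list(segments) + [pad] * half
    return [tuple(padded[i:i + width]) for i in range(len(segments))]
```

Each tuple would then be the input from which the target parameters for its middle segment are chosen.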
1.3.2.4 Continuation smoothing The target values are smoothed to yield a full
set of parameters every 5 msec.

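As a minimal illustration of turning sparse targets into a value every 5 msec, the sketch below uses plain linear interpolation. That technique is swapped in for simplicity; the excerpt does not say which smoothing method is used. The function name and the `(time_ms, value)` target format are hypothetical.

```python
def sample_track(targets, frame_ms=5):
    """Interpolate (time_ms, value) targets onto a fixed frame grid.

    Expects at least two targets with strictly increasing times.
    """
    frames = []
    t = targets[0][0]
    end = targets[-1][0]
    j = 0
    while t <= end:
        # Advance to the target pair that brackets time t.
        while j + 1 < len(targets) - 1 and targets[j + 1][0] <= t:
            j += 1
        (t0, v0), (t1, v1) = targets[j], targets[j + 1]
        frames.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
        t += frame_ms
    return frames
```

For example, three targets for one parameter at 0, 20, and 40 msec produce nine frames, one at each 5 msec step including both end points.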
1.3.2.5 Parameter conversion The phonetic parameters must be converted to
coefficients that can be used by the digital formant synthesizer.

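The excerpt does not give the conversion formulas. As an illustration, the sketch below uses the standard second-order digital resonator, y[n] = A x[n] + B y[n-1] + C y[n-2], whose coefficients follow from a formant frequency and bandwidth; treating this as the form used here is an assumption. The 10 kHz default matches the sample rate stated for waveform generation.

```python
import math


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz=10_000):
    """Return (A, B, C) for a second-order digital resonator."""
    t = 1.0 / sample_rate_hz  # sampling period in seconds
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * t)
         * math.cos(2.0 * math.pi * freq_hz * t))
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to one
    return a, b, c
```

Narrower bandwidths push C toward -1, which places the filter poles closer to the unit circle and sharpens the resonance.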
1.3.2.6 Waveform generation The terminal synthesizer utilizes the coefficients
(updated every 5 msec) to generate the speech waveform. A special-purpose
hardware synthesizer is used to perform this task in real time. Speech samples
are produced at a 10 kHz rate, and then converted to analog form via a D/A
converter and low-pass filter.

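At a 10 kHz sample rate, a 5 msec update interval corresponds to 50 samples per coefficient set. The software sketch below shows that frame-wise update for a single second-order resonator. The single-resonator structure, the function name, and the excitation input are assumptions for illustration; the system described uses dedicated hardware.

```python
SAMPLE_RATE_HZ = 10_000
FRAME_MS = 5
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 50 samples


def synthesize(frames, source):
    """Filter excitation samples, switching (A, B, C) once per frame.

    frames: one (a, b, c) coefficient tuple per 5 msec frame.
    source: excitation samples; extra samples beyond the frames are ignored.
    """
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for n, x in enumerate(source[:len(frames) * SAMPLES_PER_FRAME]):
        a, b, c = frames[n // SAMPLES_PER_FRAME]
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out
```

The filter state is carried across frame boundaries, so changing coefficients does not reset the output.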