8

The phonological component

8.1 Overview

The phonological component PHONOL accepts input from the text analysis
routines (described in Chapters 2-6) and produces an output that is sent to the
prosodic component PROSOD (to be described in Chapter 9). PHONOL is
divided into two modules, PHONO1 and PHONO2. PHONO1 uses information
from the PARSER to specify the syntactic markers that influence the spoken
output. PHONO2 contains a set of segmental recoding rules that are activated to
select an appropriate allophone for each phoneme, and to simplify certain
unstressed phonetic sequences. Rules for pausing are included in both PHONO1 and
PHONO2. Pauses of various durations are inserted at sentence boundaries, clause
boundaries, and the locations in the text of certain punctuation marks such as
commas. Some additional pauses are introduced in longer phrases and at slow
speaking rates so that the talker does not seem to have an inhuman supply of
breath.
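
The pausing behavior described above can be illustrated with a minimal sketch.
It is not the MITalk rule set. The pause durations, the word-count threshold,
and the names PAUSE_MS, BREATH_PAUSE_MS, and insert_pauses are all assumptions
introduced for illustration, since the text gives no numeric values. The sketch
shows only the general idea: punctuation triggers a pause of a given duration,
and a long run of words without punctuation receives an extra pause.

```python
# Hypothetical pause durations in milliseconds (the source gives no values).
PAUSE_MS = {
    ".": 800,  # sentence boundary (assumed)
    "?": 800,  # sentence boundary (assumed)
    "!": 800,  # sentence boundary (assumed)
    ";": 400,  # clause boundary (assumed)
    ",": 200,  # comma (assumed)
}
BREATH_PAUSE_MS = 150  # extra pause inside a long phrase (assumed)


def insert_pauses(tokens, max_run=8):
    """Return a copy of tokens with ("PAUSE", ms) entries inserted.

    tokens is a list of strings: words and punctuation marks.
    max_run is the assumed number of words allowed between pauses;
    a slower speaking rate could be modeled by passing a smaller value.
    """
    out = []
    run = 0  # words emitted since the last pause
    for tok in tokens:
        if tok in PAUSE_MS:
            # Punctuation is replaced by a pause of the matching duration.
            out.append(("PAUSE", PAUSE_MS[tok]))
            run = 0
            continue
        if run >= max_run:
            # Too many words without a break: add a short extra pause.
            out.append(("PAUSE", BREATH_PAUSE_MS))
            run = 0
        out.append(tok)
        run += 1
    return out


if __name__ == "__main__":
    print(insert_pauses(["the", "cat", ",", "sat", "."]))
```

Keeping the threshold as a parameter is a design choice of this sketch: it
lets one function cover both the "longer phrases" case and the "slow speaking
rate" case without committing to how the real system combines them.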

8.1.1 Synthesis-by-rule

If a subset of the MITalk system is to be used as a speech-synthesis-by-rule
program by deleting the analysis modules, the preferred first module would be
PHONO2 or PROSOD. The input to the system would then be an abstract
representation containing phonemes, lexical stress symbols, and syntactic structure
symbols for each sentence to be synthesized. Applications for this mode of speech
generation by computer include cases where an abstract syntactic and phonemic
representation for each sentence is known or can be computed. Speech quality will
be better than in the text-to-speech case because analysis errors can be avoided, but
considerable linguistic sophistication is required of users. Storage requirements
for sentences are minimal -- on the order of 100 bits per sentence.

8.2 Input representation for a sentence

The input to PHONO1 consists of a phonemic pronunciation for each word (i.e., as
spoken in isolation), a lexical stress pattern, and syntactic information concerning
part of speech and phrasal structure. The output from PHONO1 consists of a
single string of symbols for each sentence.
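
As a rough illustration of this input and output, the sketch below models a
sentence as phrases of words and flattens it into one symbol string. Everything
concrete in it is hypothetical: the class names Word and Phrase, the function
to_symbol_string, the ARPAbet-like phoneme labels, the "'" stress mark, the "#"
word boundary, and the "<NP>" phrase marker are stand-ins and are not the
MITalk symbol inventory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    phonemes: List[str]            # pronunciation as spoken in isolation
    stressed_index: Optional[int]  # index of the stressed phoneme, or None
    part_of_speech: str            # hypothetical label, e.g. "NOUN"


@dataclass
class Phrase:
    kind: str          # hypothetical phrase label, e.g. "NP"
    words: List[Word]


def to_symbol_string(phrases: List[Phrase]) -> str:
    """Flatten a sentence into a single space-separated symbol string.

    Part of speech is carried in the input but not emitted in this sketch.
    """
    symbols = []
    for phrase in phrases:
        symbols.append(f"<{phrase.kind}>")  # assumed phrase marker
        for word in phrase.words:
            for i, phoneme in enumerate(word.phonemes):
                if i == word.stressed_index:
                    symbols.append("'")     # assumed stress mark
                symbols.append(phoneme)
            symbols.append("#")             # assumed word boundary
    return " ".join(symbols)


if __name__ == "__main__":
    sentence = [
        Phrase("NP", [
            Word(["DH", "AH"], None, "DET"),
            Word(["K", "AE", "T"], 1, "NOUN"),
        ]),
    ]
    print(to_symbol_string(sentence))  # <NP> DH AH # K ' AE T #
```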

The symbol inventory used in PHONO1 and PHONO2 is shown in Table 8-1,