From text to speech: The MITalk system

state, a picture of the input stream is shown using the metalanguage of the
grammar above and with “<>” marking the position in the stream represented by
the state. To the right of the marker is context represented by the state. To
the left is an expression representing the expected structure of the remainder
of the word.

    F0   word <> {INFL {suffix}}
    R0   (affixed-word | LF-ROOT) <> DERIV {suffix}
    R1   (affixed-word | LF-ROOT) <> DERIV effective-root
    M1   PREFIX <> RF-ROOT {suffix}
    L1   {affixed-word | PREFIX | INITIAL} <> effective-root {suffix}
    L0   {affixed-word | PREFIX | INITIAL} <> PREFIX effective-root {suffix}
    I0   {word HYPHEN} <> (ABSOLUTE | INITIAL affixed-word)

3.4.4 Selectional rules and scoring

When multiple morph coverings are found, selectional rules are needed to choose
the covering most likely to be correct. For example, a means of favoring
form+al+ly (ROOT + DERIV + DERIV) over form+ally (ROOT + ROOT) as the
decomposition of formally is needed. A set of derivational rules was devised by
examining all of the multiple coverings produced by DECOMP during the
development of the morph lexicon. The first result of this study was the
discovery of the so-called “standard form” for a (possibly compound) word,
stated below as two productions:

    std-root = (ROOT | LF-ROOT DERIV)

    std-form = {PREFIX} {std-root} (std-root {DERIV} | STRONG) {INFL}

Coverings which match this form are to be preferred above all others.

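As a rough illustration of the two productions, the following Python sketch tests whether a sequence of morph types matches the standard form. It rests on an assumption not spelled out in this section: that "{x}" in the metalanguage means zero or more repetitions of x and that "(a | b)" groups alternatives. That reading is consistent with PREFIX+PREFIX+ROOT and ROOT+DERIV+DERIV being treated as standard-form coverings. The names STD_ROOT, STD_FORM and matches_standard_form are hypothetical and are not taken from DECOMP.

```python
import re

# Assumption (not stated in this section): in the grammar metalanguage,
# "{x}" means zero or more x, and "(a | b)" groups alternatives.
# Morph types are joined with single spaces before matching.
STD_ROOT = r"(?:ROOT|LF-ROOT DERIV)"
STD_FORM = re.compile(
    rf"(?:PREFIX )*(?:{STD_ROOT} )*(?:{STD_ROOT}(?: DERIV)*|STRONG)(?: INFL)*"
)


def matches_standard_form(morph_types):
    """Return True if a sequence of morph-type names fits std-form."""
    return STD_FORM.fullmatch(" ".join(morph_types)) is not None


print(matches_standard_form(["ROOT", "DERIV", "DERIV"]))  # form+al+ly
print(matches_standard_form(["DERIV", "ROOT"]))           # no leading root
```

Under this reading, both form+al+ly and form+ally match the standard form, so the standard form alone cannot separate them.
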
Among coverings that match the standard form, the following partial orderings
were found (“>” means that the pattern on the left is more desirable):

    ROOT > anything else
    PREFIX+ROOT > ROOT+DERIV > ROOT+INFL > ROOT+ROOT
    PREFIX+PREFIX+ROOT > ROOT+ROOT
    ROOT+DERIV+DERIV > ROOT+ROOT

These rules are implemented by associating a cost with each transition of the
FSM and keeping track of the total cost of the decomposition as morphs are
stripped off the word. This cost is the “score value” mentioned above in the
algorithm description. The covering with the lowest total cost is the most
desirable.

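The lowest-total-cost selection can be sketched in Python as follows. This is a simplification under stated assumptions: the text attaches costs to FSM transitions and gives no numeric values here, whereas the sketch uses one hypothetical cost per morph type, chosen only so that the partial orderings listed above hold. The names COST, total_cost and best_covering are illustrative and not part of DECOMP.

```python
# Hypothetical costs, one per morph type. This section gives no numeric
# values, and the source attaches costs to FSM transitions, not to types.
# Chosen so that PREFIX < DERIV < INFL < ROOT, 2*PREFIX < ROOT and
# 2*DERIV < ROOT, which reproduces the orderings listed in the text.
COST = {"PREFIX": 1, "DERIV": 2, "INFL": 3, "ROOT": 5}


def total_cost(covering):
    """Sum the cost of every (morph, morph_type) pair in a covering."""
    return sum(COST[morph_type] for _morph, morph_type in covering)


def best_covering(coverings):
    """Return the covering with the lowest total cost."""
    return min(coverings, key=total_cost)


formally = [
    [("form", "ROOT"), ("ally", "ROOT")],
    [("form", "ROOT"), ("al", "DERIV"), ("ly", "DERIV")],
]
best = best_covering(formally)
print("+".join(morph for morph, _type in best))  # prints form+al+ly
```

With these illustrative values, form+ally costs 10 and form+al+ly costs 9, so the ROOT + DERIV + DERIV covering is selected, in line with the example given for formally.
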