From text to speech: The MITalk system
morph types. The first criterion is functional and divides suffixes into derivational (“DERIVATIONAL” or “DERIV”) and inflectional (“INFLECTIONAL” or “INFL”) types. Derivational suffixes have a major effect on the meaning of a root and may change the part of speech (e.g. ness, ment, y). Inflectional suffixes merely change the tense, number, or inflection of the root (e.g. ing, ed, s). This classification is used primarily by the scoring algorithm.
The other suffix classification is used solely by the spelling change rules. It divides suffixes into vocalic and nonvocalic categories depending on whether the suffix begins with a vowel or a consonant, respectively. The type names are “VOCALIC” (or “VOC”) and “NONVOCALIC” (or “NONV”).
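Because the vocalic/nonvocalic split depends only on the first letter of the suffix, it can be sketched in a few lines. The Python below is an illustrative sketch under stated assumptions, not the MITalk implementation: the function name `suffix_spelling_class` is hypothetical, and counting y as a vowel is an assumption made here rather than something the text specifies.

```python
# Illustrative sketch only; the function name and vowel set are assumptions.
VOWELS = set("aeiouy")  # assumption: y is counted as a vowel


def suffix_spelling_class(suffix: str) -> str:
    """Return "VOCALIC" if the suffix begins with a vowel, else "NONVOCALIC"."""
    if not suffix:
        raise ValueError("suffix must be non-empty")
    return "VOCALIC" if suffix[0].lower() in VOWELS else "NONVOCALIC"
```

Under these assumptions, ing and ed would be classed as vocalic, while ness, ment, and s would be classed as nonvocalic.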
The “STRONG” morph type denotes a root which already contains tense or number information. This type of morph is a combination of root and inflectional morphemes which are not reflected directly in the morph structure. Examples are went (go+PAST) and women (woman+PLURAL).
In addition to free roots, there are two types of bound roots. The “LEFT FUNCTIONAL ROOT” (or “LF-ROOT”) is a root which must always be followed by a derivational suffix. An example is absorpt in absorptive and absorption. In this case (as with many LF-ROOTs), the morph represents a suffix-caused spelling mutation of a root morpheme which is too complex or idiosyncratic for the spelling change rules to incorporate (e.g. absorb+ive → absorptive). A “RIGHT FUNCTIONAL ROOT” (or “RF-ROOT”) must always be preceded by a prefix. An example is mit in permit, transmit, and submit. These morphs generally have some etymological basis and are not simply repeated letter patterns. For example, the root mit is derived from the Latin mittere (“to send”); the root itself simply never became an independent English word, and its meaning is overlooked by the average speaker.
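The two bound-root constraints can be illustrated with a small adjacency check. This is a minimal sketch under stated assumptions, not the system's finite state machine: the function name `bound_roots_ok`, the `(spelling, type)` tuple representation, and the type label "PREFIX" are hypothetical, and the check assumes that “followed by” and “preceded by” mean immediately adjacent.

```python
from typing import List, Tuple

# Hypothetical representation: each morph is a (spelling, type) pair.
Morph = Tuple[str, str]


def bound_roots_ok(morphs: List[Morph]) -> bool:
    """Check only the two bound-root constraints described in the text."""
    for i, (_, mtype) in enumerate(morphs):
        if mtype == "LF-ROOT":
            # An LF-ROOT must be followed by a derivational suffix
            # (assumed here to mean the very next morph).
            if i + 1 >= len(morphs) or morphs[i + 1][1] != "DERIV":
                return False
        elif mtype == "RF-ROOT":
            # An RF-ROOT must be preceded by a prefix
            # ("PREFIX" is an assumed type label).
            if i == 0 or morphs[i - 1][1] != "PREFIX":
                return False
    return True
```

For instance, `[("absorpt", "LF-ROOT"), ("ive", "DERIV")]` passes this check, while a bare `[("mit", "RF-ROOT")]` does not. The check says nothing about other sequence restrictions.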
The hyphen (-) has its own morph type, “HYPHEN”. This is provided so that hyphenated words which do not appear directly in the lexicon can be properly decomposed.
3.4.3 Legal morph sequences
The detection of legal and illegal morph sequences is performed by a finite state machine (FSM).
The grammar recognized by the FSM is summarized in the production rules below:¹
¹ These use Wirth’s notation: = for production, [ ] for optional factors (zero or one repetition), { } for repeated factors (zero or more repetitions), ( ) for grouping, and | for alternatives.