From text to speech: The MITalk system
morph types. The first criterion is functional and divides suffixes into derivational
(“DERIVATIONAL” or “DERIV”) and inflectional (“INFLECTIONAL” or
“INFL”) types. Derivational suffixes have a major effect on the meaning of a root
and may change the part of speech (e.g. ness, ment, y). Inflectional suffixes
merely change the tense, number, or inflection of the root (e.g. ing, ed, s). This
classification is used primarily by the scoring algorithm.
The other suffix classification is used solely by the spelling change rules.
This divides suffixes into vocalic and nonvocalic categories depending on whether
the suffix begins with a vowel or consonant, respectively. The type names are
“VOCALIC” (or “VOC”) and “NONVOCALIC” (or “NONVOC”).
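The two suffix classifications above are independent of each other, which a short sketch can make concrete. The following Python is only an illustration, not the MITalk implementation: the function names are hypothetical, the suffix sets contain just the examples quoted in the text, and treating y as a vowel letter is an assumption.

```python
# Illustrative sketch only; the helper names are hypothetical, not from MITalk.
DERIVATIONAL = {"ness", "ment", "y"}  # derivational examples quoted in the text
INFLECTIONAL = {"ing", "ed", "s"}     # inflectional examples quoted in the text
VOWEL_LETTERS = set("aeiouy")         # assumption: y is treated as a vowel letter


def functional_type(suffix: str) -> str:
    """Functional classification (used primarily by the scoring algorithm)."""
    if suffix in DERIVATIONAL:
        return "DERIVATIONAL"
    if suffix in INFLECTIONAL:
        return "INFLECTIONAL"
    raise KeyError(f"suffix not in the example sets: {suffix!r}")


def spelling_type(suffix: str) -> str:
    """Spelling classification (used by the spelling change rules)."""
    first = suffix[0].lower()
    return "VOCALIC" if first in VOWEL_LETTERS else "NONVOCALIC"


if __name__ == "__main__":
    for s in ("ness", "ment", "ing", "ed", "s"):
        print(s, functional_type(s), spelling_type(s))
```

In this sketch a suffix such as ing is both INFLECTIONAL and VOCALIC, while ness is both DERIVATIONAL and NONVOCALIC.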
The “STRONG” morph type denotes a root which already contains tense or
number information. This type of morph is a combination of root and inflectional
morphemes which are not reflected directly in the morph structure. Examples are
went (go+PAST) and women (woman+PLURAL).
In addition to free roots, there are two types of bound roots. The “LEFT
FUNCTIONAL ROOT” (or “LF-ROOT”) is a root which must always be followed
by a derivational suffix. An example is absorpt in absorptive and absorption. In
this case (as with many LF-ROOTs), the morph represents a suffix-caused spelling
mutation of a root morpheme which is too complex or idiosyncratic for the spelling
change rules to incorporate (e.g. absorb+ive → absorptive). A “RIGHT FUNCTIONAL
ROOT” (or “RF-ROOT”) must always be preceded by a prefix. An example
is mit in permit, transmit, and submit. These morphs generally have some
etymological basis (and are not simply repeated letter patterns). For example, the
root mit is derived from the Latin mittere (to send); it is just that the root itself
never became part of the English language and its meaning is overlooked by the
average speaker.
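The two bound-root constraints just described can be sketched as a simple neighbour check. This is a minimal illustration under stated assumptions, not the system's finite state machine or its full grammar: each morph is reduced to a single type tag, the function name is hypothetical, and the "PREFIX" tag is an assumed label for prefix morphs.

```python
# Illustrative sketch only: checks the two bound-root constraints from the text.
# Assumptions: one type tag per morph; "PREFIX" is an assumed tag name;
# bound_roots_ok is a hypothetical helper, not part of MITalk.
from typing import Sequence


def bound_roots_ok(types: Sequence[str]) -> bool:
    """Return True if every bound root has the neighbour it requires."""
    for i, tag in enumerate(types):
        if tag == "LF-ROOT":
            # An LF-ROOT must be followed by a derivational suffix.
            following = types[i + 1] if i + 1 < len(types) else None
            if following != "DERIV":
                return False
        elif tag == "RF-ROOT":
            # An RF-ROOT must be preceded by a prefix.
            preceding = types[i - 1] if i > 0 else None
            if preceding != "PREFIX":
                return False
    return True


if __name__ == "__main__":
    print(bound_roots_ok(["LF-ROOT", "DERIV"]))   # absorpt + ive
    print(bound_roots_ok(["PREFIX", "RF-ROOT"]))  # per + mit
    print(bound_roots_ok(["RF-ROOT"]))            # mit alone
```

A bare RF-ROOT such as mit on its own, or an LF-ROOT followed by an inflectional suffix, fails the check.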
The hyphen (-) has its own morph type “HYPHEN”. This is provided so that
hyphenated words which do not appear directly in the lexicon can be properly
decomposed.
3.4.3 Legal morph sequences
The detection of legal and illegal morph sequences is performed by a finite state
machine (FSM).
The grammar recognized by the FSM is summarized in the production rules
below:¹
¹ These use Wirth's notation: = for production, [ ] for optional factors (zero or one rep.), { } for
repeated factors (zero to infinite repetition), ( ) for grouping, and | for alternatives.