Morphological analysis
find a set of possible spelling changes¹ at the right end of the
remainder,
attempt a recursive decomposition for each spelling variation,
save the results of the best-scoring of these variations,
restore the remainder string, state, and score to their original
values.
ENDIF,
find the next longest morph which matches the right end of the string.
END WHILE.
The decision to search from the right end of the word was made early in the
development of the system before the selectional rules were implemented. It was
observed that the best decomposition was found first by stripping off suffixes
before searching for roots and prefixes. When a later algorithm was developed in
which all decompositions were found and a choice made, the strategy was retained.
Since only the decomposition with the best score is kept while searching for other
possible morph coverings, finding the best decomposition early in the search is still
more efficient; potential coverings with worse scores can be discarded as early as
possible.
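
The right-to-left strategy described above can be illustrated with a minimal Python sketch. It is not the DECOMP implementation: the toy lexicon, the two spelling-change rules, and the score (the number of morphs, with fewer taken as better) are all assumptions made for illustration, and morph types are ignored. The names decompose, spelling_variants, and LEXICON are hypothetical.

```python
from typing import List, Optional

# Hypothetical toy lexicon; the real lexicon also carries morph type codes.
LEXICON = {"dis", "cover", "ed", "s", "spell", "ing", "make", "side"}


def spelling_variants(remainder: str) -> List[str]:
    """Candidate spellings for the right end of the remainder.

    The two rules are assumed examples only. The unchanged spelling
    is always the first candidate.
    """
    variants = [remainder]
    if remainder:
        variants.append(remainder + "e")        # restore a dropped final e
        if len(remainder) >= 2 and remainder[-1] == remainder[-2]:
            variants.append(remainder[:-1])     # undo consonant doubling
    return variants


def decompose(remainder: str, limit: float = float("inf")) -> Optional[List[str]]:
    """Return the covering with the fewest morphs, or None.

    Only coverings with fewer than `limit` morphs are accepted, so a good
    covering found early lets later, worse candidates be dropped quickly.
    """
    if remainder == "":
        return [] if limit > 0 else None
    if limit <= 1:
        return None  # a non-empty remainder needs at least one morph
    best = None
    # Longest matching morph at the right end first, then shorter ones.
    for length in range(len(remainder), 0, -1):
        morph = remainder[-length:]
        if morph not in LEXICON:
            continue
        rest = remainder[:-length]
        for variant in spelling_variants(rest):
            if len(variant) >= len(remainder):
                continue  # guard so the recursion always shrinks
            covering = decompose(variant, limit - 1)
            if covering is not None:
                best = covering + [morph]
                limit = len(best)  # later candidates must beat this score
    return best
```

With this toy lexicon, decompose("discovered") returns ["dis", "cover", "ed"] and decompose("making") returns ["make", "ing"]; a string with no covering, such as "xyz", yields None. Because the suffix is stripped first and the bound tightens as soon as a covering is found, the recursive calls for the remaining spelling variants return almost immediately.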
3.4.2 Morph types
Not all sequences of morphs are legal in the English language. For this reason
(and later, for scoring multiple coverings) each morph in the lexicon has a type
code. These morph type codes refine the coarse categories of “prefix”, “suffix”,
and “root” to obtain better performance in finding the correct covering.
The morph type “FREE ROOT” (or simply “ROOT”) denotes a word which
can appear alone or with suffixes, prefixes, and/or other ROOTs. Typical ROOTs
are: side, cover, and spell. The type “ABSOLUTE” is assigned to words which do
not allow most affixes (suffixes or prefixes). These are words such as the, into, of,
and proper names. (The few affixes permitted are the inflectional suffixes such as
plural and possessive forms.) This type is essential in preventing DECOMP from
attempting to match the morphs a and I in many words.
Most prefixes have the type “PREFIX”, which denotes a prefix that can combine
with roots and other prefixes. Examples are: pre, dis, and mis. The remaining
prefixes can only occur at the beginning of a word and are classified as
“INITIAL”. Examples are meta and centi.
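
As a rough illustration of how type codes can rule out illegal morph sequences, the sketch below encodes only the four root and prefix types named so far. Suffix types are left out entirely, and the specific checks (an ABSOLUTE morph must stand alone, and a sequence must end in a ROOT) are simplifying assumptions rather than the selectional rules of the system. The names MorphType, LEXICON_TYPES, and is_legal are hypothetical.

```python
from enum import Enum, auto
from typing import List


class MorphType(Enum):
    ROOT = auto()      # free root: alone or with affixes and other roots
    ABSOLUTE = auto()  # allows almost no affixes
    PREFIX = auto()    # combines with roots and other prefixes
    INITIAL = auto()   # prefix allowed only at the start of a word


# Example entries taken from the text; a real lexicon is far larger.
LEXICON_TYPES = {
    "side": MorphType.ROOT, "cover": MorphType.ROOT, "spell": MorphType.ROOT,
    "the": MorphType.ABSOLUTE, "into": MorphType.ABSOLUTE,
    "of": MorphType.ABSOLUTE, "a": MorphType.ABSOLUTE,
    "pre": MorphType.PREFIX, "dis": MorphType.PREFIX, "mis": MorphType.PREFIX,
    "meta": MorphType.INITIAL, "centi": MorphType.INITIAL,
}


def is_legal(morphs: List[str]) -> bool:
    """Check a suffix-free morph sequence against the assumed rules."""
    types = [LEXICON_TYPES[m] for m in morphs]
    if not types:
        return False
    if MorphType.ABSOLUTE in types:
        return len(types) == 1       # assumed: ABSOLUTE stands alone
    if MorphType.INITIAL in types[1:]:
        return False                 # INITIAL only at the word start
    return types[-1] is MorphType.ROOT  # assumed: prefixes need a root
```

Under these assumptions, is_legal(["dis", "cover"]) and is_legal(["meta", "cover"]) hold, while is_legal(["a", "side"]) fails because a is ABSOLUTE, which is the kind of spurious match the ABSOLUTE type is meant to block.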
Suffixes are classified using two different criteria yielding a total of four
¹ Note that unchanged spelling is always one of these possibilities.