The phrase-level parser
For a last suffix of 'S, three checks are performed. If the previous morph is a
NOUN, then the part-of-speech set is (NOUN (POSS TR), NOUN .... (CONTR
TR)), where “....” are the features that the previous morph had (e.g. (NUM PL)). If
the next to last morph is a PRN, then the part of speech is PRN with the previous
morph's features and the additional property (CONTR TR). If that morph also has
the property (PRNADJ TR), which includes the pronouns ending in body, one, and
thing, then the part-of-speech set also includes PRN with the prior morph's
features and the property (CASE POSS), as in anybody's.
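The 'S checks can be pictured as a small function. The following Python sketch
is only an illustration, not the original program: the name
apostrophe_s_readings, the (part of speech, features) tuple layout, and the
assumption that the previous morph arrives with a single reading are all
hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, str]
Reading = Tuple[str, Features]  # (part of speech, feature set)


def apostrophe_s_readings(prev_pos: str, prev_features: Features) -> List[Reading]:
    """Return the part-of-speech set for a word whose last suffix is 'S.

    Hypothetical helper: prev_pos and prev_features describe the morph
    that precedes the suffix, assumed here to have exactly one reading.
    """
    readings: List[Reading] = []
    if prev_pos == "NOUN":
        # Possessive reading: (NOUN (POSS TR)).
        readings.append(("NOUN", {"POSS": "TR"}))
        # Contraction reading: previous features plus (CONTR TR).
        readings.append(("NOUN", {**prev_features, "CONTR": "TR"}))
    elif prev_pos == "PRN":
        # Contracted pronoun: previous features plus (CONTR TR).
        readings.append(("PRN", {**prev_features, "CONTR": "TR"}))
        if prev_features.get("PRNADJ") == "TR":
            # Pronouns in -body, -one, -thing may also be possessive.
            readings.append(("PRN", {**prev_features, "CASE": "POSS"}))
    return readings
```

Under these assumptions, a plural noun followed by 'S yields two readings, and a
word such as anybody's yields a contracted and a possessive pronoun reading.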
The last three cases of the dispatch deal with contractions. If the last morph is
N'T, first the program checks if the previous morph is NEED. The part of speech
of needn't is MOD, and the features are (AUX A) and (NOT TR). If the next to
last morph has the part of speech BE, HAVE, or MOD, the processor just adds the
property (NOT TR). If the last morph is 'VE and the previous morph is a modal,
then the part of speech is the same as the previous morph with the additional
property (CONTR TR), as in must've. Finally, if the last morph is one of the verb
contractions 'VE, 'D, 'LL, and 'RE, the processor checks if the prior morph is the
plural morph S. (The kids've been busy. The boys'll go.) If so, the word's part
of speech is NOUN with the features (NUM PL) and (CONTR TR). Otherwise, if
the previous morph is a NOUN or PRN, the property (CONTR TR) is added to the
feature set.
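The contraction cases can be summarized as one dispatch function. This Python
sketch is an illustration under stated assumptions, not the original code: the
function name, the argument layout, and the convention that prev_morph equals
"S" only for the plural morph are hypothetical, and the order of the tests
simply follows the order of the description above.

```python
from typing import Dict, Optional, Tuple

Features = Dict[str, str]
Reading = Tuple[str, Features]  # (part of speech, feature set)

VERB_CONTRACTIONS = {"'VE", "'D", "'LL", "'RE"}


def contraction_reading(last: str, prev_morph: str, prev_pos: str,
                        prev_features: Features) -> Optional[Reading]:
    """Return the reading for a word ending in a contraction morph.

    Returns None when no rule applies, in which case a caller would
    fall back to other handling (assumption, not from the source).
    """
    if last == "N'T":
        if prev_morph == "NEED":
            # needn't is treated as a modal.
            return ("MOD", {"AUX": "A", "NOT": "TR"})
        if prev_pos in {"BE", "HAVE", "MOD"}:
            # Keep the reading and add (NOT TR).
            return (prev_pos, {**prev_features, "NOT": "TR"})
        return None
    if last == "'VE" and prev_pos == "MOD":
        # must've: same part of speech plus (CONTR TR).
        return (prev_pos, {**prev_features, "CONTR": "TR"})
    if last in VERB_CONTRACTIONS:
        if prev_morph == "S":
            # Assumed to be the plural morph, as in kids've.
            return ("NOUN", {"NUM": "PL", "CONTR": "TR"})
        if prev_pos in {"NOUN", "PRN"}:
            return (prev_pos, {**prev_features, "CONTR": "TR"})
    return None
```

For example, under these assumptions contraction_reading("'VE", "MUST", "MOD", {})
gives a MOD reading carrying only the (CONTR TR) property.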
If the last suffix is none of the above, then the part-of-speech set of the word
is the part-of-speech set of that morph. If a word still has no part of speech (e.g.
onlys), then the routine which assigns “default” parts of speech is called, as in the
case of no decomposition.
4.6 The parser algorithm
4.6.1 Parsing strategy
The parser reads information from DECOMP on the words in a text one sentence
at a time. It then attempts to find phrases in the sentence. The operation of the
parsing logic can be thought of as having two levels. The global level reflects the
parsing strategy, which has been found to give the best phrases. It is based on
three empirical facts:
1. There are many more noun groups (and prepositional phrases) than
verb groups in running text.
2. The initial portions of noun groups are easier to detect than verb
groups. Verb groups frequently begin with the verb itself, which often
has both NOUN and VERB in its possible part-of-speech set.