The phrase-level parser
For a last suffix of ’S, three checks are performed. If the previous morph is a NOUN, then the part-of-speech set is (NOUN (POSS TR), NOUN .... (CONTR TR)), where “....” are the features that the previous morph had (e.g. (NUM PL)). If the next-to-last morph is a PRN, then the part of speech is PRN with the previous morph’s features and the additional property (CONTR TR). If that morph also has the property (PRNADJ TR), which includes the pronouns ending in body, one, and thing, then the part-of-speech set also includes PRN with the prior morph’s features and the property (CASE POSS), as in anybody’s.
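The three checks for ’S can be sketched in Python as follows. This is only an illustration under stated assumptions: the Morph class, the tuple encoding of features, and the function name are hypothetical and are not taken from the original program. It also assumes each morph carries a single part of speech, and that the (POSS TR) reading for nouns carries no inherited features, as the text literally shows.

```python
from dataclasses import dataclass
from typing import List, Tuple

Feature = Tuple[str, str]
Reading = Tuple[str, Tuple[Feature, ...]]


@dataclass(frozen=True)
class Morph:
    """Hypothetical record for one morph: text, part of speech, features."""
    text: str
    pos: str
    features: Tuple[Feature, ...] = ()


def possessive_s_readings(prev: Morph) -> List[Reading]:
    """Candidate readings for a word whose last suffix is 'S.

    `prev` is the next-to-last morph of the word.
    """
    readings: List[Reading] = []
    # Check 1: NOUN + 'S is either a possessive or a contraction.
    if prev.pos == "NOUN":
        readings.append(("NOUN", (("POSS", "TR"),)))
        readings.append(("NOUN", prev.features + (("CONTR", "TR"),)))
    # Check 2: PRN + 'S is a contraction that keeps the pronoun's features.
    if prev.pos == "PRN":
        readings.append(("PRN", prev.features + (("CONTR", "TR"),)))
        # Check 3: pronouns marked (PRNADJ TR) may also be possessive.
        if ("PRNADJ", "TR") in prev.features:
            readings.append(("PRN", prev.features + (("CASE", "POSS"),)))
    return readings


if __name__ == "__main__":
    anybody = Morph("ANYBODY", "PRN", (("PRNADJ", "TR"),))
    for reading in possessive_s_readings(anybody):
        print(reading)
```

An empty result list corresponds to a word that still has no part of speech after the dispatch, as with only’s.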
The last three cases of the dispatch deal with contractions. If the last morph is N’T, first the program checks if the previous morph is NEED. The part of speech of needn’t is MOD, and the features are (AUX A) and (NOT TR). If the next-to-last morph has the part of speech BE, HAVE, or MOD, the processor just adds the property (NOT TR). If the last morph is ’VE and the previous morph is a modal, then the part of speech is the same as the previous morph with the additional property (CONTR TR), as in must’ve. Finally, if the last morph is one of the verb contractions ’VE, ’D, ’LL, and ’RE, the processor checks if the prior morph is the plural morph S. (The kids’ve been busy. The boys’ll go.) If so, the word’s part of speech is NOUN with the features (NUM PL) and (CONTR TR). Otherwise, if the previous morph is a NOUN or PRN, the property (CONTR TR) is added to the feature set.
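The contraction cases can be read as an ordered dispatch, sketched below in Python. The sketch makes several assumptions that are not in the original: the Morph class and the function name are hypothetical, suffixes are written with ASCII apostrophes, the plural morph is recognized by its text S alone, and the part-of-speech labels "VERB", "SUFFIX", and "ADV" used in the demonstration are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Feature = Tuple[str, str]
Reading = Tuple[str, Tuple[Feature, ...]]


@dataclass(frozen=True)
class Morph:
    """Hypothetical record for one morph: text, part of speech, features."""
    text: str
    pos: str
    features: Tuple[Feature, ...] = ()


VERB_CONTRACTIONS = {"'VE", "'D", "'LL", "'RE"}


def contraction_reading(prev: Morph, last: str) -> Optional[Reading]:
    """Reading for a word ending in a contraction suffix, or None."""
    if last == "N'T":
        # NEED is checked first: needn't is a modal.
        if prev.text == "NEED":
            return ("MOD", (("AUX", "A"), ("NOT", "TR")))
        # BE, HAVE, and MOD simply gain the negation property.
        if prev.pos in {"BE", "HAVE", "MOD"}:
            return (prev.pos, prev.features + (("NOT", "TR"),))
        return None
    # Modal + 'VE keeps the modal's part of speech (must've).
    if last == "'VE" and prev.pos == "MOD":
        return (prev.pos, prev.features + (("CONTR", "TR"),))
    if last in VERB_CONTRACTIONS:
        # Plural noun + contraction (kids've, boys'll).
        if prev.text == "S":
            return ("NOUN", (("NUM", "PL"), ("CONTR", "TR")))
        # Any other noun or pronoun gains the contraction property.
        if prev.pos in {"NOUN", "PRN"}:
            return (prev.pos, prev.features + (("CONTR", "TR"),))
    return None


if __name__ == "__main__":
    print(contraction_reading(Morph("NEED", "VERB"), "N'T"))
    print(contraction_reading(Morph("MUST", "MOD"), "'VE"))
    print(contraction_reading(Morph("S", "SUFFIX"), "'LL"))
```

Testing NEED before the general BE, HAVE, or MOD test follows the order given in the text; a None result stands for a word that is left without a part of speech by these cases.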
If the last suffix is none of the above, then the part-of-speech set of the word is the part-of-speech set of that morph. If a word still has no part of speech (e.g. only’s), then the routine which assigns “default” parts of speech is called, as in the case of no decomposition.
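The fallback order just described can be summarized in a short Python sketch. The function and parameter names are hypothetical, and the default routine is passed in as a callable because its behavior is not specified in this passage.

```python
from typing import Callable, List, Optional


def final_pos_set(
    special: Optional[List[str]],
    last_morph_pos: List[str],
    assign_default: Callable[[], List[str]],
) -> List[str]:
    """Choose the word's part-of-speech set.

    `special` is None when the last suffix matched none of the special
    cases; otherwise it holds whatever those cases produced, which may
    be an empty list.
    """
    if special is None:
        # No special case applied: inherit the last morph's set.
        readings = list(last_morph_pos)
    else:
        readings = list(special)
    # Still nothing (e.g. only's): fall back to the default routine.
    return readings if readings else assign_default()


if __name__ == "__main__":
    # "DEFAULT" is a stand-in label, not an actual part of speech.
    print(final_pos_set(None, ["NOUN", "VERB"], lambda: ["DEFAULT"]))
    print(final_pos_set([], ["ADV"], lambda: ["DEFAULT"]))
```

Note that a special case that applied but produced nothing goes straight to the default routine rather than inheriting the last morph's set, which is how the only’s example is read here.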
4.6 The parser algorithm
4.6.1 Parsing strategy
The parser reads information from DECOMP on the words in a text, one sentence at a time. It then attempts to find phrases in the sentence. The operation of the parsing logic can be thought of as having two levels. The global level reflects the parsing strategy, which has been found to give the best phrases. It is based on three empirical facts:
1. There are many more noun groups (and prepositional phrases) than verb groups in running text.
2. The initial portions of noun groups are easier to detect than verb groups. Verb groups frequently begin with the verb itself, which often has both NOUN and VERB in its possible part-of-speech set.