4 The phrase-level parser

4.1 Overview

The parser for the text-to-speech system is designed to satisfy a unique set of
constraints. It must be able to handle arbitrary text quickly, but does not need
to derive semantic information. Many parsers attempt to build a deep structure
parse from the input sentence so that semantic information may be derived for
such uses as question-answering systems. The text-to-speech parser supplies a
surface structure parse, providing information for algorithms which produce
prosodic effects in the output speech. In addition, some clause boundaries are
set according to rules described in Chapter 8. These phrase-level and
clause-level structures provide much of the syntactic information needed by the
present prosodic algorithms.

It is well known that parsers applied to unrestricted text often produce
numerous ambiguous or failed parses. Although it is always possible to choose
arbitrarily among ambiguous parsings, a failed parse is unacceptable in the
text-to-speech system. When one examines ambiguous results from full
sentence-level parsers, one finds that the bottom-level nodes (i.e., the phrase
nodes) are often invariant among the competing interpretations; the ambiguities
arise from possible groupings of these nodes at the clause level, especially for
parsers which build binary trees. One also finds that for many failed parses,
much of the structure at the phrase level has been correctly determined. The
phrase-level parser takes advantage of this reliability, producing as many
phrase nodes as possible for use by the MITalk prosodic component.

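The invariance of phrase nodes across competing parses can be illustrated with
a small sketch. Everything in it is an assumption made for illustration: the
example sentence, the node labels (NG, VG, PP, NP, VP, S), the flat treatment
of the prepositional group, and the helper phrase_nodes. None of it is taken
from the MITalk grammar.

```python
# Trees are (label, child, child, ...) tuples; a word is a plain string.
# Labels and sentence are illustrative assumptions, not MITalk's own.

def phrase_nodes(tree):
    """Return the lowest-level nodes (all children are words), left to right."""
    label, *children = tree
    if all(isinstance(child, str) for child in children):
        return [(label, tuple(children))]
    nodes = []
    for child in children:
        if not isinstance(child, str):
            nodes.extend(phrase_nodes(child))
    return nodes

# Reading A: the prepositional group attaches to the verb phrase.
parse_a = ("S",
           ("NG", "the", "man"),
           ("VP",
            ("VG", "saw"),
            ("NG", "the", "girl"),
            ("PP", "with", "the", "telescope")))

# Reading B: the prepositional group attaches to the object noun phrase.
parse_b = ("S",
           ("NG", "the", "man"),
           ("VP",
            ("VG", "saw"),
            ("NP",
             ("NG", "the", "girl"),
             ("PP", "with", "the", "telescope"))))

# The two trees differ, yet their phrase-level nodes are identical.
same_phrases = phrase_nodes(parse_a) == phrase_nodes(parse_b)
```

The two readings differ only in how the groups are combined above the phrase
level, which is the part a phrase-level parser never has to decide.
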
The phrase-level parser uses comparatively few resources and runs in real time.
This is quite unusual for parsers which handle unrestricted text, but is
necessary for a text-to-speech system. It would not be possible in such a
practical system to allocate the resources needed for recursion in the grammar
and for backtracking control structures. Since extensive backtracking occurs
above the phrase level for the most part, the combinatorial explosion
associated with this strategy is avoided by restriction to phrase-level parsing.

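To make the resource argument concrete, here is a minimal sketch of phrase
grouping done in a single left-to-right pass, with no recursion and no
backtracking. It is a deliberately simplified stand-in and not the MITalk
parser: the tag names (DET, ADJ, NOUN, AUX, VERB, PREP), the labels NG and VG,
and the function group_phrases are all hypothetical.

```python
# Hypothetical part-of-speech tags that may appear inside each group.
NOUN_GROUP_TAGS = {"DET", "ADJ", "NOUN"}
VERB_GROUP_TAGS = {"AUX", "VERB"}

def group_phrases(tagged_words):
    """Group (word, tag) pairs into phrases in one pass.

    Each word is examined exactly once. Words that fit no group are passed
    through with the label None, so the function never fails on its input.
    """
    phrases = []
    current_label, current_words = None, []
    for word, tag in tagged_words:
        if tag in NOUN_GROUP_TAGS:
            label = "NG"
        elif tag in VERB_GROUP_TAGS:
            label = "VG"
        else:
            label = None
        # Close the open phrase when the label changes; ungrouped words
        # always stand alone.
        if label != current_label or label is None:
            if current_words:
                phrases.append((current_label, current_words))
            current_label, current_words = label, []
        current_words.append(word)
    if current_words:
        phrases.append((current_label, current_words))
    return phrases

example = [("the", "DET"), ("old", "ADJ"), ("man", "NOUN"),
           ("has", "AUX"), ("seen", "VERB"),
           ("in", "PREP"), ("the", "DET"), ("park", "NOUN")]
result = group_phrases(example)
```

Because the loop never revisits a word, running time grows linearly with the
length of the text. The sketch cannot separate two adjacent noun groups or
enforce word order within a group; a real phrase grammar handles such cases.
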
Phrase recognition is accomplished via an ATN (augmented transition network)
interpreter (Woods, 1970) and the grammars for noun groups and verb groups. A
“noun group” (NGR), as used in this grammar, means either a pronoun