From text to speech: The MITalk system

prehension is the quality of the input signal expressed in terms of its overall
intelligibility. But as we have seen even from the results summarized in the
previous sections, additional consideration must also be given to the
contribution of higher-level sources of knowledge to recognition and
comprehension. In this last section, we wanted to obtain some preliminary
estimate of how well listeners could comprehend continuous text produced by the
text-to-speech system. Previous evaluations of synthetic speech output have been
concerned primarily with measuring intelligibility or listener preferences with
little if any concern for assessing comprehension or understanding of the
content of the materials (Nye et al., 1975). Indeed, as far as we have been able
to determine, no previous formal tests of the comprehension of continuous
synthetic speech have ever been carried out with a relatively wide range of
textual materials specifically designed to assess understanding of the content
rather than form of the speech.

To accomplish this goal, we selected fifteen narrative passages and an
appropriate set of test questions from several standardized adult reading
comprehension tests. The passages were quite diverse, covering a wide range of
topics, writing styles and vocabulary. We thought that a large number of
passages would be interesting to listen to in the context of tests designed to
assess comprehension and understanding. Since these test passages were selected
from several different types of reading tests, they also varied in difficulty
and style, permitting us to evaluate the contribution of all of the individual
modules of the text-to-speech system in terms of one relatively gross measure.

In addition to securing measures of listening comprehension for these passages,
we also collected a parallel set of data on reading comprehension of these
materials from a second group of subjects. The subjects in the reading
comprehension group answered the same questions after reading each passage
silently, as did subjects in the listening comprehension group. This condition
was included in order to permit comparison between the two input modalities. It
was assumed that the results of these comprehension tests would therefore
provide an initial, although preliminary, benchmark against which the entire
text-to-speech system could be evaluated with materials somewhat comparable to
those used in the immediate future.

13.4.1 Method

13.4.1.1 Subjects

Forty-four additional naive undergraduate students were recruited as paid
subjects. They were drawn from the same source as the subjects used in the
previous studies. Some of the subjects assigned to the reading comprehension
group had participated in the earlier study using the Modified Rhyme