From text to speech: The MITalk system
provides for important pronunciation effects due to morph structure, and sets an
appropriate basis for the formulation of a well-motivated set of letter-to-sound
rules devoid of ad hoc exceptions.
So far, we have shown how the use of a morph lexicon and accompanying
morph analysis procedures provides a sound solution to the accurate translation of
English word letter strings to sequences of phonetic segment labels. It is important
to realize, however, that morphs are just the surface realization of underlying
morphemes, and the distinction between these two units must be maintained.
Morphemes are abstract units, and they exist only for purposes of grammatical or
distributional equivalence. Their use recognizes that words have internal structure,
and that the components of this structure are the constituent morphemes of the
word. Historically, morphemes were introduced to define phonological units where
segmentability was possible, as in the sequence tall, taller, tallest. But there is
nothing in the definition of a morpheme to imply that it must always be an identifi-
able segment of the word of which it is a constituent. The morpheme is not a seg-
ment of a word, and it has no position in a word. It is an abstract unit arising from
linguistic distributional analysis. This can be seen clearly by comparing the words
went and walked. In the latter word, it is easy to see that there are two constituent
morphs, walk and ed, which are in one-to-one correspondence with the underlying
abstract morphemes walk and PAST. But in the case of went, the underlying mor-
phemic analysis provides the two morphemes go and PAST, and it is impossible to
map these in any nonarbitrary way onto the surface letter string went. When seg-
mentation is possible, as is often the case, then morphs can be identified, and
MITalk exploits this fact. For the cases where a root is given a grammatical inflec-
tion, as in went, MITalk provides a special morph type, STRONG, that indicates
the presence of the two underlying morphemes. Clearly went must go in the morph
lexicon, as it is an exception to the normal processes of affixation and compound-
ing. Additionally, the morpheme PLURAL provides ample evidence of the many
ways in which it may be realized on the surface. We have the pairs boy/boys,
thief/thieves, child/children, tooth/teeth, and fish/fish, as well as many borrowed
pairs from other languages such as concerto/concerti, datum/data, index/indices,
and alumnus/alumni. These irregular plurals must be placed in the lexicon, since
MITalk can only deal with morphs that can be found through detection of the
regular and productive word formation processes that are susceptible to segmen-
tation. Many of the analysis procedures of MITalk are based on the underlying
morphemic constituency of a letter string, although only morphs can be exhibited
as letter strings or can occur in the lexicon.
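
The distinction between surface morphs and underlying morphemes can be illustrated with a small sketch. It is not MITalk's actual lexicon format or analysis algorithm: the `Morph` record, the `LEXICON` table, the `analyze` function, and the type labels ROOT and SUFFIX are all hypothetical names chosen for illustration. Only the STRONG type is taken from the text, and tagging an irregular plural such as children as STRONG is likewise an assumption, since the text says only that such forms must be listed in the lexicon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morph:
    spelling: str      # surface letter string, as stored in the lexicon
    kind: str          # "ROOT" and "SUFFIX" are hypothetical labels; "STRONG" is from the text
    morphemes: tuple   # underlying abstract morphemes realized by this morph


# Toy lexicon (hypothetical contents). Only letter strings can be keys;
# the abstract morphemes appear solely as annotations on each entry.
LEXICON = {
    "walk": Morph("walk", "ROOT", ("walk",)),
    "boy": Morph("boy", "ROOT", ("boy",)),
    "ed": Morph("ed", "SUFFIX", ("PAST",)),
    "s": Morph("s", "SUFFIX", ("PLURAL",)),
    # Unsegmentable forms are listed whole and carry two morphemes.
    "went": Morph("went", "STRONG", ("go", "PAST")),
    # Assumption: irregular plurals are tagged STRONG as well.
    "children": Morph("children", "STRONG", ("child", "PLURAL")),
}


def analyze(word):
    """Return (morphs, morphemes) for word, or None if no analysis is found."""
    # Whole-word lookup first, so exceptions such as "went" are caught
    # before any attempt at segmentation.
    entry = LEXICON.get(word)
    if entry is not None and entry.kind != "SUFFIX":
        return [entry.spelling], list(entry.morphemes)
    # Otherwise try every split into a listed root plus a listed suffix.
    for cut in range(1, len(word)):
        root = LEXICON.get(word[:cut])
        suffix = LEXICON.get(word[cut:])
        if root and suffix and root.kind == "ROOT" and suffix.kind == "SUFFIX":
            morphs = [root.spelling, suffix.spelling]
            return morphs, list(root.morphemes + suffix.morphemes)
    return None
```

Under these assumptions, `analyze("walked")` yields two morphs in one-to-one correspondence with two morphemes, while `analyze("went")` yields a single morph that carries both go and PAST.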