From text to speech: The MITalk system

provides for important pronunciation effects due to morph structure, and sets an
appropriate basis for the formulation of a well-motivated set of letter-to-sound
rules devoid of ad hoc exceptions.

So far, we have shown how the use of a morph lexicon and accompanying
morph analysis procedures provides a sound solution to the accurate translation of
English word letter strings to sequences of phonetic segment labels. It is important
to realize, however, that morphs are just the surface realization of underlying
morphemes, and the distinction between these two units must be maintained.
Morphemes are abstract units, and they exist only for purposes of grammatical or
distributional equivalence. Their use recognizes that words have internal structure,
and that the components of this structure are the constituent morphemes of the
word. Historically, morphemes were introduced to define phonological units where
segmentability was possible, as in the sequence tall, taller, tallest. But there is
nothing in the definition of a morpheme to imply that it must always be an
identifiable segment of the word of which it is a constituent. The morpheme is not a
segment of a word, and it has no position in a word. It is an abstract unit arising from
linguistic distributional analysis. This can be seen clearly by comparing the words
went and walked. In the latter word, it is easy to see that there are two constituent
morphs, walk and ed, which are in one-to-one correspondence with the underlying
abstract morphemes walk and PAST. But in the case of went, the underlying
morphemic analysis provides the two morphemes go and PAST, and it is impossible to
map these in any nonarbitrary way onto the surface letter string went. When
segmentation is possible, as is often the case, then morphs can be identified, and
MITalk exploits this fact. For the cases where a root is given a grammatical
inflection, as in went, MITalk provides a special morph type, STRONG, that indicates
the presence of the two underlying morphemes. Clearly went must go in the morph
lexicon, as it is an exception to the normal processes of affixation and
compounding. Additionally, the morpheme PLURAL provides ample evidence of the many
ways in which it may be realized on the surface. We have the pairs boy/boys,
thief/thieves, child/children, tooth/teeth, and fish/fish, as well as many borrowed
pairs from other languages such as concerto/concerti, datum/data, index/indices,
and alumnus/alumni. These irregular plurals must be placed in the lexicon, since
MITalk can only deal with morphs that can be found through detection of the
regular and productive word formation processes that are susceptible to
segmentation. Many of the analysis procedures of MITalk are based on the underlying
morphemic constituency of a letter string, although only morphs can be exhibited
as letter strings or can occur in the lexicon.
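
The division of labor described here, with irregular forms stored whole in the lexicon and regular forms recovered by segmentation, might be sketched as follows. This is a minimal Python illustration under stated assumptions, not MITalk's actual implementation. The Entry class, the analyze function, the ROOT type name, the toy lexicon contents, and the tagging of irregular plurals as STRONG are all hypothetical. Only the STRONG morph type and the example words come from the text. The sketch also ignores spelling changes at morph boundaries.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """One lexicon entry: a surface morph and the morphemes it realizes."""
    morph_type: str              # "STRONG" is from the text; "ROOT" is a placeholder name
    morphemes: Tuple[str, ...]   # underlying abstract morphemes


# Hypothetical toy lexicon. Only surface letter strings (morphs) appear as keys.
LEXICON = {
    "walk": Entry("ROOT", ("walk",)),
    "go": Entry("ROOT", ("go",)),
    "boy": Entry("ROOT", ("boy",)),
    # Unsegmentable forms are listed whole, each standing for two morphemes.
    "went": Entry("STRONG", ("go", "PAST")),
    # Assumption: irregular plurals are tagged STRONG as well.
    "children": Entry("STRONG", ("child", "PLURAL")),
    "teeth": Entry("STRONG", ("tooth", "PLURAL")),
    "thieves": Entry("STRONG", ("thief", "PLURAL")),
    "alumni": Entry("STRONG", ("alumnus", "PLURAL")),
}

# Regular, productive suffix morphs and the morphemes they realize.
SUFFIXES = {"ed": "PAST", "s": "PLURAL"}


def analyze(word: str) -> Optional[Tuple[str, ...]]:
    """Return the underlying morphemes of a word, or None if analysis fails."""
    word = word.lower()

    # 1. Whole-word lookup catches roots and irregular (STRONG) forms.
    entry = LEXICON.get(word)
    if entry is not None:
        return entry.morphemes

    # 2. Otherwise try to segment the word into a known root plus a suffix.
    for suffix, morpheme in SUFFIXES.items():
        if word.endswith(suffix):
            root = LEXICON.get(word[: -len(suffix)])
            if root is not None and root.morph_type == "ROOT":
                return root.morphemes + (morpheme,)

    return None
```

Under these assumptions, analyze("walked") recovers the morphemes walk and PAST by segmenting the letter string into the morphs walk and ed, whereas analyze("went") succeeds only because went is listed whole in the lexicon.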