Finnish Noun Inflection
This exercise is a simplified version of the Finnish Noun Inflection
problem on the Beesley & Karttunen book web site. The goal is to build
a lexicon that contains Finnish nouns inflected for number and case.
To keep the problem reasonably small, we consider only three forms:
Singular Nominative, Singular Partitive and Plural Partitive. For the
same reason, we consider only two types of nouns: monosyllabic and some
bisyllabic stems. Even with these limitations, the problem is
challenging because the shape of the endings depends on the shape of
the stem, and the stems are subject to several regular alternations.
This exercise assumes that the student is familiar with xfst replace
rules and the syntax of lexc source files.
The Facts
A Finnish noun begins with a stem. In all of the cases below, assume
that the stem is identical with the nominative singular. A plural
marker, if any, immediately follows the stem. After the stem and the
possible plural marker comes one of several possible case endings. We
consider only two cases: Nominative and Partitive. Singular nominative
has no case marker. For the purpose of this exercise, we assume that
the plural and the Partitive endings are:
Table 1: Plural and Partitive Markers
| I  | Plural marker for cases other than Nominative, realized as i or j |
| Ta | Partitive marker, realized as ta or a in words containing only back vowels |
The following table illustrates some of the possible combinations with
the noun valo 'light'.
Table 3: Examples illustrating morphotactics with valo 'light'.
| valo   | Nominative Singular. No case ending. |
| valoa  | Partitive Singular. The partitive marker Ta is realized here as a. |
| valoja | Partitive Plural. The plural marker I is realized here as j. The partitive marker Ta is realized here as a. |
The realization of the T in the Partitive marker depends on the
syllable structure of the word. After a monosyllabic stem such as
maa 'earth', the T is realized as t, as in maata. After a bisyllabic
stem such as valo 'light', the T disappears, as in valoa.
The plural marker I is realized as j between two vowels; otherwise it
is realized as i.
With this information you should be able to write the rules that
correctly realize the plural marker and the partitive suffix in all
environments.
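As a starting point, the two suffix facts above might be written as the
following replace rules. This is only a sketch under stated assumptions,
not the reference solution: the names V, C, PartT, DropT, PluralJ and
PluralI are hypothetical, and C lists only the consonants that occur in
the words of this exercise.

```xfst
# Hypothetical helper sets; only the letters needed for this exercise.
define V [ a | e | i | o | u ] ;
define C [ h | j | k | l | m | n | p | s | t | v ] ;

# T surfaces as t after a monosyllabic stem (optionally followed by I) ...
define PartT [ T -> t || .#. C* V+ (I) _ ] ;
# ... and any T that is still left is deleted.
define DropT [ T -> 0 ] ;

# I surfaces as j between two vowels, otherwise as i.
define PluralJ [ I -> j || V _ V ] ;
define PluralI [ I -> i ] ;

# Suffix rules only; the stem alternations are not handled here.
read regex [ PartT .o. DropT .o. PluralJ .o. PluralI ] ;
apply down valoTa
apply down valoITa
apply down maaTa
```

If the rules behave as intended, the three apply down calls should
produce valoa, valoja and maata, the forms given in the text above.
Deleting T before realizing I matters: only after the T is gone does
the I of valoITa stand between two vowels.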
The remaining problem is that the stem of the noun also undergoes
alternations. Consonant Gradation is one that we have already seen and
solved. It does not apply in the forms we are considering. We will use
only back-vowel stems in order to avoid having to deal with vowel
harmony. The other stem alternations, Vowel Rounding, Vowel Lowering,
Vowel Dropping, and Vowel Shortening, are illustrated in Table 4 below.
Table 4: Stem and suffix alternations
| Nom Sg | Gloss    | Part Sg | Part Pl  |
|--------|----------|---------|----------|
| puu    | tree     | puuta   | puita    |
| maa    | earth    | maata   | maita    |
| suo    | swamp    | suota   | soita    |
|        |          |         |          |
| tikka  | dart     | tikkaa  | tikkoja  |
| pappi  | priest   | pappia  | pappeja  |
| kukka  | flower   | kukkaa  | kukkia   |
| tutti  | pacifier | tuttia  | tutteja  |
| kauppa | shop     | kauppaa | kauppoja |
| kuoppa | hole     | kuoppaa | kuoppia  |
|        |          |         |          |
| jalka  | foot     | jalkaa  | jalkoja  |
| linko  | sling    | linkoa  | linkoja  |
|        |          |         |          |
| sopu   | harmony  | sopua   | sopuja   |
| kampa  | comb     | kampaa  | kampoja  |
| piispa | bishop   | piispaa | piispoja |
|        |          |         |          |
| vahti  | guard    | vahtia  | vahteja  |
| ilta   | evening  | iltaa   | iltoja   |
| sota   | war      | sotaa   | sotia    |
Vowel Rounding
Short a is rounded to an o in front of the plural marker I.
Examples: tikkaa Partitive Singular, tikkoja Partitive Plural,
kamman Genitive Singular, kampojen Genitive Plural, kauppaa Partitive
Singular, kauppoja Partitive Plural. This does not happen if the vowel
nucleus of the preceding syllable starts with a rounded vowel (o or u).
See the rule for Vowel Dropping.
Vowel Lowering
Short i is lowered to e in front of the plural marker I.
Examples: vahtia Partitive Singular, vahteja Partitive Plural,
pappia Partitive Singular, pappeja Partitive Plural. See the rule for
Vowel Dropping.
Vowel Dropping
A short a is deleted in front of the plural marker I if the nucleus of
the preceding syllable consists of, or begins with, a rounded vowel
(u or o). Note the different behavior of kuoppa, where the a is
dropped, and kauppa, where the a is rounded to o in the plural.
Examples: sotaa Partitive Singular, sotia Partitive Plural,
kuoppaa Partitive Singular, kuoppia Partitive Plural.
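Because Vowel Dropping and Vowel Rounding compete for the same short a,
one way to handle them is to let Dropping apply first and let Rounding
take whatever a is left. The sketch below uses hypothetical names (V, C,
Dropping, Rounding) and a consonant list limited to this exercise; it is
one possible formulation rather than the intended answer.

```xfst
# Hypothetical helper sets; only the letters needed for this exercise.
define V [ a | e | i | o | u ] ;
define C [ h | j | k | l | m | n | p | s | t | v ] ;

# Dropping: short a is deleted before I when the nucleus of the
# preceding syllable is, or begins with, o or u (sota, kuoppa, kukka).
# The rounded vowel must follow a consonant or the word boundary,
# so the u of the nucleus au in kauppa does not count.
define Dropping [ a -> 0 || [ .#. | C ] [ o | u ] (V) C+ _ I ] ;

# Rounding: a short a that is still present before I becomes o.
# The consonant on the left keeps the long aa of maa out of the rule.
define Rounding [ a -> o || C _ I ] ;

read regex [ Dropping .o. Rounding ] ;
apply down kuoppaITa
apply down kauppaITa
```

Requiring a consonant or the word boundary before the rounded vowel is
what separates kuoppa (nucleus uo, a dropped) from kauppa (nucleus au,
a rounded). The outputs still contain the abstract I and T, so the
expected intermediate forms would be kuoppITa and kauppoITa.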
Vowel Shortening
In front of the plural marker I, the long vowels aa, ee, ii, oo, uu
are shortened to a, e, i, o, u, respectively. The diphthongs uo and ie
shorten to o and e, respectively.
Examples: puuta Partitive Singular, puita Partitive Plural,
suota Partitive Singular, soita Partitive Plural.
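The shortening facts fit naturally into a single parallel replace rule
in which all seven replacements share the right context _ I. The rule
name Shortening is hypothetical, and the sketch covers only this rule.

```xfst
# Shortening of long vowels and of the diphthongs uo and ie before I.
# The parallel replacements share the single right context _ I.
define Shortening [ [a a] -> a , [e e] -> e , [i i] -> i ,
                    [o o] -> o , [u u] -> u ,
                    [u o] -> o , [i e] -> e || _ I ] ;

read regex Shortening ;
apply down puuITa
apply down suoITa
apply down puuTa
```

The first two lookups should give the intermediate forms puITa and
soITa, while puuTa should pass through unchanged because no I follows
the stem. Where this rule sits in the cascade matters: once maa has
been shortened to ma before I, its a looks like a short a, so any rule
for short a before I should either run earlier or check more left
context.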
The Task
Your task is to write an xfst script that reads the finnish.lexc source
file provided below, which includes the words mentioned above and
assembles them into morphotactically correct underlying Finnish forms.
We use the tags +Sg, +Pl, +Nom, +Part for marking number and case on
the lexical side. First, compile finnish.lexc using the command
read lexc in xfst. At this point your network should contain pairs
such as
kukka+Sg+Part    jalka+Pl+Part    suo+Pl+Part
kukka Ta         jalka I Ta       suo I Ta
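A quick way to check this step interactively, assuming finnish.lexc is
in the current directory, is sketched below.

```xfst
# Compile the lexc source and push the resulting network on the stack.
read lexc < finnish.lexc

# Look up two lexical forms to see their underlying lower-side strings.
apply down kukka+Sg+Part
apply down jalka+Pl+Part

# List every underlying form in the network.
print lower-words
```

With the lexicon given below, the two lookups should return kukkaTa and
jalkaITa, that is, the stem followed by the abstract markers I and Ta.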
Secondly, write replace rules that realize the plural marker and the
partitive ending and that implement the four stem alternations sketched
above. Combine the suffix realization rules with the stem alternation
rules. Think about how to order the rules. It matters.
Finally, create an xfst script that reads in the lexicon, compiles the
rules, and composes the lexicon with the rules, leaving the result on
the stack. The final result should contain pairs such as
kukka+Sg+Part    jalka+Pl+Part    suo+Pl+Part
kukka a          jalko j a        so i ta
etc.
Verify that the lower side of the resulting network contains the
properly inflected surface forms by terminating the script with the
command
print lower-words
Here is the lexicon:
# -*- coding: iso-8859-1 -*-
# finnish.lexc
Multichar_Symbols
+Sg +Pl +Nom +Part

LEXICON Root
        Nouns;

LEXICON Nouns
puu     Number; ! tree
maa     Number; ! earth
suo     Number; ! swamp
tikka   Number; ! woodpecker
pappi   Number; ! priest
kukka   Number; ! flower
tutti   Number; ! pacifier
kauppa  Number; ! shop
kuoppa  Number; ! hole
jalka   Number; ! foot
linko   Number; ! sling
sopu    Number; ! harmony
kampa   Number; ! comb
piispa  Number; ! bishop
vahti   Number; ! guard
ilta    Number; ! evening
sota    Number; ! war
LEXICON Number
+Sg+Nom:0   #;
+Sg:0       Case;
+Pl:I       Case;

LEXICON Case
+Part:Ta    #;
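
Putting the task steps together, the final script might look like the
sketch below. It is one possible answer under stated assumptions, not
the official solution: all rule names and the helper sets V and C are
hypothetical, C covers only the consonants in finnish.lexc, and the
rule order shown is just one order that seems to work for this word
list.

```xfst
# Hypothetical helper sets, limited to the letters used in finnish.lexc.
define V [ a | e | i | o | u ] ;
define C [ h | j | k | l | m | n | p | s | t | v ] ;

# Stem alternations, all triggered by a following plural marker I.
define Dropping   [ a -> 0 || [ .#. | C ] [ o | u ] (V) C+ _ I ] ;
define Rounding   [ a -> o || C _ I ] ;
define Lowering   [ i -> e || C _ I ] ;
define Shortening [ [a a] -> a , [e e] -> e , [i i] -> i ,
                    [o o] -> o , [u u] -> u ,
                    [u o] -> o , [i e] -> e || _ I ] ;

# Suffix realization.
define PartT   [ T -> t || .#. C* V+ (I) _ ] ;
define DropT   [ T -> 0 ] ;
define PluralJ [ I -> j || V _ V ] ;
define PluralI [ I -> i ] ;

# Assumed order: Dropping before Rounding, both before Shortening,
# and T realized before I so that I can end up between two vowels.
define Rules [ Dropping .o. Rounding .o. Lowering .o. Shortening
               .o. PartT .o. DropT .o. PluralJ .o. PluralI ] ;

# Compile the lexicon and give the network on the stack a name.
read lexc < finnish.lexc
define Lexicon

# Compose lexicon and rules; the result stays on the stack.
read regex [ Lexicon .o. Rules ] ;
print lower-words
```

The printed list can then be compared against the Nom Sg, Part Sg and
Part Pl columns of Table 4. Moving Shortening ahead of Rounding in this
sketch would let the a of shortened maa be rounded, which is one way to
see why the order matters.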