The lookup Utility and Lookup-Strategy Scripts
Kenneth R. Beesley
Xerox Research Centre Europe
6, chemin de Maupertuis
38240 Meylan, France
Ken.Beesley@xrce.xerox.com
October 11, 2004
1 Introduction
The Xerox lookup command-line utility is pretty well documented in the book
(pp. 431-439). This handout provides a quick review and some mind-tuning.
2 Capabilities
In its simplest mode of operation, the lookup utility applies a morphological
analyzer transducer (previously compiled using lexc and/or xfst) to an input file
(which should be a tokenized file, with one word per line) and produces an output
file containing morphological analyses. Various flags control the format of the
output file, and lookup can also be directed by a lookup strategy script, which
will be described below.
3 Input
The input to lookup is a tokenized file, i.e. a file with one token per line. If you
already have a tokenized file, perhaps with a name like mywordlist.txt, you
can "pipe" it directly to lookup:
cat mywordlist.txt | lookup myanalyzerfst | ...
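Because lookup expects exactly one token per line, it can be worth checking a word list before piping it in. The following is a minimal sketch using standard awk; the function name count_multitoken_lines is a hypothetical helper, not part of the Xerox tools, and it assumes that tokens never contain internal whitespace.

```shell
# Hypothetical helper (not part of the Xerox tools): print the number of
# lines in a file that contain more than one whitespace-separated field.
# Assumption: a valid token never contains internal spaces or tabs.
count_multitoken_lines() {
    awk 'NF > 1 { n++ } END { print n + 0 }' "$1"
}

# Usage: a result of 0 suggests the file has at most one token per line.
# count_multitoken_lines mywordlist.txt
```

A nonzero count points to lines that probably need further tokenization before the file is passed to lookup.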