Word Sense Disambiguation: Combining Knowledge Sources for Sense Resolution

The word 'bat' can denote a nocturnal animal, a piece of sports apparatus, the blink of an eye, and more. Humans seem to select the appropriate meaning effortlessly when they hear such an ambiguous word. Computer applications, however, notoriously fail more often than they succeed at this task, which is known as Word Sense Disambiguation (WSD).

Increasingly familiar examples are found in keyword searches over the internet and in translation between human languages. However, the problem of WSD is not new. It has been intensively studied over the last decade in academic papers and meetings, but no book has been devoted to the subject until this one.

This book provides descriptions of novel research as well as an overview of the field of WSD, with accounts of previous approaches and methodological issues. The work described here is in fact closely related to the field of lexicography, the process of creating dictionaries.

The author presents a description and an evaluation of a practical computer system that has been found to produce extremely accurate word-disambiguation decisions in English. Among the information used by this system are the grammatical behaviors of words, the topics of the texts in which they are used, and definitions found in the dictionary. A central thesis of this book is that, while the combination of these knowledge sources is more effective than any one used alone, these sources are, to some degree, independent.
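
As a rough illustration of what combining such knowledge sources can look like, the following Python sketch filters the senses of 'bat' by part of speech, then scores the survivors by word overlap between the context and each dictionary definition, with an optional vote from the text's topic. It is not the system described in the book: the toy lexicon, the `Sense` class, the `disambiguate` function, the stopword list, and the additive scoring are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    label: str        # hypothetical sense identifier
    pos: str          # part of speech, e.g. "noun" or "verb"
    definition: str   # dictionary-style gloss
    topic: str        # coarse subject label


# Toy lexicon invented for illustration; not taken from any real dictionary.
SENSES = [
    Sense("bat/animal", "noun",
          "a small nocturnal flying animal that lives in a cave", "zoology"),
    Sense("bat/sports", "noun",
          "a wooden stick used to hit the ball in cricket or baseball", "sport"),
    Sense("bat/blink", "verb",
          "to blink or flutter an eye quickly", "body"),
]

STOPWORDS = {"a", "an", "the", "to", "in", "or", "of", "that", "and"}


def content_words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(context, pos, topic=None, senses=SENSES):
    """Pick a sense by combining three weak knowledge sources.

    1. Part of speech filters out grammatically impossible senses.
    2. Definition overlap counts words shared by context and gloss.
    3. Topic adds one point when the text's topic matches the sense's.
    """
    # Fall back to all senses if the filter would leave nothing.
    candidates = [s for s in senses if s.pos == pos] or list(senses)
    ctx = content_words(context)

    def score(sense):
        overlap = len(ctx & content_words(sense.definition))
        topic_vote = 1 if topic is not None and sense.topic == topic else 0
        return overlap + topic_vote

    # On ties, max() keeps the first candidate in lexicon order.
    return max(candidates, key=score)


if __name__ == "__main__":
    print(disambiguate("He swung the bat and hit the ball", "noun").label)
    print(disambiguate("She did not bat an eye", "verb").label)
    print(disambiguate("The bat was there", "noun", topic="zoology").label)
```

In this sketch each source can settle a case the others cannot: part of speech alone resolves the verb reading, definition overlap separates the two noun readings when the context mentions a ball, and the topic vote breaks the tie when the context shares no words with either definition.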

Foreword xi
Preface xv
1 Introduction 1

1.1 The Problem of Word Sense Disambiguation 1
1.2 The Usefulness of WSD 2
1.3 Levels of WSD Tasks 4
1.4 A Guide for the Reader 6

2 Background 9

2.1 Introduction 9
2.2 Early Doubts 9
2.3 Early Approaches 10
2.4 Dictionary Based Approaches 12

2.4.1 Dictionary Definition Overlap 12
2.4.2 Combining Knowledge 14

2.5 Data Based Approaches 16

2.5.1 Word Frequency and Polysemy 16
2.5.2 Supervised and Unsupervised Machine Learning 16
2.5.3 Learning the Best Disambiguation Cue 17
2.5.4 Tagging with Thesaurus Categories 18
2.5.5 Two Claims about Sense Distribution 19
2.5.6 Clustering Usages 20
2.5.7 Exemplar-Based Learning 21

2.6 A Taxonomy of WSD Algorithms 23
2.7 Comparing Approaches 24
2.8 Conclusion 26

3 Meaning and the Lexicon 29

3.1 Introduction 29
3.2 Levels of Meaning Distinction 29
3.3 The Development of Lexicons 31

3.3.1 Johnson and the first Dictionaries 31
3.3.2 Modern Dictionaries 32
3.3.3 Machine Readable Dictionaries 33
3.3.4 Desirable Properties 34

3.4 Some NLP Lexicons 35

3.4.1 LDOCE 35
3.4.2 Roget's Thesaurus 38
3.4.3 WordNet 39
3.4.4 Comparing Lexicons 40

3.5 Meaning and Synonymy 43
3.6 The Lexicographic Process 45
3.7 Examining the Dictionary 48

3.7.1 Analysing the Lexicographic Process 48
3.7.2 Expert and Non-expert Taggers 50
3.7.3 Similarities between Lexicography and Tagging 50

3.8 Could the Lexicographic Process be Automated? 51
3.9 Criticisms of the Dictionary as a Resource 52

3.9.1 The Bank Model 53
3.9.2 The Sense Enumerated Lexicon 56

3.9.2.1 Limitations of SEL 56
3.9.2.2 An Alternative: The Generative Lexicon 59

3.9.3 Abstract and Concrete Lexicons 60

3.10 The Value of the MRD Program 62
3.11 Adapting the Lexicon 63

3.11.1 Clustering 63
3.11.2 Lexical Tuning 64
3.11.3 Lexical Mappings 66

3.12 Summary 67

4 A Framework for Disambiguation 69

4.1 Knowledge Sources for WSD 69
4.2 Relation to WSD 70

4.2.1 Relation between Knowledge Sources 71

4.3 Combining Weak Knowledge Sources 71
4.4 The Need for Structure 72
4.5 A Framework for WSD 73

4.5.1 Lexical Resources 73
4.5.2 Disambiguation Modules 74
4.5.3 Combining Disambiguation Modules 77
4.5.4 Generality of the Framework Across Tasks 78

4.6 Case Studies 78

4.6.1 LDOCE 78
4.6.2 WordNet 80
4.6.3 Summary of Case Studies 81

4.7 Summary 82

5 Part of Speech and Sense Tagging 83

5.1 Introduction 83
5.2 Part of Speech Tagging and WSD 83
5.3 Experiments using Part of Speech 85

5.3.1 Preliminary Investigation 85
5.3.2 Using a Tagger: A Practical Experiment 87

5.4 Usefulness of these Results 90
5.5 Syntactic Tagging Accuracy 91
5.6 Mapping Part of Speech Tag Sets 92
5.7 Conclusion 93

6 Implementation 95

6.1 Introduction 95
6.2 A Note on Evaluation 95
6.3 Lexical Resource 97
6.4 Overview of Implemented System 97
6.5 Preprocessing 97

6.5.1 Named Entity Identification 99
6.5.2 Shallow Syntactic Analysis 100
6.5.3 Evaluation of the Heuristic Parser 102
6.5.4 Lexical Lookup 105

6.6 The Disambiguation Modules 105

6.6.1 Part of Speech Filter 105
6.6.2 Simulated Annealing 106
6.6.3 Broad Context 107
6.6.4 Selectional Restrictions 108
6.6.5 Collocation Extraction 114

6.7 Combining Modules 114
6.8 Summary 118

7 Sense Tagged Corpora 121

7.1 Evaluating WSD Algorithms 121
7.2 Previous Evaluation Regimes 123

7.2.1 Annotated Corpora 123
7.2.2 The Cost of Corpus Annotation 125
7.2.3 Artificial Corpora 126

7.3 Evaluation Strategy 127
7.4 Mapping Lexical Resources 130
7.5 A Mapping between LDOCE and WordNet 133
7.6 Producing an Evaluation Corpus 136
7.7 Evaluation Corpus Properties 137
7.8 Conclusion 140

8 Evaluation 143

8.1 Introduction 143
8.2 Experimental Setting 143
8.3 Experiment 1: General Evaluation 145

8.3.1 Partial Taggers 147
8.3.2 Part of Speech Filter 149

8.4 Experiment 2: Comparative Analysis 149
8.5 Conclusion 153

9 Conclusion 155

9.1 Future Research 156

References 159
Index 173

7/1/2002

ISBN (Paperback): 1575863901 (9781575863900)
ISBN (Cloth): 1575863898 (9781575863894)
