CSLI Publications

Putting Linguistics into Speech Recognition

Manny Rayner, Beth Ann Hockey, and Pierrette Bouillon

High-performance spoken dialogue interfaces typically use a spoken command grammar, which defines what the user can say when talking to the system. For complex applications, implementation and maintenance of this grammar is a major task requiring specialized expertise in computational linguistics and software engineering. The resulting grammars are difficult to maintain, and do not port easily to new domains.

Regulus is an Open Source toolkit for construction of spoken command grammars, which has been developed since 2001 by a consortium whose main partners have been NASA Ames, the University of Geneva, and Fluency Voice Technology. Grammar development with Regulus is carried out using example-based methods and reusable grammar resources, which reduces the level of expertise needed and makes the process more automated. The Regulus approach is effective for building command grammars even at initial stages of a project when there may be little or no domain data available. Regulus has been used to build command grammars for several major projects. Among these are NASA's Clarissa, which in 2005 became the first spoken dialogue system to be deployed in space, and MedSLT, an Open Source medical speech translator developed at Geneva University.

This book presents a complete description of both the practical and theoretical aspects of Regulus, including several example applications which can be downloaded from the companion website.

Manny Rayner is a research scientist at NASA Ames Research Center, California, and Geneva University, Switzerland. Beth Ann Hockey is a research scientist at Ames Research Center. Pierrette Bouillon is project lead on the MedSLT medical spoken language translation project at Geneva University.

Contents

  • Foreword xi
  • 1 Introduction 1
    • 1.1 What this book is about 1
    • 1.2 Speech recognition and language models 5
    • 1.3 What Regulus does 13
    • 1.4 Clarissa and MedSLT 15
    • 1.5 Related work 20
    • 1.6 Plan of the book 20
    • 1.7 Summary 21

    I Using Regulus 23

    • 2 Getting started 25
      • 2.1 Getting set up 25
      • 2.2 A toy grammar in GSL 28
      • 2.3 Rewriting Toy0 in Regulus 32
      • 2.4 Regulus configuration files 37
      • 2.5 Using Regulus 39
      • 2.6 Summary 40
    • 3 Simple applications 43
      • 3.1 Introduction 43
      • 3.2 The Regulus Speech Server 44
      • 3.3 A toy dialogue system in Prolog 46
      • 3.4 A toy speech translation system in Prolog 50
      • 3.5 A toy dialogue system in Java 53
      • 3.6 Summary 62
    • 4 Developing grammars 65
      • 4.1 Introduction 65
      • 4.2 Using the Regulus development environment 65
      • 4.3 The Toy1 example grammar 67
      • 4.4 Unification 77
      • 4.5 Macros 81
      • 4.6 Compiling the Toy1 recogniser 85
      • 4.7 Systematic testing of recognisers 87
      • 4.8 Summary 89
    • 5 A spoken dialogue system 93
      • 5.1 Introduction 93
      • 5.2 The Toy1 spoken dialogue system 95
      • 5.3 The input manager 102
      • 5.4 The dialogue manager 104
      • 5.5 The output manager 108
      • 5.6 Integrating dialogue management with recognition 108
      • 5.7 Dealing with ellipsis and corrections 112
      • 5.8 Summary 117
    • 6 A speech translation system 119
      • 6.1 Introduction 119
      • 6.2 Transfer-based systems 120
      • 6.3 Developing translation applications 127
      • 6.4 Translation through interlingua 132
      • 6.5 Translation of ellipsis 134
      • 6.6 Systematic development 138
      • 6.7 Integrating translation with recognition 142
      • 6.8 Summary 145
    • 7 Using grammar specialisation 149
      • 7.1 Overview 149
      • 7.2 Using the general English grammar 150
      • 7.3 The training corpus 154
      • 7.4 Adding lexical entries 156
      • 7.5 General grammar semantics 166
      • 7.6 Multiple top-level specialised grammars 169
      • 7.7 Including lexicon entries directly 169
      • 7.8 Dealing with ambiguity 171
      • 7.9 Making compilation more efficient 171
      • 7.10 Using probabilistic tuning 172
      • 7.11 Summary 173

    II How Regulus Works 175

    • 8 Compiling feature grammars into CFG 177
      • 8.1 Introduction 177
      • 8.2 Exhaustive expansion 178
      • 8.3 Filtering 179
      • 8.4 Efficient filtering of CFGs 182
      • 8.5 Interleaving expansion and filtering 186
      • 8.6 Pre-processing of feature grammars 195
      • 8.7 Transforming the output CFG 199
      • 8.8 Semantics 203
      • 8.9 Summary 203
    • 9 A general English feature grammar for speech 205
      • 9.1 Introduction 205
      • 9.2 What makes speech grammars special 206
      • 9.3 English grammar: basic intuitions 206
      • 9.4 Compositional semantics 209
      • 9.5 Noun phrases 211
      • 9.6 Verb phrases and basic clauses 214
      • 9.7 Adjuncts 228
      • 9.8 Coordination 229
      • 9.9 Feature defaults 230
      • 9.10 Summary 231
    • 10 Grammar specialisation using Explanation Based Learning 233
      • 10.1 Explanation Based Learning 233
      • 10.2 Defining cutting-up criteria 244
      • 10.3 Different kinds of cutting-up criteria 246
      • 10.4 Summary 251
    • 11 Performance of grammar-based recognisers 255
      • 11.1 Introduction 255
      • 11.2 Varying vocabulary size 256
      • 11.3 Varying linguistic coverage
      • 11.4 Varying the feature set 261
      • 11.5 Varying the cutting-up criteria 263
      • 11.6 Comparing CFG and PCFG language models 266
      • 11.7 Deriving recognisers from general grammars 267
      • 11.8 Summary 268
    • 12 Comparison of rule-based and robust approaches 271
      • 12.1 Introduction 271
      • 12.2 Methodological issues 272
      • 12.3 Experiments on MedSLT 279
      • 12.4 Experiments on Clarissa 281
      • 12.5 Discussion 282
      • 12.6 Summary 286
    • 13 Summary and future directions 289
      • 13.1 Summary 289
      • 13.2 Future directions 291
  • Appendix: Online Documentation 293
  • References 295
  • Index 301

4/1/2006

ISBN (Paperback): 1575865262 (9781575865263)
ISBN (Cloth): 1575865254 (9781575865256)
ISBN (Electronic): 1575868555 (9781575868554)


Distributed by the University of Chicago Press

pubs @ csli.stanford.edu