COMPUTATIONAL MODELS OF TEXTUAL INFERENCE

Bill MacCartney

Sponsored by the Stanford Humanities Center/Mellon Foundation
Graduate Research Program



In the last few years, the NLP community has seen a surge in work aiming to provide robust, arbitrary-domain textual inference -- that is, the ability to determine whether one piece of text can reasonably be inferred from another. Substantial progress on this task is key to many natural language applications, such as question answering and semantic retrieval. The PASCAL Recognizing Textual Entailment (RTE) Challenge is one attempt at a concrete formulation of the problem, with examples such as:

  • Text: Wal-Mart defended itself in court today against claims that its female employees were kept out of jobs in management because they are women.
  • Hypothesis: Wal-Mart was sued for sexual discrimination.
  • Answer: entailed


In this talk I will first discuss high-level characteristics of the RTE data sets: what kinds of inferences are emphasized, what kinds are not, and whether the RTE problem formulation is appropriate to the broader goal of developing useful textual inference systems.

Next, I'll discuss various computational approaches to the textual inference problem, including semantic overlap models, logical approaches, and graph-matching techniques. I'll argue that, while graph-matching is the most promising of these avenues, it suffers from significant shortcomings, including flawed assumptions of locality and monotonicity.

Finally, I'll describe efforts underway in the Stanford NLP group to build a system which remedies these weaknesses. Our system differs from other graph-matching approaches by separating the problem of finding a good graph alignment from the problem of assessing inferential strength. We use a pipelined approach where alignment is followed by a classification step, in which we extract features representing high-level characteristics of the entailment problem, and pass the resulting feature vector to a statistical classifier trained on development data. I'll present results on recent RTE data sets, and highlight some challenges for the future.
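To make the two-stage idea concrete, here is a minimal toy sketch in Python of a pipeline in which alignment and entailment classification are separate steps. It is not the Stanford system: the exact-match aligner, the two features (`aligned_fraction`, `polarity_mismatch`), the `NEGATIONS` word list, and the hand-set `WEIGHTS` are all assumptions made for illustration. In the approach described above, the alignment would be over graphs and the classifier weights would be learned from development data.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical negation markers, chosen only for this illustration.
NEGATIONS = {"not", "no", "never"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


@dataclass
class Alignment:
    # One (hypothesis_token, text_token_or_None) pair per hypothesis token.
    pairs: list


def align(text, hypothesis):
    """Stage 1 (toy): align each hypothesis token to an identical text token.

    A real system would align graph nodes and score lexical similarity;
    exact string match is a stand-in assumption.
    """
    text_tokens = set(tokenize(text))
    return Alignment(
        [(h, h if h in text_tokens else None) for h in tokenize(hypothesis)]
    )


def extract_features(text, hypothesis, alignment):
    """Stage 2a: compute global features of the aligned pair."""
    n = len(alignment.pairs)
    aligned = sum(1 for _, t in alignment.pairs if t is not None)
    text_negated = any(tok in NEGATIONS for tok in tokenize(text))
    hyp_negated = any(tok in NEGATIONS for tok in tokenize(hypothesis))
    return {
        "bias": 1.0,
        "aligned_fraction": aligned / n if n else 0.0,
        "polarity_mismatch": 1.0 if text_negated != hyp_negated else 0.0,
    }


# Placeholder weights set by hand; a trained classifier would learn these.
WEIGHTS = {"bias": -2.0, "aligned_fraction": 4.0, "polarity_mismatch": -3.0}


def entailment_probability(features, weights=WEIGHTS):
    """Stage 2b: logistic model over the feature vector."""
    z = sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def predict(text, hypothesis):
    """Run alignment, then feature extraction, then classification."""
    alignment = align(text, hypothesis)
    return entailment_probability(extract_features(text, hypothesis, alignment))


if __name__ == "__main__":
    hypothesis = "A man is sleeping"
    for text in ("A man is sleeping", "A man is not sleeping"):
        print(f"{text!r} -> {predict(text, hypothesis):.2f}")
```

In this sketch, both texts align perfectly with the hypothesis, so a score based on alignment quality alone could not tell them apart. The classification step can still separate them because it sees a feature of the whole problem (here, a polarity mismatch) rather than only local matches.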


References:

Bill MacCartney, Trond Grenager, Marie-Catherine de Marneffe, Daniel Cer, and Christopher D. Manning. Learning to recognize features of valid textual entailments. To appear at NAACL-06.