CSLI Publications
The Structure of Scientific Articles

Applications to Citation Indexing and Summarization

Simone Teufel

Finding a particular scientific document amidst a sea of thousands of other documents can often seem like an insurmountable task. The Structure of Scientific Articles shows how linguistic theory can provide a solution by analyzing rhetorical structures to make information retrieval easier and faster.

Through the use of an improved citation indexing system, this indispensable volume applies empirical discourse studies to pressing issues of document management, including attribution, the author's stance towards other work, and problem-solving processes.

Simone Teufel is a senior lecturer in the Natural Language and Information Processing Group at the University of Cambridge Computer Laboratory.

Contents

  • 1 Introduction
    • 1.1 Text Understanding and Information Management
    • 1.2 Discourse Structure and Scientific Argument
    • 1.3 Outline of this Book

  • 2 Information Retrieval and Citation Indexes
    • 2.1 Information Needs in Science
    • 2.2 Keyword-Based Search
      • 2.2.1 Information Retrieval Methods
      • 2.2.2 Evaluation of Information Retrieval Systems
    • 2.3 Citation-Based Search
      • 2.3.1 The Citation System and Bibliometry
      • 2.3.2 Citation Indexes and Search

  • 3 Summarisation
    • 3.1 Human Summarisation
      • 3.1.1 Summary Journals and Professional Abstractors
      • 3.1.2 Structure in Abstracts
    • 3.2 Automatic Summarisation
      • 3.2.1 Fact Extraction Methods
      • 3.2.2 Text Extraction Methods

  • 4 New Methods for Information Access
    • 4.1 Rhetorical Extracts
    • 4.2 Citation Maps

  • 5 Experimental Corpora
    • 5.1 Computational Linguistics (CmpLG)
      • 5.1.1 Source
      • 5.1.2 Properties
      • 5.1.3 Citation behaviour
    • 5.2 Chemistry
    • 5.3 Genetics, Cardiology, Agriculture
    • 5.4 SciXML
      • 5.4.1 Description
      • 5.4.2 Transformation from Source Formats

  • 6 The Knowledge Claim Discourse Model (KCDM)
    • 6.1 Overview of the Model
    • 6.2 Level 0: Goals in Argumentation
    • 6.3 Level 1: Rhetorical Moves
    • 6.4 Level 2: Knowledge Claim Attribution
    • 6.5 Level 3: Hinging
    • 6.6 Level 4: Linearisation and Presentation
    • 6.7 Traditional Intention-Based Discourse Models

  • 7 Annotation Scheme Design
    • 7.1 Fundamental Concepts
    • 7.2 The KCA Scheme (Knowledge Claim Attribution)
    • 7.3 The CFC Scheme (Citation Function Classification)
    • 7.4 The AZ Scheme (Argumentative Zoning)
    • 7.5 Alternative Scheme Definitions

  • 8 Reliability Studies
    • 8.1 Agreement Metrics, Ceilings and Baselines
    • 8.2 Study I: Knowledge Claim Attribution (KCA)
    • 8.3 Study II: Argumentative Zoning (AZ)
    • 8.4 Study III: Argumentative Zoning, Untrained
    • 8.5 Study IV: Citation Function Classification (CFC)
    • 8.6 Post-Hoc Analyses of Study II Data

  • 9 Meta-Discourse
    • 9.1 Actions/States
    • 9.2 Agents/Entities
    • 9.3 Significance for Text Understanding
    • 9.4 Practical Issues
      • 9.4.1 Agent- and Action Recognition in Meta-Discourse
      • 9.4.2 Ambiguous Mentions of Entities
      • 9.4.3 Lexical Equivalence
    • 9.5 Use of Meta-Discourse in the Literature
    • 9.6 Cross-Discipline Differences in Meta-Discourse

  • 10 Features
    • 10.1 Entity-Based Meta-Discourse (Ent)
    • 10.2 Action-Based Meta-Discourse (Act)
    • 10.3 Formulaic Meta-Discourse (Formu, F-Strength, Formu-XXX)
    • 10.4 Scientific Attribution (SciAtt-X)
    • 10.5 Citations (Cit)
    • 10.6 Tense, Voice and Aspect (Syn)
    • 10.7 Category History (Hist)
    • 10.8 Structural Indicators (Loc, Struct)
    • 10.9 Content and Sentence Length (Cont, Len)

  • 11 Automatic AZ, KCA and CFC
    • 11.1 Feature Determination
    • 11.2 Statistical Classification

  • 12 Evaluation
    • 12.1 Intrinsic Evaluation
      • 12.1.1 Automatic AZ
      • 12.1.2 Automatic KCA
      • 12.1.3 Automatic CFC
    • 12.2 Extrinsic Evaluation (AZ)
      • 12.2.1 Experimental Design
      • 12.2.2 Results

  • 13 Applying the KCDM to Other Disciplines
    • 13.1 Application to Chemistry
      • 13.1.1 Domain Knowledge-Free Annotation
      • 13.1.2 Argumentative Zoning II (AZ-II)
    • 13.2 Variant AZ-Schemes
      • 13.2.1 For Computer Science (Feltrim et al.)
      • 13.2.2 For Biology (Mizuta and Collier)
      • 13.2.3 For Astrophysics (Merity et al.)
      • 13.2.4 For Legal Texts (Hachey and Grover)
    • 13.3 Automatic Meta-Discourse Discovery

  • 14 Outlook
    • 14.1 Support Tools for Scientific Writing
    • 14.2 Automatic Review Generation
    • 14.3 Scientific Summaries Beyond Extraction
    • 14.4 Digital Libraries and Robust AZ

  • 15 Conclusions
    • 15.1 An Interdisciplinary Project
    • 15.2 Limitations

  • A CmpLG-D Articles
  • B DTD for SciXML
  • C Guidelines
    • C.1 KCA Guidelines (1998)
    • C.2 AZ Guidelines (1998)
    • C.3 CFC Guidelines; Excerpt (2005)

  • D Lexical Resources
    • D.1 Concept Lexicon
    • D.2 Formulaic Patterns
    • D.3 Entity Patterns
    • D.4 Action Lexicon

  • References
  • Author Index
  • Index

July 2010

ISBN (Paperback): 9781575865560
ISBN (Cloth): 9781575865553
ISBN (Electronic): 9781575867328

Distributed by the University of Chicago Press

pubs @ csli.stanford.edu