Stanford University Libraries

Chemical Literature (Chem 184/284)
University of California at Santa Barbara

Lecture 1: Overview of the Organization of Information

Why a full-quarter course on chemical information?

  • Because the subject is HUGE…
  • For example: Chemical Abstracts
    • Indexes 14,000 journal titles, plus patents, conferences, reports, dissertations, etc.
    • Adds over 500,000 citations and 700,000 substance records per year.
    • From 1907 to present, over 13 million citations and 14 million substances indexed.

Information as a physical entity

  • Information can be treated like a thermodynamic system, subject to entropy (a concept applied to information by Claude Shannon).
  • The organization of raw data turns it into information—the better organized, the more value added.
  • Organization can be added at many levels…including the ultimate user.
  • End-user information processing puts the information in its final form for use. It is a new task each time: "stoichiometric," in that the effort is used up by a single use.
  • Information professionals try to create organization in ways that can be used by many people: "catalytic," in that the same effort serves again and again.
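
As a rough illustration of the entropy idea above, Shannon's measure for a source whose symbols occur with probabilities p_i is H = -Σ p_i log2(p_i), in bits per symbol. The sketch below is not part of the lecture; it assumes a "message" is simply a string of characters, and the function name is chosen here for illustration.

```python
import math
from collections import Counter

def shannon_entropy(message: str) -> float:
    """Return the Shannon entropy of a string in bits per symbol."""
    if not message:
        return 0.0
    counts = Counter(message)   # how often each symbol occurs
    total = len(message)
    # H = -sum(p * log2(p)) over the observed symbol frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

print(shannon_entropy("aaaa"))  # one repeated symbol: no uncertainty
print(shannon_entropy("abcd"))  # four equally likely symbols
```

A string made of one repeated symbol has zero entropy, while a string that uses its symbols evenly has the maximum for that alphabet.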

Types of scientific literature

  • PRIMARY: The original publication of data: journals, patents, technical reports, conferences, dissertations, some books.
  • SECONDARY: Publications which provide access to the primary literature: reviews, indexes, abstracts, data collections, etc.

Approaches to organizing the scientific literature

  • Classification and Data Collection—physically grouping related data by some common element.
  • Indexing—creating pointers to the original literature based on some piece of information in the original, e.g. author names or subject terms.

Classification & Data Collection

  • Libraries use classification schemes to group related books together for browsing by subject. In the Library of Congress system, chemistry materials fall under QD.
  • Data collections bring together information from various primary sources so that it is easier to locate, e.g. the CRC handbook series.

Indexing for Subject Access

  • Some indexes use keywords from the original; others use standard subject vocabularies.
  • In US libraries, terms are assigned from the Library of Congress Subject Headings (LCSH).
  • MEDLINE uses the Medical Subject Headings (MeSH).
  • Chemical Abstracts uses its Index Guide.
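
The difference between keyword indexing and indexing with a standard vocabulary can be sketched in a few lines. This is a toy model only: the records and the variant-to-heading table below are invented for illustration and are not actual LCSH, MeSH, or Index Guide entries.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: variant term -> preferred heading.
PREFERRED = {
    "vitamin c": "ascorbic acid",
    "ascorbate": "ascorbic acid",
    "ascorbic acid": "ascorbic acid",
}

# Hypothetical records: reference id -> terms used by the original authors.
RECORDS = {
    "ref1": ["Vitamin C"],
    "ref2": ["ascorbate"],
    "ref3": ["Ascorbic acid", "oxidation"],
}

def build_index(records, vocabulary=None):
    """Map each index term to the sorted list of references that carry it."""
    index = defaultdict(set)
    for ref_id, terms in records.items():
        for term in terms:
            key = term.lower()
            if vocabulary is not None:
                # Replace a variant with its preferred heading when one exists.
                key = vocabulary.get(key, key)
            index[key].add(ref_id)
    return {term: sorted(refs) for term, refs in index.items()}

keyword_index = build_index(RECORDS)             # authors' own words
subject_index = build_index(RECORDS, PREFERRED)  # standard headings

print(keyword_index["vitamin c"])      # only the record that used that phrase
print(subject_index["ascorbic acid"])  # all three records, collated
```

With keywords alone, a searcher must guess every variant an author might have used; the standard vocabulary collates the variants under one heading.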

Tradeoffs in information access

  • All information organization and retrieval involves tradeoffs:
    • Specificity vs. collation
    • Relevance (precision) vs. recall
    • Maximum precision vs. maximum quantity
  • Specific headings avoid the irrelevant.
  • General headings bring like items together.
  • Searching narrowly avoids having to look at irrelevant items at the cost of missing some relevant material.
  • Searching broadly helps ensure that nothing is missed, but may require later screening to eliminate irrelevancies.
  • The information professional has to decide how best to meet the needs of the intended audience with the available technology.
  • The information user has to set priorities based on the ultimate objective and the time, labor and money available for searching.
  • Together, they evolve the strategy needed for extracting needed information from the universe of scientific publication.
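
The narrow-versus-broad tradeoff above is often quantified with two ratios: precision (the fraction of retrieved items that are relevant) and recall (the fraction of relevant items that were retrieved). A minimal sketch, using made-up reference ids and assuming we somehow know the full set of relevant items, which a real searcher never does:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical "true" set of relevant references for some question.
RELEVANT = {"r1", "r2", "r3", "r4"}

# A narrow search: everything found is relevant, but half is missed.
narrow = precision_recall({"r1", "r2"}, RELEVANT)

# A broad search: nothing is missed, but half the answers need screening out.
broad = precision_recall({"r1", "r2", "r3", "r4", "x1", "x2", "x3", "x4"}, RELEVANT)

print(narrow)  # (1.0, 0.5)
print(broad)   # (0.5, 1.0)
```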

The Iterative Approach to Literature Searching

  • Comprehensive subject searches can be tough.
  • Start with what you know—a subject term, an author, a known reference…
  • Find an initial set of relevant answers.
  • Review those answers for new clues.
  • Using these new clues, repeat the cycle until you are satisfied with the results.
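
The cycle above can be sketched as a simple loop. This is a toy model under loud assumptions: each record is reduced to a set of "clues" (subject terms and author names, all invented here), and every match is accepted as relevant, whereas a real searcher would screen the answers before harvesting new clues from them.

```python
def iterative_search(records, seed_clues, max_rounds=10):
    """Grow a result set by repeatedly searching on clues found in earlier hits.

    records: dict mapping a record id to the set of clues it carries.
    """
    clues = set(seed_clues)
    found = set()
    for _ in range(max_rounds):
        # Search: every record sharing at least one clue, minus what we already have.
        new_hits = {rid for rid, rec_clues in records.items() if rec_clues & clues} - found
        if not new_hits:
            break  # nothing new turned up, so stop
        found |= new_hits
        # Review the new answers for further clues.
        for rid in new_hits:
            clues |= records[rid]
    return found

# Hypothetical records for illustration only.
TOY_RECORDS = {
    "A": {"zeolites", "Smith"},
    "B": {"Smith", "catalysis"},
    "C": {"catalysis", "Jones"},
    "D": {"polymers", "Lee"},
}

result = iterative_search(TOY_RECORDS, {"zeolites"})
print(sorted(result))  # ['A', 'B', 'C']
```

Starting from the single term "zeolites," the loop reaches record B through a shared author and record C through a shared subject term, while the unrelated record D is never retrieved.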

This page created by Chuck Huber (huber@library.ucsb.edu).