Stanford Research Communication Program
I-RITE Statement Archive

Who's Responsible for This Mess? Tracing Environmental Contamination Back to Its Source

Anna Michalak
Department of Civil and Environmental Engineering
Stanford University
June 2002


When environmental contamination is observed, there are often several potentially responsible sources. Think, for example, of being handed a creamy cup of coffee and being asked to determine into which side of the mug the cream was originally poured. This is essentially the problem that I am trying to solve, but for environmental contaminants released into soil, water or air. Unlike with the creamy coffee, it is usually possible to estimate the location or release history of a source of environmental contamination, since the contaminant is generally not completely mixed in the environment. However, because the available data are often sparse and of limited quality, and because contaminants do tend to mix in the environment after they are released, it is often impossible to identify the exact source of contamination. My work focuses on developing statistical methods that not only provide an estimate of the source (i.e. the location of the source, or the release history from a known source), but also quantify the uncertainty associated with this estimate. In other words, these methods tell you to what extent it is possible to define the source of a contaminant. Although my current work focuses on groundwater contamination, the methods are also applicable to other types of contamination.

Because of the mixing and the data quantity and quality issues that I have already mentioned, the problem of identifying the contaminant source is what is termed "ill-posed." If there are slight errors in the information about the current distribution of the contaminant (e.g. concentration measurements at individual points in a polluted lake) or in our conceptual understanding of how the contaminant moves in the environment, these errors will have a very large impact on the source estimate that best reproduces the current distribution of the contaminant. This problem is also often "underdetermined," meaning that the number of available measurements is smaller than the number of points at which we want to estimate the source. If you think back to your high-school algebra class, you may remember that if you have two unknowns, you need two equations to find a solution. In the problems that I try to solve, I typically have hundreds of unknowns, but only tens of equations.
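To make these two terms concrete, here is a minimal sketch using made-up numbers. The mixing weights, the measurement values, and the helper names `solve_2x2` and `measure` are illustrative assumptions, not part of the actual groundwater models. The first part shows how a small measurement error is amplified when two sources are strongly mixed; the second shows how a single measurement cannot separate two unknowns.

```python
# Toy illustration only: all numbers and function names are hypothetical.

def solve_2x2(h, z):
    """Solve the 2x2 linear system h * s = z for s using Cramer's rule."""
    (a, b), (c, d) = h
    det = a * d - b * c
    s1 = (z[0] * d - b * z[1]) / det
    s2 = (a * z[1] - z[0] * c) / det
    return s1, s2

def measure(weights, source):
    """One measurement: a weighted mix of the source strengths."""
    return sum(w * s for w, s in zip(weights, source))

# Ill-posed: each sampling point sees an almost equal mix of both sources.
mixing = [[0.51, 0.49],
          [0.49, 0.51]]
exact = solve_2x2(mixing, [0.51, 0.49])      # error-free data: about (1.0, 0.0)
perturbed = solve_2x2(mixing, [0.52, 0.48])  # data off by 0.01: about (1.5, -0.5)

# Underdetermined: one equation, two unknowns.
weights = [0.5, 0.5]
z_left = measure(weights, [2.0, 0.0])   # all contaminant from the first source
z_right = measure(weights, [0.0, 2.0])  # all contaminant from the second source

print(exact, perturbed)
print(z_left, z_right)  # both scenarios give the same measurement
```

In this toy case, shifting each measurement by 0.01 moves the estimated source strengths by 0.5, and the two opposite release scenarios produce the same single reading.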

The way that I deal with the ill-posed and underdetermined nature of the problem is by combining "prior" information with "likelihood." Prior information is general knowledge about how the source of the contaminant may have varied in space or time. For example, in some cases, it may be known that the contaminant release occurred in several short spurts or, conversely, as a long-term, slowly varying source. This prior information does not itself assign probabilities to individual sources, but simply introduces very general information that represents our understanding of how the contamination may have occurred. The likelihood component is where I make sure that whatever source estimate I come up with actually matches the available measurements. By combining the prior information with the likelihood, I can both estimate the contaminant source and quantify how certain this estimate is.
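The combination of prior and likelihood can be sketched with a textbook linear-Gaussian Bayesian update. This is a simplified stand-in rather than the method used in the research: it assumes a Gaussian prior with a known mean, a single linear measurement, and Gaussian measurement noise, and the function name `bayes_update` and all numbers are hypothetical. The prior covariance `q` encodes the belief that the source varies slowly, so neighboring values are correlated.

```python
# Toy illustration only: a generic linear-Gaussian update with made-up numbers.

def bayes_update(mu, q, h, z, r):
    """Posterior mean and covariance for a source s with prior N(mu, q),
    observed through one measurement z = h . s + noise, noise ~ N(0, r)."""
    n = len(mu)
    qh = [sum(q[i][j] * h[j] for j in range(n)) for i in range(n)]  # Q h
    innovation_var = sum(h[i] * qh[i] for i in range(n)) + r        # h'Qh + r
    gain = [x / innovation_var for x in qh]
    residual = z - sum(h[i] * mu[i] for i in range(n))              # data misfit of the prior mean
    mean = [mu[i] + gain[i] * residual for i in range(n)]
    cov = [[q[i][j] - gain[i] * qh[j] for j in range(n)] for i in range(n)]
    return mean, cov

prior_mean = [0.0, 0.0]
prior_cov = [[1.0, 0.8],   # correlation 0.8: a slowly varying source (assumed)
             [0.8, 1.0]]
weights = [0.5, 0.5]       # the measurement sees an equal mix of both unknowns
observed = 1.0
noise_var = 0.01

mean, cov = bayes_update(prior_mean, prior_cov, weights, observed, noise_var)
print(mean)       # best estimate of each unknown, about 0.99 each
print(cov[0][0])  # remaining variance of the first unknown, about 0.11
```

With two unknowns and only one measurement, the prior is what makes an estimate possible at all, and the posterior covariance reports how much uncertainty remains: here each variance drops from 1.0 to about 0.11 but does not reach zero.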

My methods differ from those commonly applied for contaminant source identification specifically because they provide this additional information about the uncertainty associated with estimates. This is a key issue. What most other methods focus on is finding a single estimate by trying to reproduce available observations. Think back to the coffee example that I first mentioned. If two expert witnesses were hired to determine where the cream was poured, one witness might say: "I believe the cream was poured into the left side of the mug because if I pour cream into the left side of a mug of coffee and I stir, then I end up with a creamy cup of coffee that looks exactly like the one we have here." The second expert witness could make the identical argument for the right side of the mug. But who is right? In this case, the available information, which is the creamy cup of coffee, is not sufficient to distinguish between these two scenarios. It is important to recognize this uncertainty, so that we know how much faith to put in a given estimate. If we do not quantify this uncertainty, there is no way of telling whether the available observations are sufficient to truly give us an answer as to the source of contamination.
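The two-witness argument can be put in numbers with a small sketch. Everything here is a hypothetical toy: the `predict` model of how cream spreads, the `mixing` parameter, the noise level `sigma`, and the equal prior odds are assumptions chosen for illustration. The sketch applies Bayes' rule to two candidate pour locations and reports the probability that the cream was poured on the left.

```python
import math

# Toy illustration only: the mixing model and all numbers are hypothetical.

def predict(side, mixing):
    """Predicted cream concentration at the (left, right) edges of the mug.
    mixing = 0.0 means no mixing; mixing = 0.5 means completely mixed."""
    near, far = 1.0 - mixing, mixing
    return (near, far) if side == "left" else (far, near)

def likelihood(observed, side, mixing, sigma):
    """Gaussian likelihood (up to a constant) of the observed concentrations."""
    predicted = predict(side, mixing)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.exp(-sse / (2.0 * sigma ** 2))

def posterior_left(observed, mixing, sigma=0.05):
    """Probability the cream was poured on the left, given equal prior odds."""
    left = likelihood(observed, "left", mixing, sigma)
    right = likelihood(observed, "right", mixing, sigma)
    return left / (left + right)

# Fully stirred coffee: both witnesses reproduce the observation equally well.
p_stirred = posterior_left(observed=(0.5, 0.5), mixing=0.5)

# Partially mixed coffee: the observation still carries a trace of the source.
p_partial = posterior_left(observed=(0.7, 0.3), mixing=0.3)

print(p_stirred)  # 0.5, the data cannot tell the two scenarios apart
print(p_partial)  # very close to 1.0, the data point clearly to the left
```

A method that only looks for a source that reproduces the observations would return an answer in both cases; the probability is what reveals that the answer in the fully stirred case is no better than a coin flip.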