When Do We Have Enough Information To Classify?

Maya Gupta (Google Research)
Date: Oct. 25th, 2013

Abstract

We'll look at one theoretical and one practical aspect of the question "When do we have enough information to classify a particular sample?" On the theoretical side, we show that the standard Gaussian assumption for class-conditional distributions is surprisingly bad in terms of worst-case Bayes classification error. We show how to construct bounds on the worst-case Bayes error when moments of the class-conditional distributions are known, and highlight some open questions. On the more practical side, we consider the problem of classifying a time series as soon as possible, or, equivalently, classifying a sample with as little feature computation as possible. We use generative classifiers to optimize the dual objectives of providing a class label as early as possible while also guaranteeing, with high probability, that the early label matches the label that would be assigned to the longer time series (or more complete feature set).
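
To make the theoretical question concrete, here is one standard way to pose it; the notation below (priors p_0 and p_1, densities f_0 and f_1, and the moment class F) is chosen for illustration and may differ from the talk's. For a two-class problem with class priors p_0 and p_1 and class-conditional densities f_0 and f_1, the Bayes error is

\[ E^{*} = \int \min\{\, p_0 f_0(x),\; p_1 f_1(x) \,\}\, dx. \]

If only the mean and covariance (\mu_i, \Sigma_i) of each class are known, the quantity of interest is the worst case of E^{*} over all distributions consistent with those moments:

\[ \sup_{f_0 \in \mathcal{F}(\mu_0, \Sigma_0),\; f_1 \in \mathcal{F}(\mu_1, \Sigma_1)} \int \min\{\, p_0 f_0(x),\; p_1 f_1(x) \,\}\, dx, \]

where \mathcal{F}(\mu, \Sigma) denotes the set of densities with mean \mu and covariance \Sigma. The claim above is that the Gaussian choice of f_0 and f_1 can be far from the extremal distributions achieving this supremum.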
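On the practical side, the early-stopping idea can be sketched in code. The sketch below is an illustrative construction, not the speaker's method: it assumes equal-length training series, a Gaussian naive Bayes generative model over time steps, and a hypothetical confidence parameter eps. Thresholding the instantaneous posterior, as done here, is a common heuristic; the talk concerns actually guaranteeing, with high probability, that the early label agrees with the full-length label, which requires more than this.

    import numpy as np

    # Minimal sketch of early time-series classification with a generative
    # (Gaussian naive Bayes) model. Emit a label at time t only when the
    # model's posterior confidence exceeds 1 - eps, hoping the early label
    # agrees with the label the full-length series would receive.

    class EarlyGNB:
        def __init__(self, eps=0.05):
            self.eps = eps  # tolerated posterior uncertainty at stopping time

        def fit(self, X, y):
            # X: (n_series, T) array of equal-length series; y: class labels.
            X, y = np.asarray(X), np.asarray(y)
            self.classes_ = np.unique(y)
            self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
            self.vars_ = np.array([X[y == c].var(axis=0) + 1e-6 for c in self.classes_])
            self.log_priors_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
            return self

        def _log_joint(self, x_prefix):
            # log p(class) + log p(x_1..x_t | class), using only the first t steps.
            t = len(x_prefix)
            m, v = self.means_[:, :t], self.vars_[:, :t]
            ll = -0.5 * np.sum(np.log(2 * np.pi * v) + (x_prefix - m) ** 2 / v, axis=1)
            return self.log_priors_ + ll

        def classify_early(self, x):
            # Scan left to right; stop as soon as one class has posterior
            # >= 1 - eps. Returns (label, stopping_time).
            for t in range(1, len(x) + 1):
                lj = self._log_joint(x[:t])
                post = np.exp(lj - lj.max())
                post /= post.sum()
                if post.max() >= 1 - self.eps:
                    return self.classes_[post.argmax()], t
            return self.classes_[post.argmax()], len(x)  # fall back to full series

    # Toy usage (synthetic data): class 0 drifts up, class 1 drifts down.
    rng = np.random.default_rng(0)
    T = 50
    X0 = np.linspace(0, 1, T) + 0.3 * rng.standard_normal((100, T))
    X1 = -np.linspace(0, 1, T) + 0.3 * rng.standard_normal((100, T))
    X = np.vstack([X0, X1])
    y = np.array([0] * 100 + [1] * 100)
    clf = EarlyGNB(eps=0.01).fit(X, y)
    label, t_stop = clf.classify_early(X0[0])  # typically stops well before t = T

Note that a high instantaneous posterior at time t does not by itself bound the probability that the full-length classification would differ; turning this heuristic into a guarantee is exactly the gap the abstract describes.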

Bio

Gupta manages a machine learning research group at Google Research. From 2003 to 2012, she was an Associate Professor of Electrical Engineering at the University of Washington. In 2007, Gupta received the PECASE award from President George W. Bush and the Office of Naval Research Young Investigator Program (YIP) Award. She earned her Ph.D. in Electrical Engineering from Stanford University in 2003, where she was a National Science Foundation Graduate Fellow and worked with Bob Gray, Richard Olshen, and Rob Tibshirani. Gupta has also worked for Ricoh Research, the NATO Undersea Research Centre (NURC), HP R&D, AT&T Labs, and Microsoft, and she runs Artifact Puzzles.