AUTOMATED RECOGNITION OF FACIAL EXPRESSIONS THROUGH 3-D TRACKING
IN IMAGE SEQUENCES 


The human face has attracted attention in areas such as psychology,
computer vision, and computer graphics. Many computer vision researchers
have worked on tracking and recognizing the whole face or parts of it.
However, neither face recognition nor facial expression recognition has
been fully solved yet. The early 2-D methods are computationally simple
but have had limited success, mainly because they depend on the camera
viewing angle [1][2]. One of the main motivations behind 3-D methods for
face or expression recognition is the hope of succeeding over a broader
range of camera viewing angles.

We would like to use the 3-D information extracted through 3-D face
tracking. Applications of 3-D face tracking include video conferencing,
interactive games, and model-based video compression. In [3], DeCarlo and
Metaxas use an optical-flow-based approach and achieve very accurate 3-D
face tracking with a monocular camera. In [4], Eisert and Girod show
that 3-D face tracking is well suited to video conferencing applications,
and achieve very efficient and accurate compression. In [5], Gokturk et al.
bring a data-driven approach to 3-D face tracking. A preliminary stage
learns the possible facial deformations through stereo-camera-based
tracking, replacing the hand-created models of previous work. This
approach is suitable for face recognition and facial expression
recognition mainly because of its mathematical 3-D face model. Only a few
parameters are tracked: the pose (rotation and translation) and the shape
(expression vector) of the face. The expression vector is claimed to be
suitable for expression recognition.
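
As a rough illustration of the pose and shape parameterization described
above, the sketch below builds 3-D face points from an expression vector
and a rigid pose. The linear deformation basis, the restriction to a
single rotation axis, and all names (deform, rotate_y, face_points) are
assumptions made for illustration only; they are not taken from [5].

```python
import math

def deform(mean_shape, basis, expression):
    """Non-rigid shape: mean shape plus a weighted sum of deformation modes.

    mean_shape: list of (x, y, z) points
    basis:      list of modes, each a list of (dx, dy, dz) offsets per point
    expression: one weight per mode (a hypothetical expression vector)
    """
    shape = []
    for i, (x, y, z) in enumerate(mean_shape):
        for weight, mode in zip(expression, basis):
            dx, dy, dz = mode[i]
            x, y, z = x + weight * dx, y + weight * dy, z + weight * dz
        shape.append((x, y, z))
    return shape

def rotate_y(point, angle):
    """Rotate a point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def face_points(mean_shape, basis, expression, angle, translation):
    """Apply the expression deformation first, then the rigid pose."""
    tx, ty, tz = translation
    posed = []
    for p in deform(mean_shape, basis, expression):
        x, y, z = rotate_y(p, angle)
        posed.append((x + tx, y + ty, z + tz))
    return posed
```

Under these assumptions, the expression weights are separated from the
head pose, which is why such a vector could serve as a feature for
expression recognition across viewing angles.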

A recognition system consists mainly of two components: a feature
extractor and a classifier. We propose to use feature vectors obtained
from the 3-D face tracking algorithm. As a classifier, we need a method
that automatically extracts the relevant distinguishing features.
Support Vector Machines (SVM) build on the idea of identifying and using
the data points that carry the relevant information, thus focusing on the
construction of the classifier itself. SVM, originally proposed by
Vapnik [6], constructs a classifying surface that minimizes the training
error and maximizes the generalization capability of the classifier [7].
It determines the data points (Support Vectors, SVs) that are closest to
this surface and defines the surface using the SVs only. Furthermore, it
deals with non-linear data by mapping the data to a higher-dimensional
space with a kernel function, and then finding an optimal linear
hyperplane in that space.
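
The way a trained SVM uses only its support vectors can be sketched as
follows. This is a minimal illustration, assuming the multipliers
(alphas) and the bias have already been obtained by training; the
Gaussian (RBF) kernel is just one common choice and is not specified in
this proposal, and the function names are hypothetical.

```python
import math

def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """f(x) = sum_i alpha_i * y_i * K(sv_i, x) + b, summed over the SVs only."""
    return bias + sum(
        alpha * y * kernel(sv, x)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )

def classify(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Predict +1 or -1 from the sign of the decision value."""
    value = decision_value(x, support_vectors, labels, alphas, bias, kernel)
    return 1 if value >= 0 else -1
```

The kernel stands in for an inner product in the higher-dimensional
space, so the mapping itself never has to be computed explicitly.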

In its original form, SVM classifies between two classes of objects,
whereas expression recognition is a multi-category classification problem.
The multi-category problem can be solved in a one-against-the-rest fashion,
as applied in [8] for pose classification, where each category is trained
against all the other categories.
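
A minimal sketch of this scheme is given below. The helper names are
hypothetical, and choosing the category with the largest decision value
is a common convention that we assume here; it is not stated in [8].

```python
def binary_labels(labels, target):
    """Relabel a multi-category training set as target (+1) vs. the rest (-1)."""
    return [1 if label == target else -1 for label in labels]

def predict_one_vs_rest(x, scorers):
    """Pick the category whose binary classifier gives the largest score.

    scorers: dict mapping a category name to a function that returns the
             real-valued decision value of that category's binary classifier
    """
    return max(scorers, key=lambda category: scorers[category](x))
```

One binary classifier is trained per category on the output of
binary_labels, so a problem with N expressions needs N classifiers.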

In this work, we propose to construct a support vector classification
system that uses robust features from 3-D face tracking data. The work
will make two main contributions. First, we will show that the face
tracking algorithm that we previously defined is appropriate for further
applications such as facial expression recognition. Second, we would
like to build what is, to our knowledge, the first expression recognition
system using SVM, and possibly the best :)

Salih Burak Gokturk

Workplan : 
--------

January 31 - February 7
--- Working on good feature extraction from 3-D face tracking data.
February 7 - February 14
--- Design of the SVM system.
February 14 - February 28
--- Experiments; improving the features using feedback from the experiments.
February 28 - March 7
--- Writing the paper and presenting.

References : 
-------------

[1] W.W. Bledsoe, "Man-machine facial recognition," Panoramic Research Inc.,
Palo Alto, CA, 1966.
[2] Y. Kaya and K. Kobayashi, "A basic study on human face recognition,"
in Frontiers of Pattern Recognition, 1972, p. 265.
[3] D. DeCarlo and D. Metaxas, "The integration of optical flow and
deformable models with applications to human face shape and motion estimation,"
Proceedings CVPR'96, pp. 231-238, 1996.
[4] P. Eisert and B. Girod, "Analyzing facial expressions for Virtual
Conferencing," IEEE Computer Graphics & Applications: Special Issue: Computer
Animation for Virtual Humans, vol. 18, no. 5, pp. 70-78, September 1998.
[5] S.B. Gokturk, J. Bouguet, and R. Grzeszczuk, "A Data Driven Model for
Monocular Face Tracking," submitted to International Conference on Computer
Vision (ICCV) 2001.
[6] V. Vapnik, The Nature of Statistical Learning Theory, New York,
Springer-Verlag, 1995.
[7] http://www.support-vector.net
[8] E. Ardizzone, A. Chella, and R. Pirrone, "Pose Classification Using Support
Vector Machines," Proceedings of the IEEE-INNS-ENNS International Joint
Conference on Neural Networks (IJCNN 2000), vol. 6, pp. 317-322, 2000.
[9] D. Terzopoulos and K. Waters, "Analysis and Synthesis of facial
image sequences using physical and anatomical models," IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 569-579,
June 1993.