AUTOMATED RECOGNITION OF FACIAL EXPRESSIONS THROUGH 3-D TRACKING IN IMAGE SEQUENCES

The human face has attracted attention in areas such as psychology, computer vision, and computer graphics. Many computer vision researchers have worked on tracking and recognizing the whole face or parts of it. However, neither face recognition nor facial expression recognition has been completely solved. The early 2-D methods are computationally simple but have had limited success, mainly because they depend on the camera viewing angle [1][2]. One of the main motivations behind 3-D methods for face or expression recognition is the hope of succeeding over a broader range of camera viewing angles. We would like to use the 3-D information extracted through 3-D face tracking.

Applications of 3-D face tracking include video conferencing, interactive games, and model-based video compression. In [3], DeCarlo and Metaxas use an optical-flow-based approach and achieve very accurate 3-D face tracking results with a monocular camera. In [4], Eisert and Girod show that 3-D face tracking is very suitable for video conferencing applications, and achieve very efficient and accurate compression. In [5], Gokturk et al. introduce a data-driven approach to 3-D face tracking. A preliminary stage learns the possible facial deformations through stereo-camera-based tracking, replacing the hand-created models of previous work. This approach is suitable for face recognition and facial expression recognition mainly because of its mathematical 3-D face model. Only a few parameters are tracked: the pose (rotation and translation vectors) and the shape (expression vector) of the face. The expression vector is claimed to be suitable for expression recognition.

A recognition system consists mainly of two components: a feature extractor and a classifier. We propose to obtain the feature vectors from the 3-D face tracking algorithm.
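The two-component structure above (feature extractor, then classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system proposed here: the 2-D "expression vectors", the three labels, and all hyperparameters are hypothetical toy values, and a linear max-margin classifier trained by batch subgradient descent on the regularized hinge loss stands in for a kernel SVM. One classifier is trained per expression against all the others, and the highest-scoring one wins.

```python
# Sketch only: toy data, labels, and hyperparameters are hypothetical.

def train_linear_svm(samples, targets, lam=0.01, step=0.05, epochs=1000):
    """Batch subgradient descent on the L2-regularized hinge loss.

    samples: list of feature lists; targets: list of +1 / -1.
    Returns the weights with the bias stored as the last entry
    (the bias is regularized too, which keeps the code short).
    """
    n = len(samples)
    w = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        grad = [lam * wi for wi in w]          # gradient of the L2 penalty
        for x, y in zip(samples, targets):
            xa = list(x) + [1.0]               # append constant bias feature
            margin = y * sum(wi * xi for wi, xi in zip(w, xa))
            if margin < 1.0:                   # point violates the margin
                for j, xj in enumerate(xa):
                    grad[j] -= y * xj / n
        w = [wi - step * gi for wi, gi in zip(w, grad)]
    return w


def decision_value(w, x):
    """Signed score of feature vector x under weights w (bias last)."""
    return sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))


def train_one_vs_rest(samples, labels):
    """Train one binary classifier per class against all other classes."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1 if label == cls else -1 for label in labels]
        models[cls] = train_linear_svm(samples, targets)
    return models


def predict(models, x):
    """Return the class whose classifier gives the largest score."""
    return max(models, key=lambda cls: decision_value(models[cls], x))


# Hypothetical expression vectors, as a tracker might output per frame.
TRAIN_X = [
    [0.0, 0.0], [0.2, 0.1], [-0.1, 0.2],    # neutral
    [3.0, 0.0], [3.2, 0.1], [2.9, -0.2],    # smile
    [0.0, 3.0], [0.1, 3.2], [-0.2, 2.9],    # surprise
]
TRAIN_Y = ["neutral"] * 3 + ["smile"] * 3 + ["surprise"] * 3

models = train_one_vs_rest(TRAIN_X, TRAIN_Y)
print(predict(models, [2.8, 0.2]))
```

In a real system the toy vectors would be replaced by the tracked expression vectors, and the linear classifier by a kernel SVM from an established library.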
For the classifier, we need a method that automatically extracts the relevant distinguishing features. Support Vector Machines (SVM) are built on this idea: they identify and use the data points that carry the relevant information, and so focus on the construction of the classifier itself. SVM, originally proposed by Vapnik [6], constructs a classifying surface that balances a low training error against a large margin, which improves the generalization capability of the classifier [7]. It determines the data points closest to this surface (the Support Vectors, SV) and defines the surface using the SVs only. Furthermore, it handles non-linear data by implicitly mapping the data to a higher-dimensional space through a kernel function, and then finding an optimal linear hyperplane in that space. In its original form, SVM separates two classes of objects, whereas expression recognition is a multi-category classification problem. The multi-category problem can be solved in a one-against-rest fashion, as applied in [8] for pose classification, where each category is trained against all the other categories.

In this work, we propose to construct a support vector classification system that uses robust features from 3-D face tracking data. The work will have two main contributions. First, we will show that the face tracking algorithm we have previously defined is appropriate for further applications such as facial expression recognition. Second, we would like to build what is, to our knowledge, the first expression recognition system using SVM, and possibly the best :)

Salih Burak Gokturk

Workplan :
--------
January 31 - February 7 --- Working on good feature extraction from 3-D face tracking data.
February 7 - February 14 --- Design of the SVM system.
February 14 - February 28 --- Experiments; improving the features with feedback from the experiments.
February 28 - March 7 --- Writing the paper and presenting.

References :
-------------
[1] W.W. Bledsoe, "Man-machine facial recognition," Panoramic Research Inc., Palo Alto, CA, 1966.
[2] Y. Kaya and K. Kobayashi, "A basic study on human face recognition," in Frontiers of Pattern Recognition, 1972, p. 265.
[3] D. DeCarlo and D. Metaxas, "The integration of optical flow and deformable models with applications to human face shape and motion estimation," Proceedings CVPR'96, pp. 231-238, 1996.
[4] P. Eisert and B. Girod, "Analyzing facial expressions for virtual conferencing," IEEE Computer Graphics & Applications: Special Issue: Computer Animation for Virtual Humans, vol. 18, no. 5, pp. 70-78, September 1998.
[5] S.B. Gokturk, J. Bouguet, and R. Grzeszczuk, "A data driven model for monocular face tracking," submitted to International Conference on Computer Vision (ICCV) 2001.
[6] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[7] http://www.support-vector.net
[8] E. Ardizzone, A. Chella, and R. Pirrone, "Pose classification using support vector machines," Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000), vol. 6, pp. 317-322, 2000.
[9] D. Terzopoulos and K. Waters, "Analysis and synthesis of facial image sequences using physical and anatomical models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 569-579, June 1993.