CS 323: Understanding Images and Videos: Recognizing and Learning High-Level Visual Concepts

Fall 2009

Please scroll down for all the paper downloads.

 

 

Lecture | Date | Description | Readings | Presenter
--------|------|-------------|----------|----------
1 | Wed, Sep 23 | Class introduction | |
  | Wed, Sep 30 | Class cancelled. Make-up session: 9am - 12pm, Fri, Oct 9 | |
2 | Wed, Oct 7 | Course project papers | | 2 students
3 | Fri, Oct 9, 9am - 12pm | Object recognition tutorial | | Fei-Fei
4 | Wed, Oct 14 | Pictorial structure | |
5 | Wed, Oct 21 | 3D object categorization | | 1 student
6 | Wed, Oct 28 | Object in context; project proposal due | | 2 students
7 | Wed, Nov 4 | Natural scene understanding | | 1 student
8 | Wed, Nov 11 | Total scene understanding | | 2 students
9 | Wed, Nov 18 | Human action recognition | | 1 student
  | Wed, Nov 25 | No class (Thanksgiving break) | |
10 | Wed, Dec 2 | Video analysis | | 1 student
11 | TBA | Course project presentation | |
  | Fri, Dec 4 | Course project due | |


 

References

 

Lecture #2:

J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. (2009) ImageNet: A Large-Scale Hierarchical Image Database, To Appear in IEEE Computer Vision and Pattern Recognition (CVPR).

N. Ikizler and D.A. Forsyth (2008) Searching for Complex Human Activities with No Visual Examples, Int. J. Computer Vision. Vol. 80, no. 3, pp. 337-357.


Lecture #3:

L. Fei-Fei, R. Fergus, and A. Torralba. (2006) Recognizing and learning object categories, http://people.csail.mit.edu/torralba/iccv2005, tutorial presented at ICCV 2005; page visited Feb. 7, 2006.

 

Lecture #4:

P. Felzenszwalb, D. Huttenlocher. (2004) Efficient Graph-Based Image Segmentation, International Journal of Computer Vision (IJCV) 59(2):167-181.

P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. (2009) Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). Accepted for publication.


Lecture #5:

S. Savarese and L. Fei-Fei. (2007) 3D generic object categorization, localization and pose estimation, IEEE International Conference on Computer Vision (ICCV).

*M. Sun, *H. Su, S. Savarese and L. Fei-Fei. (2009) A Multi-View Probabilistic Model for 3D Object Classes, To appear in IEEE Computer Vision and Pattern Recognition (CVPR) (*indicates equal contributions)

 

Lecture #6:

D. Hoiem, A. Efros, and M. Hebert. (2006) Putting Objects in Perspective, Proc. IEEE International Conf. Computer Vision and Pattern Recognition (CVPR).

A. Gupta, L. Davis. (2008) Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers, Proceedings of the 10th European Conference on Computer Vision: Part I.

 

Lecture #7:
L. Fei-Fei, R. VanRullen, C. Koch and P. Perona. (2005) Why does natural scene categorization require little attention? Exploring attentional requirements for natural and synthetic stimuli, Visual Cognition, 12(6), pp. 893-924.

S. Lazebnik, C. Schmid, and J. Ponce. (2006) Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York, June 2006, vol. II, pp. 2169-2178.

 

Lecture #8:
L.-J. Li, R. Socher and L. Fei-Fei. (2009) Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework, To appear in Computer Vision and Pattern Recognition (CVPR). (Oral)

B. Yao, X. Yang, L. Lin, M.W. Lee, and S.C. Zhu. (2009) I2T: Image Parsing to Text Description, Proceedings of the IEEE (under review, invited for the special issue on Internet Vision).

 

Lecture #9:
I. Laptev, M. Marszałek, C. Schmid and B. Rozenfeld. (2008) Learning realistic human actions from movies, in Proc. CVPR'08, Anchorage, US.

B. Babenko, M.H. Yang, S.J. Belongie. (2009) Visual tracking with online Multiple Instance Learning, in Proc. CVPR'09, pp. 983-990.

 

Lecture #10:

A. Gupta, P. Srinivasan, J. B. Shi, L.S. Davis. (2009) Understanding videos, constructing plots: Learning a visually grounded storyline model from annotated videos, in Proc. CVPR'09, pp. 2012-2019.

Y. Ke, R. Sukthankar, and M. Hebert. (2007) Event Detection in Crowded Videos, ICCV.