Project: Automatic Image and Video Annotation (ARO W911NF-08-1-0403)

 

PI: Anil K. Jain, Department of CSE, Michigan State University

Co-PI: Rong Jin, Department of CSE, Michigan State University

 

Abstract:

The explosion of digitized images and videos in recent years has made automatic search and use of multimedia objects extremely difficult. The gigantic size of online multimedia sources, such as YouTube and Flickr, calls for novel methods for multimedia retrieval that depart from conventional content-based image/video retrieval. In the proposed project, we aim to develop methods for automatic image/video annotation that allow users to access multimedia databases using textual queries. The significance of the proposed research is that it provides an additional modality for users to effectively query, search, and browse large image/video databases, complementing the example-based approaches to multimedia retrieval. This additional modality will provide not only flexibility in multimedia retrieval but, more importantly, the capability for users to express their multimedia information needs at the semantic level. The proposed research addresses the key challenges in automatic image/video annotation by developing (1) a statistical framework for visual vocabulary construction that effectively exploits the annotation information contained in training images, (2) multi-label learning algorithms for automatic image annotation that effectively exploit the correlation among keywords, (3) robust retrieval models for noisy auto-annotations, and (4) algorithms for automatic video annotation that explicitly exploit video-specific features.
 

Students:

  1. Wei Tong

 

Project Goal:

The proposed research addresses the key challenges in automatic image/video annotation by developing (1) a statistical framework for visual vocabulary construction that effectively exploits the annotation information contained in training images, (2) multi-label learning algorithms for automatic image annotation that effectively exploit the correlation among keywords, (3) robust retrieval models for noisy auto-annotations, and (4) algorithms for automatic video annotation that explicitly exploit video-specific features.

 

Conference:

  1. Serhat S. Bucak, Pavan Kumar Mallapragada, Rong Jin, and Anil K. Jain, Efficient Multi-label Ranking for Multi-class Learning: Application to Object Recognition, Proceedings of the Twelfth IEEE International Conference on Computer Vision (ICCV), 2009

  2. Shijun Wang and Rong Jin, An Information Geometry Approach for Distance Metric Learning, Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009

  3. Fengjie Li, Rong Jin, and Anil K. Jain, An Efficient Key Point Quantization Algorithm for Large Scale Image Retrieval, Proceedings of the 1st Workshop on Large-Scale Multimedia Retrieval and Mining (LS-MMRM), in conjunction with ACM Multimedia 2009. (A longer version of this work is available as Technical Report MSU-CSE-09-14, CSE Dept., Michigan State University.)

  4. Lei Wu, Steven C.H. Hoi, Rong Jin, Jianke Zhu, and Nenghai Yu, Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging, Proceedings of the Seventeenth ACM International Conference on Multimedia (ACM Multimedia), 2009