Project:
Automatic Image and Video Annotation
(ARO W911NF-08-1-0403)
PI: Anil K. Jain,
Co-PI: Rong Jin, Department of CSE, Michigan State University
Abstract:
The explosion of digitized images and videos in recent years makes automatic
search and use of multimedia objects extremely difficult.
The gigantic size of online multimedia sources, such as YouTube and Flickr,
calls for novel methods for multimedia retrieval that
depart from conventional content-based image/video retrieval. In the proposed
project, we aim to develop methods for automatic image/video annotation that
allow users to access multimedia databases by using textual queries. The
significance of the proposed research work is that it provides an additional
modality for users to effectively query, search, and browse large image/video
databases, which complements the example-based approaches for multimedia
retrieval. This additional modality will provide not only flexibility for
multimedia retrieval but, more importantly, the capability for users to express
their multimedia information needs at the semantic level. The proposed research
addresses the key challenges in automatic image/video annotation by developing
(1) a statistical framework for visual vocabulary construction that effectively
exploits the annotation information contained in training images, (2)
multi-label learning algorithms for automatic image annotation that effectively
explore the correlation among keywords, (3) robust retrieval models for noisy
auto-annotations, and (4) algorithms for automatic video annotation that
explicitly explore video-specific features.
Students
Project Goal:
The proposed research addresses the key challenges in automatic image/video annotation by developing (1) a statistical framework for visual vocabulary construction that effectively exploits the annotation information contained in training images, (2) multi-label learning algorithms for automatic image annotation that effectively explore the correlation among keywords, (3) robust retrieval models for noisy auto-annotations, and (4) algorithms for automatic video annotation that explicitly explore video-specific features.
Conference:
Serhat S. Bucak, Pavan Kumar Mallapragada, Rong Jin, and Anil K. Jain, Efficient Multi-label Ranking for Multi-class Learning: Application to Object Recognition, Proceedings of the Twelfth IEEE International Conference on Computer Vision (ICCV), 2009
Shijun Wang and Rong Jin, An Information Geometry Approach for Distance Metric Learning, Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), 2009
Fengjie Li, Rong Jin, and Anil Jain, An Efficient Key Point Quantization Algorithm for Large Scale Image Retrieval, Proceedings of The 1st Workshop on Large-Scale Multimedia Retrieval and Mining (LS-MMRM) in conjunction with ACM Multimedia 2009 (The longer version of this work can be found in Technical Report MSU-CSE-09-14 of CSE Dept. at Michigan State University)
Lei Wu, Steven C.H. Hoi, Rong Jin, Jianke Zhu, and Nenghai Yu, Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging, Proceedings of the Seventeenth ACM International Conference on Multimedia (ACM Multimedia), 2009