Scene recognition and understanding

Understanding complex visual scenes is one of the hallmark tasks of computer vision. Given a picture or a video, the goal of scene understanding is to build a representation of its content: which objects are present, how they are related, what actions any people in it are performing, what place it depicts, and so on.

With the appearance of large-scale databases such as ImageNet [1] and Places [2], and the recent success of machine learning techniques such as Deep Neural Networks [3], scene understanding has made substantial progress, making it possible to build vision systems capable of addressing some of these tasks.

In this research line, carried out in collaboration with the computer vision group at the Massachusetts Institute of Technology, our goal is to improve existing algorithms for scene understanding and to define new problems that are now within reach thanks to recent advances in neural networks and machine learning.

For more information please contact alapedriza@uoc.edu

[1] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proc. CVPR, 2009.

[2] B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. Learning deep features for scene recognition using Places database. In Advances in Neural Information Processing Systems 27 (NIPS), 2014.

[3] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 2012.