My research interests are related to Computer Vision, Natural Language Processing, Affective Computing, Explainable Artificial Intelligence (AI), and Fairness in AI. I'm also interested in the applications of these research fields to Health, Social Robotics and, more generally, Artificial Intelligence for social good.
I've been collaborating with MIT since 2012. From 2012 to 2015 I was a visiting professor at MIT CSAIL, where I worked on Scene Recognition and Interpretable Models in Computer Vision with Prof. Antonio Torralba. From 2017 to 2020 I was a visiting professor at the MIT Media Lab Affective Computing group, where I worked on emotion perception and social robotics with Prof. Rosalind Picard. More recently (2020-2021) I was a visiting researcher at Google (USA).
For my publications, please visit my Google Scholar profile.
Check our work on CNN Interpretability (accepted at PNAS 2020): an analytic framework to systematically identify the semantics of individual hidden units within image classification and image generation networks (PDF).
Our work on emotionally-aware chatbots accepted at NeurIPS 2019: our paper proposes a new methodology for evaluating open-domain dialog systems (PDF and available code repository). Check also our paper on using user feedback in an off-policy reinforcement learning setting to improve the quality of the bots (PDF).
"Context Based Emotion Recognition using EMOTIC dataset" (TPAMI 2019): extended dataset and extended experiments with different types of context features and loss functions, published in IEEE Transactions on Pattern Analysis and Machine Intelligence (PDF). The second release of the EMOTIC dataset is available at the website of the Emotic project.
Our work on "Class Activation Map" accepted at CVPR 2016: we revisit the global average pooling layer and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels (Project Page).
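The core of the technique is simple: a class activation map is the weighted sum of the last convolutional layer's feature maps, using the fully-connected weights of the chosen class (the layer that follows global average pooling). A minimal NumPy sketch of that computation, with hypothetical shapes and variable names for illustration:

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Compute a class activation map.

    feature_maps: (C, H, W) activations of the last conv layer for one image.
    fc_weights:   (num_classes, C) weights of the FC layer that follows
                  global average pooling.
    class_idx:    index of the class whose evidence we want to localize.
    """
    weights = fc_weights[class_idx]  # (C,) per-channel importance for this class
    # Weighted sum over channels -> one (H, W) heatmap
    cam = np.tensordot(weights, feature_maps, axes=([0], [0]))
    # Normalize to [0, 1] for visualization
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

# Toy usage with random activations (512 channels, 7x7 spatial grid)
fmaps = np.random.rand(512, 7, 7)
w = np.random.rand(1000, 512)
cam = class_activation_map(fmaps, w, class_idx=283)
```

In practice the resulting low-resolution heatmap is upsampled to the input image size and overlaid on the image to show which regions support the prediction.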
Scene recognition demo: given a picture, our system predicts the scene category and some other attributes. It also provides a heatmap that indicates the region of the image that supports the outputs.
Understanding the representations learned by CNNs: we found that object detectors emerge in a CNN trained for scene recognition. For more information, see our paper: B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. "Object Detectors Emerge in Deep Scene CNNs." International Conference on Learning Representations (ICLR), oral, 2015. (PDF).
Project page of the Places Database: you can download the database and the pretrained network PlacesCNN. More details can be found in our paper: B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. "Learning Deep Features for Scene Recognition using Places Database." Advances in Neural Information Processing Systems 27 (NeurIPS), 2014. (PDF).
Universitat Oberta de Catalunya,
Estudis d'Informàtica, Multimèdia i Telecomunicació
Rambla del Poblenou, 156
08018 Barcelona (Spain)