no code implementations • 20 Dec 2022 • Evgenya Pergament, Pulkit Tandon, Oren Rippel, Lubomir Bourdev, Alexander G. Anderson, Bruno Olshausen, Tsachy Weissman, Sachin Katti, Kedar Tatwawadi
The contributions of this work are threefold: (1) we introduce a web tool that allows scalable collection of fine-grained perceptual importance by having users interactively paint spatio-temporal maps over encoded videos; (2) we use this tool to collect a dataset of 178 videos, totaling 14,443 frames, with human-annotated spatio-temporal importance maps; and (3) we use our curated dataset to train a lightweight machine learning model that can predict these spatio-temporal importance regions.
1 code implementation • 8 May 2022 • Evgenya Pergament, Pulkit Tandon, Kedar Tatwawadi, Oren Rippel, Lubomir Bourdev, Bruno Olshausen, Tsachy Weissman, Sachin Katti, Alexander G. Anderson
We use this tool to collect data in-the-wild (10 videos, 17 users) and utilize the obtained importance maps in the context of x264 coding to demonstrate that the tool can indeed be used to generate videos which, at the same bitrate, look perceptually better through a subjective study - and are 1.9 times more likely to be preferred by viewers.
no code implementations • ICCV 2021 • Oren Rippel, Alexander G. Anderson, Kedar Tatwawadi, Sanjay Nair, Craig Lytle, Lubomir Bourdev
In this setting, for natural videos our approach compares favorably across the entire R-D curve under metrics PSNR, MS-SSIM and VMAF against all mainstream video standards (H.264, H.265, AV1) and all ML codecs.
no code implementations • ICCV 2019 • Oren Rippel, Sanjay Nair, Carissa Lew, Steve Branson, Alexander G. Anderson, Lubomir Bourdev
We present a new algorithm for video coding, learned end-to-end for the low-latency mode.
no code implementations • ICML 2017 • Oren Rippel, Lubomir Bourdev
We present a machine learning-based approach to lossy image compression which outperforms all existing codecs, while running in real-time.
no code implementations • 20 Nov 2015 • Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri
Over the last few years deep learning methods have emerged as one of the most prominent approaches for video analysis.
2 code implementations • 18 Nov 2015 • Oren Rippel, Manohar Paluri, Piotr Dollar, Lubomir Bourdev
Beyond classification, we further validate the saliency of the learnt representations via their attribute concentration and hierarchy recovery properties, achieving 10-25% relative gains on the softmax classifier and 25-50% on triplet loss in these tasks.
no code implementations • CVPR 2016 • Chen Sun, Manohar Paluri, Ronan Collobert, Ram Nevatia, Lubomir Bourdev
This paper aims to classify and locate objects accurately and efficiently, without using bounding box annotations.
Ranked #5 on Weakly Supervised Object Detection on MS COCO
no code implementations • CVPR 2015 • Yunchao Gong, Marcin Pawlowski, Fei Yang, Louis Brandy, Lubomir Bourdev, Rob Fergus
In addition, we propose an online clustering method based on binary k-means that is capable of clustering large photo streams on a single machine, and show applications to spam detection and trending photo discovery.
no code implementations • ICCV 2015 • Kevin Tang, Manohar Paluri, Li Fei-Fei, Rob Fergus, Lubomir Bourdev
With the widespread availability of cellphones and cameras that have GPS capabilities, it is common for images being uploaded to the Internet today to have GPS coordinates associated with them.
no code implementations • CVPR 2015 • Ning Zhang, Manohar Paluri, Yaniv Taigman, Rob Fergus, Lubomir Bourdev
We explore the task of recognizing people's identities in photo albums in an unconstrained setting.
no code implementations • 18 Dec 2014 • Yunchao Gong, Liu Liu, Ming Yang, Lubomir Bourdev
In this paper, we tackle this model storage issue by investigating information theoretical vector quantization methods for compressing the parameters of CNNs.
29 code implementations • ICCV 2015 • Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri
We propose a simple, yet effective approach for spatiotemporal feature learning using deep 3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised video dataset.
Ranked #8 on Action Recognition on Sports-1M
Action Recognition In Videos
Dynamic Facial Expression Recognition
no code implementations • 2 Jul 2014 • Lubomir Bourdev, Fei Yang, Rob Fergus
We train the poselet model on top of PDF features and combine them with object-level CNNs for detection and bounding box prediction.
no code implementations • 9 Jun 2014 • Sainbayar Sukhbaatar, Joan Bruna, Manohar Paluri, Lubomir Bourdev, Rob Fergus
The availability of large labeled datasets has allowed Convolutional Network models to achieve impressive recognition results.
37 code implementations • 1 May 2014 • Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, Piotr Dollár
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.
1 code implementation • CVPR 2014 • Ning Zhang, Manohar Paluri, Marc'Aurelio Ranzato, Trevor Darrell, Lubomir Bourdev
We propose a method for inferring human attributes (such as gender, hair style, clothes style, expression, action) from images of people under large variation of viewpoint, pose, appearance, articulation and occlusion.
Ranked #7 on Facial Attribute Classification on LFWA
no code implementations • CVPR 2013 • Georgia Gkioxari, Pablo Arbelaez, Lubomir Bourdev, Jitendra Malik
We propose a novel approach for human pose estimation in real-world cluttered scenes, and focus on the challenging problem of predicting the pose of both arms for each person in the image.