CVPR 2016

End-to-end people detection in crowded scenes

CVPR 2016 TensorBox/TensorBox

Current people detectors operate either by scanning an image in a sliding window fashion or by classifying a discrete set of proposals.

Structure-From-Motion Revisited

CVPR 2016 colmap/colmap

Incremental Structure-from-Motion is a prevalent strategy for 3D reconstruction from unordered image collections.

3D RECONSTRUCTION

Learning Deep Features for Discriminative Localization

CVPR 2016 metalbubble/CAM

In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels.

OBJECT LOCALIZATION

Real-time Action Recognition with Enhanced Motion Vector CNNs

CVPR 2016 yjxiong/caffe

The deep two-stream architecture exhibited excellent performance on video based action recognition.

ACTION RECOGNITION OPTICAL FLOW ESTIMATION

Convolutional Two-Stream Network Fusion for Video Action Recognition

CVPR 2016 feichtenhofer/twostreamfusion

Recent applications of Convolutional Neural Networks (ConvNets) for human action recognition in videos have proposed different solutions for incorporating the appearance and motion information.

ACTION RECOGNITION IN VIDEOS

Neural Module Networks

CVPR 2016 jacobandreas/nmn2

Visual question answering is fundamentally compositional in nature---a question like "where is the dog?"

VISUAL QUESTION ANSWERING

Seven ways to improve example-based single image super resolution

CVPR 2016 jiny2001/dcscn-super-resolution

In this paper we present seven techniques that everybody should know to improve example-based single image super resolution (SR): 1) augmentation of data, 2) use of large dictionaries with efficient search structures, 3) cascading, 4) image self-similarities, 5) back projection refinement, 6) enhanced prediction by consistency check, and 7) context reasoning.

IMAGE SUPER-RESOLUTION

Learning Multi-Domain Convolutional Neural Networks for Visual Tracking

CVPR 2016 HyeonseobNam/py-MDNet

Our algorithm pretrains a CNN using a large set of videos with tracking ground-truths to obtain a generic target representation.

VISUAL TRACKING