ICCV 2017

Learning to Reason: End-to-End Module Networks for Visual Question Answering

ICCV 2017 tensorflow/models

Natural language questions are inherently compositional, and many are most easily answered by reasoning about their decomposition into modular sub-problems.

VISUAL QUESTION ANSWERING

Mask R-CNN

ICCV 2017 matterport/Mask_RCNN

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

HUMAN PART SEGMENTATION, INSTANCE SEGMENTATION, KEYPOINT DETECTION, MULTI-HUMAN PARSING, OBJECT DETECTION, SEMANTIC SEGMENTATION

Focal Loss for Dense Object Detection

ICCV 2017 marvis/pytorch-yolo2

Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.

#13 best model for Object Detection on COCO

OBJECT DETECTION
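
The Focal Loss entry above describes down-weighting easy negatives so that hard examples dominate training. A minimal sketch of that idea for a single binary prediction follows, using the commonly cited form FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function names and the default values gamma=2.0 and alpha=0.25 are assumptions made for illustration and are not taken from this listing or from the linked repository.

```python
import math

def cross_entropy(p, y, eps=1e-12):
    """Standard binary cross-entropy for one prediction.

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (positive) or 0 (negative)
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, eps))

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction (illustrative sketch).

    gamma and alpha defaults are assumed values, not from the listing.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positives against negatives.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    modulating = (1.0 - p_t) ** gamma
    return -alpha_t * modulating * math.log(max(p_t, eps))

# An easy negative (confidently rejected) versus a hard negative.
easy = focal_loss(0.01, 0)
hard = focal_loss(0.90, 0)
```

With gamma set to 0 the modulating factor is 1 and the expression reduces to alpha-weighted cross-entropy, which is a quick way to sanity-check an implementation.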

Channel Pruning for Accelerating Very Deep Neural Networks

ICCV 2017 yihui-he/channel-pruning

In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, using LASSO-regression-based channel selection and least-squares reconstruction.

DSOD: Learning Deeply Supervised Object Detectors from Scratch

ICCV 2017 szq0214/DSOD

State-of-the-art object detectors rely heavily on off-the-shelf networks pre-trained on large-scale classification datasets like ImageNet, which incurs learning bias due to the differences in both the loss functions and the category distributions between classification and detection tasks.

OBJECT DETECTION

Video Frame Interpolation via Adaptive Separable Convolution

ICCV 2017 sniklaus/pytorch-sepconv

Our method develops a deep fully convolutional neural network that takes two input frames and estimates pairs of 1D kernels for all pixels simultaneously.

OPTICAL FLOW ESTIMATION, VIDEO FRAME INTERPOLATION
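
The separable convolution entry above mentions estimating pairs of 1D kernels per pixel. The sketch below illustrates why a pair of 1D kernels can stand in for a full 2D kernel: applying the horizontal and then the vertical kernel gives the same result as applying their outer product, while needing 2n values per pixel instead of n^2. The kernel values, patch contents, and function names are hypothetical, and the network that would estimate the kernels is not shown.

```python
def outer(kv, kh):
    """Build the 2D kernel implied by a vertical and a horizontal 1D kernel."""
    return [[v * h for h in kh] for v in kv]

def apply_2d(patch, kernel):
    """Weighted sum of a patch with a full 2D kernel of the same size."""
    return sum(
        kernel[i][j] * patch[i][j]
        for i in range(len(kernel))
        for j in range(len(kernel[0]))
    )

def apply_separable(patch, kv, kh):
    """Same weighted sum, computed with the two 1D kernels directly."""
    # Collapse each row with the horizontal kernel...
    row_sums = [sum(h * px for h, px in zip(kh, row)) for row in patch]
    # ...then combine the row results with the vertical kernel.
    return sum(v * r for v, r in zip(kv, row_sums))

def interpolate_pixel(patch1, patch2, k1v, k1h, k2v, k2h):
    """One output pixel as the sum of two locally filtered input patches,
    one patch from each input frame (illustrative only)."""
    return apply_separable(patch1, k1v, k1h) + apply_separable(patch2, k2v, k2h)

# Hypothetical 3x3 patch and kernel pair.
patch = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
kv = [0.25, 0.5, 0.25]
kh = [0.0, 1.0, 0.0]

separable_result = apply_separable(patch, kv, kh)
full_result = apply_2d(patch, outer(kv, kh))
```

Here both computations agree because the 2D kernel is exactly the outer product of the two 1D kernels; a general 2D kernel cannot always be factored this way, so the separable form is an approximation in general.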

Flow-Guided Feature Aggregation for Video Object Detection

ICCV 2017 msracver/Flow-Guided-Feature-Aggregation

The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc.

VIDEO OBJECT DETECTION, VIDEO RECOGNITION

Temporal Action Detection with Structured Segment Networks

ICCV 2017 yjxiong/action-detection

Detecting actions in untrimmed videos is an important yet challenging task.

ACTION DETECTION

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

ICCV 2017 ramprs/grad-cam

We propose a technique for producing "visual explanations" for decisions from a large class of CNN-based models, making them more transparent.

IMAGE CLASSIFICATION, INTERPRETABLE MACHINE LEARNING, VISUAL QUESTION ANSWERING

Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization

ICCV 2017 Yijunmaverick/UniversalStyleTransfer

Gatys et al. recently introduced a neural algorithm that renders a content image in the style of another image, achieving so-called style transfer.

STYLE TRANSFER