Delving Deeper into Convolutional Networks for Learning Video Representations

19 Nov 2015 Nicolas Ballas Li Yao Chris Pal Aaron Courville

We propose an approach to learn spatio-temporal features in videos from intermediate visual representations we call "percepts", using Gated-Recurrent-Unit Recurrent Networks (GRUs). Our method relies on percepts extracted from all levels of a deep convolutional network trained on the large ImageNet dataset. While high-level percepts contain highly discriminative information, they tend to have a low spatial resolution...
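The core idea of applying a GRU over convolutional percepts can be sketched as a GRU cell whose gate transformations are convolutions rather than fully-connected layers, so that spatial structure in the percept maps is preserved across time. The following is a minimal, hypothetical NumPy sketch of such a convolutional GRU cell (names like `ConvGRUCell` and the naive `conv2d` helper are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def conv2d(x, w):
    """'Same'-padded 2D convolution (naive loops, for illustration only).

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) kernel.
    """
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    _, h, wdt = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, h, wdt))
    for o in range(c_out):
        for i in range(h):
            for j in range(wdt):
                out[o, i, j] = np.sum(xp[:, i:i + k, j:j + k] * w[o])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvGRUCell:
    """Hypothetical GRU cell with convolutional gates over percept maps."""

    def __init__(self, c_in, c_hid, k=3, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        # Kernels for the update gate z, reset gate r, and candidate state.
        self.Wz = rng.normal(0, s, (c_hid, c_in, k, k))
        self.Uz = rng.normal(0, s, (c_hid, c_hid, k, k))
        self.Wr = rng.normal(0, s, (c_hid, c_in, k, k))
        self.Ur = rng.normal(0, s, (c_hid, c_hid, k, k))
        self.Wh = rng.normal(0, s, (c_hid, c_in, k, k))
        self.Uh = rng.normal(0, s, (c_hid, c_hid, k, k))

    def step(self, x, h):
        """One time step: x is the current percept, h the hidden map."""
        z = sigmoid(conv2d(x, self.Wz) + conv2d(h, self.Uz))
        r = sigmoid(conv2d(x, self.Wr) + conv2d(h, self.Ur))
        h_tilde = np.tanh(conv2d(x, self.Wh) + conv2d(r * h, self.Uh))
        # Convex combination of the old state and the candidate state.
        return (1 - z) * h + z * h_tilde
```

Running the cell over a sequence of percept maps (e.g. `h = cell.step(x_t, h)` for each frame `t`, starting from a zero hidden map) yields a spatially resolved hidden state that summarizes the video up to that frame; because every gate is a "same"-padded convolution, the hidden map keeps the percept's spatial resolution.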



Methods used in the Paper


- ReLU (Activation Functions)
- Sigmoid Activation (Activation Functions)
- CGRU (Recurrent Neural Networks)
- Convolution (Convolutions)
- GRU (Recurrent Neural Networks)