Video Captioning

19 papers with code · Computer Vision

State-of-the-art leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning

9 Jul 2018 · lvapeab/nmt-keras

We present NMT-Keras, a flexible toolkit for training deep learning models, which puts a particular emphasis on the development of advanced applications of neural machine translation systems, such as interactive-predictive translation protocols and long-term adaptation of the translation system via continuous learning.

MACHINE TRANSLATION QUESTION ANSWERING SENTENCE CLASSIFICATION VIDEO CAPTIONING VISUAL QUESTION ANSWERING

ECO: Efficient Convolutional Network for Online Video Understanding

ECCV 2018 · mzolfaghari/ECO-efficient-video-understanding

In this paper, we introduce a network architecture that takes long-term content into account and enables fast per-video processing at the same time.

#11 best model for Action Recognition In Videos on Something-Something V1 (using extra training data)

ACTION CLASSIFICATION ACTION RECOGNITION IN VIDEOS VIDEO CAPTIONING VIDEO RETRIEVAL VIDEO UNDERSTANDING

Delving Deeper into Convolutional Networks for Learning Video Representations

19 Nov 2015 · yaoli/arctic-capgen-vid

We propose an approach to learn spatio-temporal features in videos from intermediate visual representations we call "percepts", using Gated Recurrent Unit (GRU) recurrent networks. Our method relies on percepts extracted from all levels of a deep convolutional network trained on the large ImageNet dataset.

VIDEO CAPTIONING ZERO-SHOT ACTION RECOGNITION
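The core idea of that paper can be caricatured in a few lines: roll a GRU over a sequence of per-frame CNN feature vectors ("percepts") to obtain a single fixed-size video representation. The sketch below is a plain-NumPy illustration with made-up dimensions and random weights, not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal NumPy GRU cell; sizes and initialization are hypothetical."""
    def __init__(self, input_dim, hidden_dim):
        d = input_dim + hidden_dim
        self.Wz = rng.normal(0, 0.1, (d, hidden_dim))  # update gate
        self.Wr = rng.normal(0, 0.1, (d, hidden_dim))  # reset gate
        self.Wh = rng.normal(0, 0.1, (d, hidden_dim))  # candidate state
        self.hidden_dim = hidden_dim

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(xh @ self.Wz)        # how much of the state to update
        r = sigmoid(xh @ self.Wr)        # how much past state to expose
        h_tilde = np.tanh(np.concatenate([x, r * h]) @ self.Wh)
        return (1 - z) * h + z * h_tilde

# A toy "video": 16 frames of 2048-d CNN percepts (e.g. pooled conv features).
frames = rng.normal(size=(16, 2048))
cell = GRUCell(input_dim=2048, hidden_dim=256)

h = np.zeros(cell.hidden_dim)
for x in frames:
    h = cell.step(x, h)

# h is now one fixed-size video representation for a caption decoder.
print(h.shape)
```

In the paper, percepts come from several depths of the convolutional network rather than a single layer, but the recurrence over frames is the same in spirit.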

Oracle performance for visual captioning

14 Nov 2015 · yaoli/arctic-capgen-vid

The task of associating images and videos with a natural language description has attracted a great amount of attention recently.

IMAGE CAPTIONING LANGUAGE MODELLING VIDEO CAPTIONING

Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data

CVPR 2016 · LisaAnne/DCC

Current deep caption models can only describe objects contained in paired image-sentence corpora, despite the fact that they are pre-trained with large object recognition datasets, namely ImageNet.

IMAGE CAPTIONING OBJECT RECOGNITION VIDEO CAPTIONING

Temporal Tessellation: A Unified Approach for Video Analysis

ICCV 2017 · dot27/temporal-tessellation

A test video is processed by forming correspondences between its clips and the clips of reference videos with known semantics, following which, reference semantics can be transferred to the test video.

ACTION DETECTION VIDEO CAPTIONING VIDEO SUMMARIZATION VIDEO UNDERSTANDING
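The correspondence step described above can be sketched as nearest-neighbour matching in a shared embedding space: each test clip retrieves its closest reference clip, and that clip's known semantics carry over. The dimensions, cosine-similarity choice, and toy data below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Reference library: clip embeddings with known semantic labels (toy data).
ref_clips = normalize(rng.normal(size=(100, 64)))
ref_labels = rng.integers(0, 5, size=100)   # e.g. action / caption ids

# Test video: 8 clips embedded in the same (assumed) space.
test_clips = normalize(rng.normal(size=(8, 64)))

# Cosine similarity = dot product of unit vectors; take the best reference clip.
sims = test_clips @ ref_clips.T             # (8, 100) similarity matrix
best = sims.argmax(axis=1)
transferred = ref_labels[best]              # semantics transferred per test clip

print(transferred.shape)
```

The actual method additionally enforces temporal coherence across consecutive clips (the "tessellation") rather than matching each clip independently as done here.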

End-to-End Dense Video Captioning with Masked Transformer

CVPR 2018 · salesforce/densecap

To address this problem, we propose an end-to-end transformer model for dense video captioning.

DENSE VIDEO CAPTIONING

Video Description using Bidirectional Recurrent Neural Networks

12 Apr 2016 · lvapeab/ABiViRNet

Although traditionally used in the machine translation field, the encoder-decoder framework has been recently applied for the generation of video and image descriptions.

TEXT GENERATION VIDEO CAPTIONING VIDEO DESCRIPTION
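A rough illustration of the bidirectional encoding idea (a sketch with made-up dimensions, not ABiViRNet's architecture): run one simple recurrent pass forward and one backward over the frame features, then concatenate the two final states as the video encoding handed to the caption decoder.

```python
import numpy as np

rng = np.random.default_rng(2)

def rnn_pass(frames, W_in, W_rec):
    """Vanilla tanh RNN over a frame sequence; returns the final hidden state."""
    h = np.zeros(W_rec.shape[0])
    for x in frames:
        h = np.tanh(x @ W_in + h @ W_rec)
    return h

frames = rng.normal(size=(20, 128))          # toy video: 20 frames, 128-d features
W_in = rng.normal(0, 0.1, (128, 32))
W_rec = rng.normal(0, 0.1, (32, 32))

h_fwd = rnn_pass(frames, W_in, W_rec)        # past -> future
h_bwd = rnn_pass(frames[::-1], W_in, W_rec)  # future -> past
video_code = np.concatenate([h_fwd, h_bwd])  # fed to the caption decoder

print(video_code.shape)
```

Concatenating both directions lets the encoding reflect context from the whole clip at once, which a single forward pass cannot provide.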

Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation

17 Aug 2016 · rakshithShetty/captionGAN

We present our submission to the Microsoft Video to Language Challenge of generating short captions describing videos in the challenge dataset.

VIDEO CAPTIONING

Video captioning with recurrent networks based on frame- and video-level features and visual content classification

9 Dec 2015 · rakshithShetty/captionGAN

In this paper, we describe the system for generating textual descriptions of short video clips using recurrent neural networks (RNN), which we used while participating in the Large Scale Movie Description Challenge 2015 in ICCV 2015.

IMAGE CAPTIONING VIDEO CAPTIONING