no code implementations • 22 Jan 2024 • Bruno Korbar, Jaesung Huh, Andrew Zisserman
The goal of this paper is automatic character-aware subtitle generation.
no code implementations • 19 Dec 2023 • Bruno Korbar, Yongqin Xian, Alessio Tonioni, Andrew Zisserman, Federico Tombari
In this paper we present a text-conditioned video resampler (TCR) module that uses a pre-trained and frozen visual encoder and large language model (LLM) to process long video sequences for a task.
Ranked #5 on Video Question Answering on NExT-QA
no code implementations • 26 Oct 2022 • Bruno Korbar, Andrew Zisserman
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
1 code implementation • 9 Jul 2020 • Yongqin Xian, Bruno Korbar, Matthijs Douze, Lorenzo Torresani, Bernt Schiele, Zeynep Akata
Few-shot learning aims to recognize novel classes from a few examples.
no code implementations • 12 Jun 2020 • Bruno Korbar, Fabio Petroni, Rohit Girdhar, Lorenzo Torresani
With the advent of large-scale multimodal video datasets, especially sequences with audio or transcribed speech, there has been a growing interest in self-supervised learning of video representations.
1 code implementation • NeurIPS 2020 • Humam Alwassel, Dhruv Mahajan, Bruno Korbar, Lorenzo Torresani, Bernard Ghanem, Du Tran
To the best of our knowledge, XDC is the first self-supervised learning method that outperforms large-scale fully-supervised pretraining for action recognition on the same architecture.
no code implementations • ICCV 2019 • Bruno Korbar, Du Tran, Lorenzo Torresani
We demonstrate that the computational cost of action recognition on untrimmed videos can be dramatically reduced by invoking recognition only on the most salient clips.
Ranked #1 on Action Recognition on miniSports
no code implementations • NeurIPS 2018 • Bruno Korbar, Du Tran, Lorenzo Torresani
There is a natural correlation between the visual and auditory elements of a video.
Ranked #7 on Self-Supervised Audio Classification on ESC-50
no code implementations • 5 Mar 2017 • Bruno Korbar, Andrea M. Olofson, Allen P. Miraflor, Katherine M. Nicka, Matthew A. Suriawinata, Lorenzo Torresani, Arief A. Suriawinata, Saeed Hassanpour
In this work, we built an automatic image-understanding method that can accurately classify different types of colorectal polyps in whole-slide histology images to help pathologists with histopathological characterization and diagnosis of colorectal polyps.