Search Results for author: Bruno Korbar

Found 9 papers, 2 papers with code

Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling

no code implementations • 22 Jan 2024 • Bruno Korbar, Jaesung Huh, Andrew Zisserman

The goal of this paper is automatic character-aware subtitle generation.

Text-Conditioned Resampler For Long Form Video Understanding

no code implementations • 19 Dec 2023 • Bruno Korbar, Yongqin Xian, Alessio Tonioni, Andrew Zisserman, Federico Tombari

In this paper we present a text-conditioned video resampler (TCR) module that uses a pre-trained and frozen visual encoder and large language model (LLM) to process long video sequences for a task.

Ranked #5 on Video Question Answering on NExT-QA

Language Modelling, Large Language Model, +2

End-to-end Tracking with a Multi-query Transformer

no code implementations • 26 Oct 2022 • Bruno Korbar, Andrew Zisserman

Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.

Multiple Object Tracking, Object

Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation

1 code implementation • 9 Jul 2020 • Yongqin Xian, Bruno Korbar, Matthijs Douze, Lorenzo Torresani, Bernt Schiele, Zeynep Akata

Few-shot learning aims to recognize novel classes from a few examples.

Few-Shot Image Classification, Few-Shot Learning, +7

Video Understanding as Machine Translation

no code implementations • 12 Jun 2020 • Bruno Korbar, Fabio Petroni, Rohit Girdhar, Lorenzo Torresani

With the advent of large-scale multimodal video datasets, especially sequences with audio or transcribed speech, there has been a growing interest in self-supervised learning of video representations.

Machine Translation, Metric Learning, +6

Self-Supervised Learning by Cross-Modal Audio-Video Clustering

1 code implementation • NeurIPS 2020 • Humam Alwassel, Dhruv Mahajan, Bruno Korbar, Lorenzo Torresani, Bernard Ghanem, Du Tran

To the best of our knowledge, XDC is the first self-supervised learning method that outperforms large-scale fully-supervised pretraining for action recognition on the same architecture.

Ranked #2 on Self-Supervised Action Recognition on UCF101 (finetuned)

Audio Classification, Clustering, +5

SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition

no code implementations • ICCV 2019 • Bruno Korbar, Du Tran, Lorenzo Torresani

We demonstrate that the computational cost of action recognition on untrimmed videos can be dramatically reduced by invoking recognition only on the most salient clips.

Ranked #1 on Action Recognition on miniSports

Action Recognition, Temporal Action Localization

Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization

no code implementations • NeurIPS 2018 • Bruno Korbar, Du Tran, Lorenzo Torresani

There is a natural correlation between the visual and auditive elements of a video.

Ranked #7 on Self-Supervised Audio Classification on ESC-50

Audio Classification, Self-Supervised Action Recognition, +2

Deep-Learning for Classification of Colorectal Polyps on Whole-Slide Images

no code implementations • 5 Mar 2017 • Bruno Korbar, Andrea M. Olofson, Allen P. Miraflor, Katherine M. Nicka, Matthew A. Suriawinata, Lorenzo Torresani, Arief A. Suriawinata, Saeed Hassanpour

In this work, we built an automatic image-understanding method that can accurately classify different types of colorectal polyps in whole-slide histology images to help pathologists with histopathological characterization and diagnosis of colorectal polyps.

General Classification, Whole Slide Images
