Search Results for author: Mario Lučić

Found 7 papers, 1 paper with code

Video OWL-ViT: Temporally-consistent open-world localization in video

no code implementations · ICCV 2023 · Georg Heigold, Matthias Minderer, Alexey Gritsenko, Alex Bewley, Daniel Keysers, Mario Lučić, Fisher Yu, Thomas Kipf

Our model is end-to-end trainable on video data and enjoys improved temporal consistency compared to tracking-by-detection baselines, while retaining the open-world capabilities of the backbone detector.

Object Localization

End-to-End Spatio-Temporal Action Localisation with Video Transformers

no code implementations · 24 Apr 2023 · Alexey Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lučić, Cordelia Schmid, Anurag Arnab

The most performant spatio-temporal action localisation models use external person proposals and complex external memory banks.

Ranked #1 on Action Recognition on AVA v2.1 (using extra training data)

Action Detection · Action Recognition · +1

Object Scene Representation Transformer

no code implementations · 14 Jun 2022 · Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf

A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.

Novel View Synthesis · Representation Learning

ViViT: A Video Vision Transformer

6 code implementations · ICCV 2021 · Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lučić, Cordelia Schmid

We present pure-transformer based models for video classification, drawing upon the recent success of such models in image classification.

Ranked #8 on Action Classification on Moments in Time (Top 5 Accuracy metric, using extra training data)

Action Classification · Action Recognition · +4