1 code implementation • 23 Dec 2017 • Aljoša Ošep, Paul Voigtlaender, Jonathon Luiten, Stefan Breuers, Bastian Leibe
We explore object discovery and detector adaptation based on unlabeled video sequences captured from a mobile platform.
5 code implementations • 24 Jul 2018 • Jonathon Luiten, Paul Voigtlaender, Bastian Leibe
We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground truth annotations.
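Progress on this task is typically reported with the region similarity measure J, the intersection-over-union between predicted and ground-truth masks, averaged over annotated frames. A minimal sketch (illustrative helper, not the authors' code):

```python
import numpy as np

def region_similarity(pred, gt):
    """Jaccard index (J): intersection-over-union of two binary masks.

    `pred` and `gt` are boolean arrays of the same shape; VOS benchmarks
    such as DAVIS report the mean of this score over all annotated frames.
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:  # both masks empty: conventionally score 1
        return 1.0
    return np.logical_and(pred, gt).sum() / union
```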
no code implementations • 19 Sep 2018 • Aljoša Ošep, Paul Voigtlaender, Jonathon Luiten, Stefan Breuers, Bastian Leibe
We propose to leverage a generic object tracker in order to perform object mining in large-scale unlabeled videos, captured in a realistic automotive setting.
1 code implementation • 26 Jan 2019 • Aljoša Ošep, Paul Voigtlaender, Mark Weber, Jonathon Luiten, Bastian Leibe
Many high-level video understanding methods require input in the form of object proposals.
no code implementations • CVPR 2019 • Paul Voigtlaender, Michael Krause, Aljoša Ošep, Jonathon Luiten, Berin Balachandar Gnana Sekar, Andreas Geiger, Bastian Leibe
This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS).
Ranked #6 on Multi-Object Tracking on MOTS20
no code implementations • 28 Feb 2019 • Aljoša Ošep, Paul Voigtlaender, Jonathon Luiten, Stefan Breuers, Bastian Leibe
This paper addresses the problem of object discovery from unlabeled driving videos captured in a realistic automotive setting.
no code implementations • 9 Apr 2019 • Paul Voigtlaender, Jonathon Luiten, Bastian Leibe
Following this paradigm, we present BoLTVOS (Box-Level Tracking for VOS), which consists of an R-CNN detector conditioned on the first-frame bounding box to detect the object of interest, a temporal consistency rescoring algorithm, and a Box2Seg network that converts bounding boxes to segmentation masks.
1 code implementation • 30 Sep 2019 • Jonathon Luiten, Tobias Fischer, Bastian Leibe
Object tracking and 3D reconstruction are often performed together, with tracking used as input for reconstruction.
no code implementations • 2 Nov 2019 • Mark Weber, Jonathon Luiten, Bastian Leibe
We present a novel end-to-end single-shot method that segments countable object instances (things) as well as background regions (stuff) into a non-overlapping panoptic segmentation at almost video frame rate.
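Panoptic segmentation is conventionally scored with Panoptic Quality (PQ). A minimal sketch of that published metric (function name and inputs are illustrative, not this paper's evaluation code):

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Panoptic Quality (PQ):
    PQ = sum(IoU over matched segment pairs) / (|TP| + 0.5*|FP| + 0.5*|FN|).

    `matched_ious` lists the IoU of each matched (IoU > 0.5) pair of
    predicted and ground-truth segments; unmatched predictions are FP,
    unmatched ground-truth segments are FN.
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    return sum(matched_ious) / denom if denom > 0 else 0.0
```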
1 code implementation • CVPR 2020 • Paul Voigtlaender, Jonathon Luiten, Philip H. S. Torr, Bastian Leibe
We present Siam R-CNN, a Siamese re-detection architecture which unleashes the full power of two-stage object detection approaches for visual object tracking.
Ranked #5 on Object Tracking on COESOT
1 code implementation • 15 Jan 2020 • Jonathon Luiten, Idil Esen Zulfikar, Bastian Leibe
UnOVOST performs competitively with many semi-supervised video object segmentation algorithms, even though it is not given any input as to which objects should be tracked and segmented.
5 code implementations • 16 Sep 2020 • Jonathon Luiten, Aljoša Ošep, Patrick Dendorfer, Philip Torr, Andreas Geiger, Laura Leal-Taixé, Bastian Leibe
Multi-Object Tracking (MOT) has been notoriously difficult to evaluate.
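The HOTA metric introduced by this work addresses that difficulty by decomposing tracking accuracy into a detection term and an association term. A sketch of the core definition at a single localization threshold α, following the published formulation:

```latex
\mathrm{HOTA}_\alpha
  = \sqrt{\frac{\sum_{c \in \mathrm{TP}} \mathcal{A}(c)}
               {|\mathrm{TP}| + |\mathrm{FN}| + |\mathrm{FP}|}}
  = \sqrt{\mathrm{DetA}_\alpha \cdot \mathrm{AssA}_\alpha},
\qquad
\mathcal{A}(c) = \frac{|\mathrm{TPA}(c)|}
                      {|\mathrm{TPA}(c)| + |\mathrm{FNA}(c)| + |\mathrm{FPA}(c)|}
```

Here TP/FN/FP are detection matches at IoU threshold α, and TPA/FNA/FPA count association agreements for each true-positive detection c; the final HOTA score averages HOTA over a range of thresholds α.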
no code implementations • 22 Apr 2021 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
We hope to open a new front in multi-object tracking research that will bring us a step closer to intelligent systems that can operate safely in the real world.
1 code implementation • CVPR 2022 • Ali Athar, Jonathon Luiten, Alexander Hermans, Deva Ramanan, Bastian Leibe
Existing state-of-the-art methods for Video Object Segmentation (VOS) learn low-level pixel-to-pixel correspondences between frames to propagate object masks across video.
no code implementations • CVPR 2022 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
A benchmark that would allow us to perform an apples-to-apples comparison of existing efforts is a crucial first step towards advancing this important research field.
Ranked #3 on Open-World Video Segmentation on BURST-val (using extra training data)
1 code implementation • CVPR 2022 • Neehar Peri, Jonathon Luiten, Mengtian Li, Aljoša Ošep, Laura Leal-Taixé, Deva Ramanan
Object detection and forecasting are fundamental components of embodied perception.
1 code implementation • 1 Jun 2022 • Ali Athar, Jonathon Luiten, Alexander Hermans, Deva Ramanan, Bastian Leibe
Recently, "Masked Attention" was proposed in which a given object representation only attends to those image pixel features for which the segmentation mask of that object is active.
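Under stated assumptions (a single object query, NumPy shapes chosen purely for illustration), the masked-attention mechanism described above can be sketched as attention whose logits are suppressed wherever the object's predicted mask is inactive:

```python
import numpy as np

def masked_attention(query, keys, values, mask):
    """Attention in which an object query attends only to pixel features
    whose entry in `mask` is active — a sketch of the "Masked Attention"
    idea; names and shapes are illustrative, not the paper's code.

    query:  (d,)        one object query vector
    keys:   (n, d)      per-pixel key features
    values: (n, d)      per-pixel value features
    mask:   (n,) bool   predicted segmentation mask for this object
                        (must have at least one active pixel)
    """
    logits = keys @ query / np.sqrt(query.shape[0])
    logits = np.where(mask, logits, -np.inf)  # inactive pixels get zero weight
    weights = np.exp(logits - logits.max())   # stable softmax over active pixels
    weights /= weights.sum()
    return weights @ values
```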
1 code implementation • 25 Sep 2022 • Ali Athar, Jonathon Luiten, Paul Voigtlaender, Tarasha Khurana, Achal Dave, Bastian Leibe, Deva Ramanan
Multiple existing benchmarks involve tracking and segmenting objects in video, e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), but there is little interaction between them due to the use of disparate benchmark datasets and metrics (e.g., J&F, mAP, sMOTSA).
Ranked #4 on Long-tail Video Object Segmentation on BURST-val (using extra training data)
1 code implementation • CVPR 2023 • Ali Athar, Alexander Hermans, Jonathon Luiten, Deva Ramanan, Bastian Leibe
A single TarViS model can be trained jointly on a collection of datasets spanning different tasks, and can hot-swap between tasks during inference without any task-specific retraining.
Ranked #2 on Video Panoptic Segmentation on KITTI-STEP (using extra training data)
no code implementations • 18 Aug 2023 • Jonathon Luiten, Georgios Kopanas, Bastian Leibe, Deva Ramanan
We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements.
1 code implementation • 4 Dec 2023 • Nikhil Keetha, Jay Karhade, Krishna Murthy Jatavallabhula, Gengshan Yang, Sebastian Scherer, Deva Ramanan, Jonathon Luiten
Dense simultaneous localization and mapping (SLAM) is crucial for robotics and augmented reality applications.