no code implementations • 24 Jan 2025 • Fei Xue, Sven Elflein, Laura Leal-Taixé, Qunjie Zhou
Extensive experiments validate that MATCHA consistently surpasses state-of-the-art methods across geometric, semantic, and temporal matching tasks, setting a new foundation for a unified approach for the fundamental correspondence problem in computer vision.
no code implementations • 24 Jan 2025 • Sven Elflein, Qunjie Zhou, Sérgio Agostinho, Laura Leal-Taixé
We present Light3R-SfM, a feed-forward, end-to-end learnable framework for efficient large-scale Structure-from-Motion (SfM) from unconstrained image collections.
no code implementations • 1 Dec 2024 • Yizhou Wang, Tim Meinhardt, Orcun Cetintas, Cheng-Yen Yang, Sameer Satish Pusegaonkar, Benjamin Missaoui, Sujit Biswas, Zheng Tang, Laura Leal-Taixé
Object perception from multi-view cameras is crucial for intelligent systems, particularly in indoor environments, e.g., warehouses, retail stores, and hospitals.
Ranked #1 on Multi-Object Tracking on Wildtrack (using extra training data)
no code implementations • 3 Sep 2024 • Jenny Seidenschwarz, Qunjie Zhou, Bardienus Duisterhof, Deva Ramanan, Laura Leal-Taixé
We target the challenge of online 2D and 3D point tracking from unposed monocular camera input, introducing Dynamic Online Monocular Reconstruction (DynOMo).
no code implementations • 29 Aug 2024 • Linyan Yang, Lukas Hoyer, Mark Weber, Tobias Fischer, Dengxin Dai, Laura Leal-Taixé, Marc Pollefeys, Daniel Cremers, Luc van Gool
Unsupervised Domain Adaptation (UDA) is the task of bridging the domain gap between a labeled source domain, e.g., synthetic data, and an unlabeled target domain.
no code implementations • 17 Apr 2024 • Orcun Cetintas, Tim Meinhardt, Guillem Brasó, Laura Leal-Taixé
Increasing the annotation efficiency of trajectory annotations from videos has the potential to enable the next generation of data-hungry tracking algorithms to thrive on large-scale datasets.
no code implementations • CVPR 2024 • Aysim Toker, Marvin Eisenberger, Daniel Cremers, Laura Leal-Taixé
In recent years, semantic segmentation has become a pivotal tool in processing and interpreting satellite imagery.
1 code implementation • 19 Mar 2024 • Aljoša Ošep, Tim Meinhardt, Francesco Ferroni, Neehar Peri, Deva Ramanan, Laura Leal-Taixé
We propose the SAL (Segment Anything in Lidar) method consisting of a text-promptable zero-shot model for segmenting and classifying any object in Lidar, and a pseudo-labeling engine that facilitates model training without manual supervision.
no code implementations • 14 Mar 2024 • Qunjie Zhou, Maxim Maximov, Or Litany, Laura Leal-Taixé
Significantly, we introduce NeRFMatch, an advanced 2D-3D matching function that capitalizes on the internal knowledge of NeRF learned via view synthesis.
no code implementations • CVPR 2024 • Jenny Seidenschwarz, Aljoša Ošep, Francesco Ferroni, Simon Lucey, Laura Leal-Taixé
Recent results suggest that heuristic-based clustering methods in conjunction with object trackers can be used to pseudo-label instances of moving objects and use these as supervisory signals to train 3D object detectors in Lidar data without manual supervision.
1 code implementation • 19 Oct 2023 • Abhinav Agarwalla, Xuhua Huang, Jason Ziglar, Francesco Ferroni, Laura Leal-Taixé, James Hays, Aljoša Ošep, Deva Ramanan
Our network is modular by design and optimized for all aspects of both the panoptic segmentation and tracking task.
1 code implementation • 16 Sep 2023 • Luca Scofano, Alessio Sampieri, Elisabeth Schiele, Edoardo De Matteis, Laura Leal-Taixé, Fabio Galasso
So far, only Mao et al. (NeurIPS'22) have addressed scene-aware global motion, cascading the prediction of future scene contact points and the global motion estimation.
Ranked #2 on Human Pose Forecasting on GTA-IM Dataset
no code implementations • 20 Jun 2023 • Maxim Maximov, Tim Meinhardt, Ismail Elezi, Zoe Papakipos, Caner Hazirbas, Cristian Canton Ferrer, Laura Leal-Taixé
To highlight the importance of privacy issues and motivate future research, we motivate and introduce the Pedestrian Dataset De-Identification (PDI) task.
1 code implementation • ICCV 2023 • Cristiano Saltori, Aljoša Ošep, Elisa Ricci, Laura Leal-Taixé
To answer this question, we design the first experimental setup for studying domain generalization (DG) for LiDAR semantic segmentation (DG-LSS).
no code implementations • CVPR 2023 • Marvin Eisenberger, Aysim Toker, Laura Leal-Taixé, Daniel Cremers
We present G-MSM (Graph-based Multi-Shape Matching), a novel unsupervised learning approach for non-rigid shape correspondence.
1 code implementation • CVPR 2023 • Orcun Cetintas, Guillem Brasó, Laura Leal-Taixé
Tracking objects over long videos effectively means solving a spectrum of problems, from short-term association for un-occluded objects to long-term association for objects that are occluded and then reappear in the scene.
Ranked #2 on Multiple Object Tracking on BDD100K test
1 code implementation • CVPR 2023 • Yang Liu, Shen Yan, Laura Leal-Taixé, James Hays, Deva Ramanan
We draw inspiration from human visual classification studies and propose generalizing augmentation with invariant transforms to soft augmentation, where the learning target softens non-linearly as a function of the degree of the transform applied to the sample: e.g., more aggressive image crop augmentations produce less confident learning targets.
1 code implementation • 19 Oct 2022 • Vladimir Fomenko, Ismail Elezi, Deva Ramanan, Laura Leal-Taixé, Aljoša Ošep
We then train our network to learn to classify each RoI, either as one of the known classes, seen in the source dataset, or one of the novel classes, with a long-tail distribution constraint on the class assignments, reflecting the natural frequency of classes in the real world.
Ranked #2 on Novel Object Detection on LVIS v1.0 val
1 code implementation • 14 Oct 2022 • Patrick Dendorfer, Vladimir Yugay, Aljoša Ošep, Laura Leal-Taixé
While we have significantly advanced short-term tracking performance, bridging longer occlusion gaps remains elusive: state-of-the-art object trackers only bridge less than 10% of occlusions longer than three seconds.
no code implementations • 11 Oct 2022 • Peter Kocsis, Peter Súkeník, Guillem Brasó, Matthias Nießner, Laura Leal-Taixé, Ismail Elezi
This allows us to improve the generalization of a CNN-based model without any increase in the number of weights at test time.
no code implementations • 29 Sep 2022 • Mariia Gladkova, Nikita Korobov, Nikolaus Demmel, Aljoša Ošep, Laura Leal-Taixé, Daniel Cremers
Direct methods have shown excellent performance in the applications of visual odometry and SLAM.
no code implementations • 3 Aug 2022 • Aleksandr Kim, Guillem Brasó, Aljoša Ošep, Laura Leal-Taixé
This allows our graph neural network to learn to effectively encode temporal and spatial interactions and fully leverage contextual and motion cues to obtain final scene interpretation by posing data association as edge classification.
1 code implementation • 22 Jul 2022 • Adrià Caelles, Tim Meinhardt, Guillem Brasó, Laura Leal-Taixé
To reason about all VIS subtasks jointly over multiple frames, we present temporal multi-scale deformable attention with instance-aware object queries.
1 code implementation • CVPR 2023 • Jenny Seidenschwarz, Guillem Brasó, Victor Castro Serrano, Ismail Elezi, Laura Leal-Taixé
For association, most models resort to motion and appearance cues, e.g., re-identification networks.
1 code implementation • CVPR 2022 • Marvin Eisenberger, Aysim Toker, Laura Leal-Taixé, Florian Bernard, Daniel Cremers
The Sinkhorn operator has recently experienced a surge of popularity in computer vision and related fields.
1 code implementation • CVPR 2022 • Neehar Peri, Jonathon Luiten, Mengtian Li, Aljoša Ošep, Laura Leal-Taixé, Deva Ramanan
Object detection and forecasting are fundamental components of embodied perception.
1 code implementation • 24 Mar 2022 • Qunjie Zhou, Sérgio Agostinho, Aljosa Osep, Laura Leal-Taixé
In this paper, we propose to go beyond the well-established approach to vision-based localization that relies on visual descriptor matching between a query image and a 3D point cloud.
1 code implementation • CVPR 2022 • Aysim Toker, Lukas Kondmann, Mark Weber, Marvin Eisenberger, Andrés Camero, Jingliang Hu, Ariadna Pregel Hoderlein, Çağlar Şenaras, Timothy Davis, Daniel Cremers, Giovanni Marchisio, Xiao Xiang Zhu, Laura Leal-Taixé
These observations are paired with pixel-wise monthly semantic segmentation labels of 7 land use and land cover (LULC) classes.
no code implementations • CVPR 2022 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
A benchmark that would allow us to perform an apples-to-apples comparison of existing efforts is a crucial first step towards advancing this important research field.
Ranked #3 on Open-World Video Segmentation on BURST-val (using extra training data)
1 code implementation • ICCV 2021 • Guillem Brasó, Nikita Kister, Laura Leal-Taixé
We introduce CenterGroup, an attention-based framework to estimate human poses from a set of identity-agnostic keypoints and person center predictions in an image.
Ranked #6 on Multi-Person Pose Estimation on MS COCO
no code implementations • 5 Oct 2021 • Lukas Kondmann, Aysim Toker, Sudipan Saha, Bernhard Schölkopf, Laura Leal-Taixé, Xiao Xiang Zhu
It uses this model to analyze differences in the pixel and its spatial context-based predictions in subsequent time periods for change detection.
no code implementations • 29 Sep 2021 • Marvin Eisenberger, Aysim Toker, Laura Leal-Taixé, Florian Bernard, Daniel Cremers
Our main contribution is deriving a simple and efficient algorithm that performs this backward pass in closed form.
1 code implementation • ICCV 2021 • Sérgio Agostinho, Aljoša Ošep, Alessio Del Bue, Laura Leal-Taixé
However, given the initial rotation estimate supplied by Kabsch, we show we can improve point correspondence learning during model training by extending the original optimization problem.
1 code implementation • 17 Jun 2021 • Matthijs Douze, Giorgos Tolias, Ed Pizzi, Zoë Papakipos, Lowik Chanussot, Filip Radenovic, Tomas Jenicek, Maxim Maximov, Laura Leal-Taixé, Ismail Elezi, Ondřej Chum, Cristian Canton Ferrer
This benchmark is used for the Image Similarity Challenge at NeurIPS'21 (ISC2021).
Ranked #1 on Image Similarity Detection on DISC21 dev
3 code implementations • 29 Apr 2021 • Aleksandr Kim, Aljoša Ošep, Laura Leal-Taixé
Multi-object tracking (MOT) enables mobile robots to perform well-informed motion planning and navigation by localizing surrounding objects in 3D space and time.
Ranked #1 on Multi-Object Tracking and Segmentation on KITTI MOTS
no code implementations • 22 Apr 2021 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
We hope to open a new front in multi-object tracking research that will bring us a step closer to intelligent systems that can operate safely in the real world.
1 code implementation • CVPR 2021 • Aysim Toker, Qunjie Zhou, Maxim Maximov, Laura Leal-Taixé
The goal of cross-view image-based geo-localization is to determine the location of a given street view image by matching it against a collection of geo-tagged satellite images.
no code implementations • 8 Mar 2021 • Patrick Wenzel, Torsten Schön, Laura Leal-Taixé, Daniel Cremers
Obstacle avoidance is a fundamental and challenging problem for autonomous navigation of mobile robots.
1 code implementation • CVPR 2021 • Mehmet Aygün, Aljoša Ošep, Mark Weber, Maxim Maximov, Cyrill Stachniss, Jens Behley, Laura Leal-Taixé
In this paper, we propose 4D panoptic LiDAR segmentation to assign a semantic class and a temporally-consistent instance ID to a sequence of 3D points.
Ranked #7 on 4D Panoptic Segmentation on SemanticKITTI
1 code implementation • 23 Feb 2021 • Mark Weber, Jun Xie, Maxwell Collins, Yukun Zhu, Paul Voigtlaender, Hartwig Adam, Bradley Green, Andreas Geiger, Bastian Leibe, Daniel Cremers, Aljoša Ošep, Laura Leal-Taixé, Liang-Chieh Chen
The task of assigning semantic classes and track identities to every pixel in a video is called video panoptic segmentation.
2 code implementations • 15 Feb 2021 • Jenny Seidenschwarz, Ismail Elezi, Laura Leal-Taixé
To this end, we propose an approach based on message passing networks that takes all the relations in a mini-batch into account.
Ranked #3 on Metric Learning on CARS196
1 code implementation • NeurIPS 2020 • Marvin Eisenberger, Aysim Toker, Laura Leal-Taixé, Daniel Cremers
We propose a novel unsupervised learning approach to 3D shape correspondence that builds a multiscale matching pipeline into a deep neural network.
no code implementations • 15 Oct 2020 • Patrick Dendorfer, Aljoša Ošep, Anton Milan, Konrad Schindler, Daniel Cremers, Ian Reid, Stefan Roth, Laura Leal-Taixé
We present MOTChallenge, a benchmark for single-camera Multiple Object Tracking (MOT) launched in late 2014, to collect existing and new data, and create a framework for the standardized evaluation of multiple object tracking methods.
2 code implementations • 2 Oct 2020 • Patrick Dendorfer, Aljoša Ošep, Laura Leal-Taixé
Inspired by human navigation, we model the task of trajectory prediction as an intuitive two-stage process: (i) goal estimation, which predicts the most likely target positions of the agent, followed by (ii) a routing module that estimates a set of plausible trajectories that route towards the estimated goal.
1 code implementation • 26 Aug 2020 • Sabarinath Mahadevan, Ali Athar, Aljoša Ošep, Sebastian Hennen, Laura Leal-Taixé, Bastian Leibe
On the other hand, 3D convolutional networks have been successfully applied for video classification tasks, but have not been leveraged as effectively to problems involving dense per-pixel interpretation of videos compared to their 2D convolutional counterparts and lag behind the aforementioned networks in terms of performance.
Ranked #15 on Unsupervised Video Object Segmentation on DAVIS 2016 val
1 code implementation • CVPR 2020 • Maxim Maximov, Kevin Galim, Laura Leal-Taixé
We are able to train our model completely on synthetic data and directly apply it to a wide range of real-world images.
Ranked #1 on Depth Estimation on NYU-Depth V2 (RMSE metric)
1 code implementation • CVPR 2020 • Maxim Maximov, Ismail Elezi, Laura Leal-Taixé
In many real-world scenarios like people tracking or action recognition, it is important to be able to process the data while taking care to protect people's identity.
1 code implementation • L4DC 2020 • Nathanael Bosch, Jan Achterhold, Laura Leal-Taixé, Jörg Stückler
We propose to learn a deep latent Gaussian process dynamics (DLGPD) model that learns low-dimensional system dynamics from environment interactions with visual observations.
1 code implementation • 19 Mar 2020 • Patrick Dendorfer, Hamid Rezatofighi, Anton Milan, Javen Shi, Daniel Cremers, Ian Reid, Stefan Roth, Konrad Schindler, Laura Leal-Taixé
The benchmark for Multiple Object Tracking, MOTChallenge, was launched with the goal of establishing a standardized evaluation of multiple object tracking methods.
1 code implementation • ECCV 2020 • Ali Athar, Sabarinath Mahadevan, Aljoša Ošep, Laura Leal-Taixé, Bastian Leibe
In this paper, we propose a different approach that is well-suited to a variety of tasks involving instance segmentation in videos.
Ranked #5 on Unsupervised Video Object Segmentation on DAVIS 2017 (val) (using extra training data)
no code implementations • 30 Jan 2020 • Hamid Rezatofighi, Tianyu Zhu, Roman Kaskman, Farbod T. Motlagh, Qinfeng Shi, Anton Milan, Daniel Cremers, Laura Leal-Taixé, Ian Reid
In our formulation, we define a likelihood for a set distribution represented by a) two discrete distributions defining the set cardinality and permutation variables, and b) a joint distribution over set elements with a fixed cardinality.
2 code implementations • 16 Dec 2019 • Guillem Brasó, Laura Leal-Taixé
Graphs offer a natural way to formulate Multiple Object Tracking (MOT) within the tracking-by-detection paradigm.
1 code implementation • 11 Dec 2019 • Kishan Sharma, Moritz Gold, Christian Zurbruegg, Laura Leal-Taixé, Jan Dirk Wegner
Our method results in an overall improvement in the count and size distribution prediction compared to the state-of-the-art instance segmentation method Mask R-CNN.
no code implementations • 25 Jul 2019 • Qadeer Khan, Patrick Wenzel, Daniel Cremers, Laura Leal-Taixé
The ability of deep learning models to generalize well across different scenarios depends primarily on the quality and quantity of annotated data.
13 code implementations • 23 Nov 2018 • Mengyu Chu, You Xie, Jonas Mayer, Laura Leal-Taixé, Nils Thuerey
Additionally, we propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution.
Ranked #1 on Video Super-Resolution on MSU Video Upscalers: Quality Enhancement (VMAF metric)
1 code implementation • 3 Jul 2018 • Patrick Wenzel, Qadeer Khan, Daniel Cremers, Laura Leal-Taixé
To this end, we propose to divide the task of vehicle control into two independent modules: a control module which is only trained on one weather condition for which labeled steering data is available, and a perception module which is used as an interface between new weather conditions and the fixed control module.
no code implementations • ICLR 2019 • S. Hamid Rezatofighi, Roman Kaskman, Farbod T. Motlagh, Qinfeng Shi, Daniel Cremers, Laura Leal-Taixé, Ian Reid
We demonstrate the validity of this new formulation on two relevant vision problems: object detection, for which our formulation outperforms state-of-the-art detectors such as Faster R-CNN and YOLO, and a complex CAPTCHA test, where we observe that, surprisingly, our set-based network acquired the ability to mimic arithmetic without any rules being coded.
no code implementations • ICCV 2019 • Maxim Maximov, Laura Leal-Taixé, Mario Fritz, Tobias Ritschel
Second, we demonstrate how another network can be used to map from an image or video frames to a DAM network to reproduce this appearance, without using a lengthy optimization such as stochastic gradient descent (learning-to-learn).
no code implementations • 18 Sep 2017 • Kevis-Kokitsi Maninis, Sergi Caelles, Yu-Hua Chen, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc van Gool
Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames.
Ranked #47 on Semi-Supervised Video Object Segmentation on DAVIS 2016
no code implementations • 23 May 2017 • Roberto Henschel, Laura Leal-Taixé, Daniel Cremers, Bodo Rosenhahn
In order to track all persons in a scene, the tracking-by-detection paradigm has proven to be a very effective approach.
Ranked #22 on Multi-Object Tracking on MOT16
no code implementations • CVPR 2018 • Emanuel Laude, Jan-Hendrik Lange, Jonas Schüpfer, Csaba Domokos, Laura Leal-Taixé, Frank R. Schmidt, Bjoern Andres, Daniel Cremers
This paper introduces a novel algorithm for transductive inference in higher-order MRFs, where the unary energies are parameterized by a variable classifier.
no code implementations • 10 Apr 2017 • Laura Leal-Taixé, Anton Milan, Konrad Schindler, Daniel Cremers, Ian Reid, Stefan Roth
Standardized benchmarks are crucial for the majority of computer vision applications.
5 code implementations • 4 Apr 2017 • Caner Hazirbas, Sebastian Georg Soyer, Maximilian Christian Staab, Laura Leal-Taixé, Daniel Cremers
Depth from focus (DFF) is one of the classical ill-posed inverse problems in computer vision.
no code implementations • ICCV 2017 • Florian Walch, Caner Hazirbas, Laura Leal-Taixé, Torsten Sattler, Sebastian Hilsenbeck, Daniel Cremers
In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes.
8 code implementations • CVPR 2017 • Sergi Caelles, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc van Gool
This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame.
no code implementations • 25 Jul 2016 • Roberto Henschel, Laura Leal-Taixé, Bodo Rosenhahn, Konrad Schindler
We present a novel formulation of the multiple object tracking problem which integrates low- and mid-level features.
no code implementations • 26 Apr 2016 • Laura Leal-Taixé, Cristian Canton Ferrer, Konrad Schindler
This paper introduces a novel approach to the task of data association within the context of pedestrian tracking, by introducing a two-stage learning scheme to match pairs of detections.
2 code implementations • 8 Apr 2015 • Laura Leal-Taixé, Anton Milan, Ian Reid, Stefan Roth, Konrad Schindler
We discuss the challenges of creating such a framework, collecting existing and new data, gathering state-of-the-art methods to be tested on the datasets, and finally creating a unified evaluation system.
no code implementations • 24 Nov 2014 • Laura Leal-Taixé
Multiple people tracking is a key problem for many applications such as surveillance, animation or car navigation, and a key input for tasks such as activity recognition.