We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in inverse RMSE with dense scale alignment relative to global alignment alone.
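To make the baseline concrete, here is a minimal sketch of what a global alignment step can look like. It assumes that "global alignment" means fitting one scale and one shift for the whole prediction by least squares against a handful of metric measurements; that reading, the function names, and the toy numbers are all assumptions for illustration, not the evaluated implementation. Dense (per-pixel) scale alignment is not shown.

```python
import math


def fit_scale_shift(pred, target):
    """Closed-form least-squares fit of s, t minimizing sum((s*p + t - g)^2)."""
    n = len(pred)
    if n < 2 or n != len(target):
        raise ValueError("need two or more paired samples")
    mean_p = sum(pred) / n
    mean_g = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant; scale is undetermined")
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, target))
    scale = cov / var_p
    shift = mean_g - scale * mean_p
    return scale, shift


def rmse(a, b):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical values: relative inverse depth at pixels where a sparse
# metric measurement is available, and the matching metric inverse depth.
pred_inv = [0.1, 0.2, 0.4, 0.8]
metric_inv = [2.0 * p + 0.5 for p in pred_inv]

scale, shift = fit_scale_shift(pred_inv, metric_inv)
aligned = [scale * p + shift for p in pred_inv]
error_after_alignment = rmse(aligned, metric_inv)
```

Because a single scale and shift are shared by every pixel, this kind of fit cannot correct errors that vary across the image, which is the gap a dense alignment stage is meant to close.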
Semantic segmentation models struggle to generalize in the presence of domain shift.
We present LSeg, a novel model for language-driven semantic image segmentation.
Ranked #1 on Few-Shot Semantic Segmentation on FSS-1000
Indeed, the subtasks are executed sequentially, leading to increased processing latency and a compounding of errors through the pipeline.
We introduce dense vision transformers, an architecture that leverages vision transformers in place of convolutional networks as a backbone for dense prediction tasks.
Ranked #12 on Semantic Segmentation on PASCAL Context
In this paper, we propose to learn a sensorimotor policy that enables an autonomous quadrotor to fly extreme acrobatic maneuvers with only onboard sensing and computation.
In particular, we propose a robust training objective that is invariant to changes in depth range and scale, advocate the use of principled multi-objective learning to combine data from different sources, and highlight the importance of pretraining encoders on auxiliary tasks.
Ranked #2 on Depth Estimation on eBDtheque
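One way to obtain a training objective that ignores depth range and scale is to normalize both prediction and ground truth before comparing them. The sketch below assumes a robust normalization (subtract the median, divide by the mean absolute deviation) followed by a mean absolute error; the helper names and this particular choice of normalization are assumptions for illustration, not necessarily the objective used in the paper.

```python
from statistics import median


def normalize(depth):
    """Remove shift (median) and scale (mean absolute deviation)."""
    center = median(depth)
    spread = sum(abs(d - center) for d in depth) / len(depth)
    if spread == 0:
        raise ValueError("constant map; scale is undetermined")
    return [(d - center) / spread for d in depth]


def scale_shift_invariant_loss(pred, target):
    """Mean absolute error between the normalized maps."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    p = normalize(pred)
    g = normalize(target)
    return sum(abs(a - b) for a, b in zip(p, g)) / len(p)


# Flattened toy "depth maps" (hypothetical values).
pred = [1.0, 2.0, 4.0, 7.0, 9.0]
same_up_to_scale_shift = [3.0 * d + 7.0 for d in pred]
different_shape = [1.0, 9.0, 2.0, 7.0, 4.0]

loss_equivalent = scale_shift_invariant_loss(pred, same_up_to_scale_shift)
loss_different = scale_shift_invariant_loss(pred, different_shape)
```

With this normalization the loss is unchanged by any positive rescaling or offset of either input, which is what allows datasets with incompatible depth ranges to be mixed in one training run.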
In this work, we propose to learn to reconstruct intensity images from event streams directly from data instead of relying on any hand-crafted priors.
Convolutional networks for single-view object reconstruction have shown impressive performance and have become a popular subject of research.
Ranked #1 on 3D Reconstruction on 300W
Since the output of event cameras is fundamentally different from that of conventional cameras, it is commonly accepted that they require the development of specialized algorithms to accommodate the particular nature of events.
The Fields of Experts (FoE) image prior model, a filter-based higher-order Markov random field (MRF) model, has been shown to be effective for many image restoration problems.
Inpainting-based image compression approaches, especially linear and non-linear diffusion models, are an active research topic for lossy image compression.
It is now well known that Markov random fields (MRFs) are particularly effective for modeling image priors in low-level vision.
Numerical experiments show that our trained models clearly outperform existing analysis operator learning approaches and are on par with state-of-the-art image denoising algorithms.