no code implementations • 6 Nov 2023 • Gabriela Ben Melech Stan, Diana Wofk, Estelle Aflalo, Shao-Yen Tseng, Zhipeng Cai, Michael Paulitsch, Vasudev Lal
Our models are fine-tuned from existing pretrained models on datasets containing panoramic/high-resolution RGB images, depth maps and captions.
2 code implementations • 26 Jul 2023 • Reiner Birkl, Diana Wofk, Matthias Müller
We release MiDaS v3.1 for monocular depth estimation, offering a variety of new models based on different encoder backbones.
2 code implementations • 18 May 2023 • Gabriela Ben Melech Stan, Diana Wofk, Scottie Fox, Alex Redden, Will Saxton, Jean Yu, Estelle Aflalo, Shao-Yen Tseng, Fabio Nonato, Matthias Müller, Vasudev Lal
This paper proposes a Latent Diffusion Model for 3D (LDM3D) that generates both an image and a depth map from a given text prompt, allowing users to create RGBD images directly from text.
1 code implementation • 21 Mar 2023 • Diana Wofk, René Ranftl, Matthias Müller, Vladlen Koltun
We evaluate on the TartanAir and VOID datasets, observing up to a 30% reduction in inverse RMSE with dense scale alignment relative to global alignment alone.
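The global-alignment baseline mentioned above is commonly implemented as a least-squares fit of a single scale and shift between predicted inverse depth and sparse metric measurements. Below is a minimal illustrative sketch of that standard baseline (function name and array layout are assumptions, not the paper's code):

```python
import numpy as np

def global_scale_shift(pred_idepth, sparse_idepth, mask):
    """Fit scale s and shift t minimizing ||s * pred + t - sparse||^2
    over the pixels where a sparse metric measurement is available.

    pred_idepth:   predicted inverse depth map, shape (H, W)
    sparse_idepth: sparse metric inverse depth, shape (H, W)
    mask:          boolean array, True where sparse_idepth is valid
    """
    p = pred_idepth[mask]
    g = sparse_idepth[mask]
    # Design matrix [pred, 1] so lstsq solves for (scale, shift) jointly.
    A = np.stack([p, np.ones_like(p)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return s * pred_idepth + t, (s, t)
```

Dense scale alignment, by contrast, allows the correction to vary spatially rather than applying one global (s, t) pair to the whole map.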
3 code implementations • 23 Feb 2023 • Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, Matthias Müller
Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains.
1 code implementation • 8 Mar 2019 • Diana Wofk, Fangchang Ma, Tien-Ju Yang, Sertac Karaman, Vivienne Sze
In this paper, we address the problem of fast depth estimation on embedded systems.