no code implementations • 25 Feb 2025 • Botao Ye, Sifei Liu, Xueting Li, Marc Pollefeys, Ming-Hsuan Yang
Large diffusion models demonstrate remarkable zero-shot capabilities in novel view synthesis from a single image.
no code implementations • 15 Dec 2024 • Mariam Hassan, Sebastian Stapf, Ahmad Rahimi, Pedro M B Rezende, Yasaman Haghighi, David Brüggemann, Isinsu Katircioglu, Lin Zhang, Xiaoran Chen, Suman Saha, Marco Cannici, Elie Aljalbout, Botao Ye, Xi Wang, Aram Davtyan, Mathieu Salzmann, Davide Scaramuzza, Marc Pollefeys, Paolo Favaro, Alexandre Alahi
We present GEM, a Generalizable Ego-vision Multimodal world model that predicts future frames using a reference frame, sparse features, human poses, and ego-trajectories.
1 code implementation • 31 Oct 2024 • Botao Ye, Sifei Liu, Haofei Xu, Xueting Li, Marc Pollefeys, Ming-Hsuan Yang, Songyou Peng
We utilize the reconstructed 3D Gaussians for novel view synthesis and pose estimation tasks and propose a two-stage coarse-to-fine pipeline for accurate pose estimation.
1 code implementation • CVPR 2023 • Botao Ye, Sifei Liu, Xueting Li, Ming-Hsuan Yang
In this work, we introduce a self-supervised super-plane constraint by exploring the free geometry cues from the predicted surface, which can further regularize the reconstruction of plane regions without any other ground truth annotations.
1 code implementation • 22 Mar 2022 • Botao Ye, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
The current popular two-stream, two-stage tracking framework extracts the template and search-region features separately and then performs relation modeling; as a result, the extracted features lack awareness of the target and have limited target-background discriminability.
Ranked #4 on Visual Tracking on TNL2K
no code implementations • CVPR 2022 • Qing Lian, Botao Ye, Ruijia Xu, Weilong Yao, Tong Zhang
In addition, we demonstrate that the augmentation methods are well suited for semi-supervised training and cross-dataset generalization.