no code implementations • 8 Mar 2025 • Anh Thai, Songyou Peng, Kyle Genova, Leonidas Guibas, Thomas Funkhouser
Language-guided 3D scene understanding is important for advancing applications in robotics, AR/VR, and human-computer interaction, enabling models to comprehend and interact with 3D environments through natural language.
1 code implementation • 18 Dec 2024 • Haotong Lin, Sida Peng, Jingxiao Chen, Songyou Peng, Jiaming Sun, Minghuan Liu, Hujun Bao, Jiashi Feng, Xiaowei Zhou, Bingyi Kang
Prompts play a critical role in unleashing the power of language and vision foundation models for specific tasks.
1 code implementation • 31 Oct 2024 • Botao Ye, Sifei Liu, Haofei Xu, Xueting Li, Marc Pollefeys, Ming-Hsuan Yang, Songyou Peng
We utilize the reconstructed 3D Gaussians for novel view synthesis and pose estimation tasks and propose a two-stage coarse-to-fine pipeline for accurate pose estimation.
2 code implementations • 17 Oct 2024 • Haofei Xu, Songyou Peng, Fangjinhua Wang, Hermann Blum, Daniel Barath, Andreas Geiger, Marc Pollefeys
Gaussian splatting and single/multi-view depth estimation are typically studied in isolation.
1 code implementation • 11 Jul 2024 • Jonas Kulhanek, Songyou Peng, Zuzana Kukelova, Marc Pollefeys, Torsten Sattler
While the field of 3D scene reconstruction is dominated by NeRFs due to their photorealistic quality, 3D Gaussian Splatting (3DGS) has recently emerged, offering similar quality with real-time rendering speeds.
no code implementations • 30 May 2024 • Gonca Yilmaz, Songyou Peng, Marc Pollefeys, Francis Engelmann, Hermann Blum
However, this flexibility comes with a trade-off: fully-supervised closed-set methods still outperform OVS methods on base classes, that is, on the classes on which they have been explicitly trained.
Ranked #5 on Open Vocabulary Semantic Segmentation on ADE20K-150
3D Instance Segmentation, 3D Open-Vocabulary Instance Segmentation, +8
1 code implementation • CVPR 2024 • Weining Ren, Zihan Zhu, Boyang Sun, Jiaqi Chen, Marc Pollefeys, Songyou Peng
Neural Radiance Fields (NeRFs) have shown remarkable success in synthesizing photorealistic views from multi-view images of static scenes, but face challenges in dynamic, real-world environments with distractors like moving objects, shadows, and lighting changes.
no code implementations • CVPR 2024 • Lei Li, Songyou Peng, Zehao Yu, Shaohui Liu, Rémi Pautrat, Xiaochuan Yin, Marc Pollefeys
Real-world objects and environments are predominantly composed of edge features, including straight lines and curves.
1 code implementation • 16 May 2024 • Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H Torr, Marc Pollefeys, Matthias Nießner, Ian D Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu
Hence, with this paper, we aim to chart a course for future research that explores and expands the capabilities of 3D-LLMs in understanding and interacting with the complex 3D world.
no code implementations • 2 May 2024 • Guangming Wang, Lei Pan, Songyou Peng, Shaohui Liu, Chenfeng Xu, Yanzi Miao, Wei Zhan, Masayoshi Tomizuka, Marc Pollefeys, Hesheng Wang
Meticulous 3D environment representations have been a longstanding goal in computer vision and robotics fields.
no code implementations • 14 Mar 2024 • Haiwen Huang, Songyou Peng, Dan Zhang, Andreas Geiger
Names are essential to both human cognition and vision-language models.
no code implementations • 23 Feb 2024 • Francis Engelmann, Ayca Takmaz, Jonas Schult, Elisabetta Fedele, Johanna Wald, Songyou Peng, Xi Wang, Or Litany, Siyu Tang, Federico Tombari, Marc Pollefeys, Leonidas Guibas, Hongbo Tian, Chunjie Wang, Xiaosheng Yan, Bingwen Wang, Xuanyang Zhang, Xiao Liu, Phuc Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham, Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
This report provides an overview of the challenge hosted at the OpenSUN3D Workshop on Open-Vocabulary 3D Scene Understanding held in conjunction with ICCV 2023.
no code implementations • 28 Dec 2023 • Rui Huang, Songyou Peng, Ayca Takmaz, Federico Tombari, Marc Pollefeys, Shiji Song, Gao Huang, Francis Engelmann
Therefore, we explore the use of image segmentation foundation models to automatically generate training labels for 3D segmentation.
no code implementations • 20 Dec 2023 • Junru Lin, Asen Nachkov, Songyou Peng, Luc van Gool, Danda Pani Paudel
In this work, we address the challenge of deploying Neural Radiance Fields (NeRFs) in Simultaneous Localization and Mapping (SLAM) when depth information is unavailable, relying solely on RGB inputs.
1 code implementation • 10 Nov 2023 • Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, Longhui Yu, Haiwen Feng, Zhen Liu, Juyeon Heo, Songyou Peng, Yandong Wen, Michael J. Black, Adrian Weller, Bernhard Schölkopf
We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT).
no code implementations • 7 Feb 2023 • Zihan Zhu, Songyou Peng, Viktor Larsson, Zhaopeng Cui, Martin R. Oswald, Andreas Geiger, Marc Pollefeys
Neural implicit representations have recently become popular in simultaneous localization and mapping (SLAM), especially in dense visual SLAM.
1 code implementation • CVPR 2023 • Songyou Peng, Kyle Genova, Chiyu "Max" Jiang, Andrea Tagliasacchi, Marc Pollefeys, Thomas Funkhouser
Traditional 3D scene understanding approaches rely on labeled 3D datasets to train a model for a single task with supervision.
Ranked #7 on 3D Open-Vocabulary Instance Segmentation on Replica
3D Open-Vocabulary Instance Segmentation, 3D Semantic Segmentation, +1
no code implementations • 26 Nov 2022 • Lixiang Lin, Songyou Peng, Qijun Gan, Jianke Zhu
We propose an approach for optimizing high-quality clothed human body shapes in minutes, using multi-view posed images.
no code implementations • ICCV 2023 • Shengqu Cai, Eric Ryan Chan, Songyou Peng, Mohamad Shahbazi, Anton Obukhov, Luc van Gool, Gordon Wetzstein
Scene extrapolation -- the idea of generating novel views by flying into a given image -- is a promising, yet challenging task.
Ranked #1 on Perpetual View Generation on LHQ
1 code implementation • 7 Sep 2022 • Lei Li, Zhizheng Liu, Weining Ren, Liudi Yang, Fangjinhua Wang, Marc Pollefeys, Songyou Peng
3D textured shape recovery from partial scans is crucial for many real-world applications.
1 code implementation • 1 Jun 2022 • Zehao Yu, Songyou Peng, Michael Niemeyer, Torsten Sattler, Andreas Geiger
Motivated by recent advances in the area of monocular geometry prediction, we systematically explore the utility these cues provide for improving neural implicit surface reconstruction.
1 code implementation • CVPR 2022 • Zihan Zhu, Songyou Peng, Viktor Larsson, Weiwei Xu, Hujun Bao, Zhaopeng Cui, Martin R. Oswald, Marc Pollefeys
Neural implicit representations have recently shown encouraging results in various domains, including promising progress in simultaneous localization and mapping (SLAM).
2 code implementations • NeurIPS 2021 • Songyou Peng, Chiyu "Max" Jiang, Yiyi Liao, Michael Niemeyer, Marc Pollefeys, Andreas Geiger
However, the implicit nature of neural implicit representations results in slow inference time and requires careful initialization.
2 code implementations • ICCV 2021 • Michael Oechsle, Songyou Peng, Andreas Geiger
At the same time, neural radiance fields have revolutionized novel view synthesis.
4 code implementations • ICCV 2021 • Christian Reiser, Songyou Peng, Yiyi Liao, Andreas Geiger
NeRF synthesizes novel views of a scene with unprecedented quality by fitting a neural radiance field to RGB images.
1 code implementation • 11 Nov 2020 • Stefan Lionar, Daniil Emtsev, Dusan Svilarkovic, Songyou Peng
To further exploit translational equivariance, convolutional neural networks are applied to process the plane features.
Ranked #4 on 3D Reconstruction on ShapeNet
6 code implementations • ECCV 2020 • Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, Andreas Geiger
Recently, implicit neural representations have gained popularity for learning-based 3D reconstruction.
1 code implementation • CVPR 2020 • Shaohui Liu, Yinda Zhang, Songyou Peng, Boxin Shi, Marc Pollefeys, Zhaopeng Cui
We propose a differentiable sphere tracing algorithm to bridge the gap between inverse graphics methods and the recently proposed deep learning based implicit signed distance function.
no code implementations • 7 May 2019 • Xiaoman Zhang, Ziyuan Zhao, Cen Chen, Songyou Peng, Min Wu, Zhongyao Cheng, Singee Teo, Le Zhang, Zeng Zeng
In this study, we apply a deep neural network to skeletal bone age forecasting and explore the use of specifically combined joint images to improve accuracy compared with whole-hand images.
1 code implementation • 12 Mar 2019 • Ziyuan Zhao, Xiaoman Zhang, Cen Chen, Wei Li, Songyou Peng, Jie Wang, Xulei Yang, Le Zhang, Zeng Zeng
Segmentation stands at the forefront of many high-level vision tasks.
1 code implementation • 2 Mar 2019 • Paola Ardón, Kaisar Kushibar, Songyou Peng
Providing robust solutions for tasks such as indoor environment mapping, self-localisation and object recognition is essential to making robots more autonomous and, hence, more human-like.
Robotics
1 code implementation • 21 Nov 2018 • Le Zhang, Songyou Peng, Stefan Winkler
Apparent personality and emotion analysis are both central to affective computing.
1 code implementation • ICCV 2019 • Songyou Peng, Peter Sturm
It is well known that the accuracy of a calibration depends strongly on the choice of camera poses from which images of a calibration object are acquired.
1 code implementation • 26 Sep 2018 • Bjoern Haefner, Songyou Peng, Alok Verma, Yvain Quéau, Daniel Cremers
This study explores the use of photometric techniques (shape-from-shading and uncalibrated photometric stereo) for upsampling the low-resolution depth map from an RGB-D sensor to the higher resolution of the companion RGB image.
1 code implementation • 2 May 2018 • Songyou Peng, Le Zhang, Yutong Ban, Meng Fang, Stefan Winkler
In this paper, we comprehensively describe the methodology of our submissions to the One-Minute Gradual-Emotion Behavior Challenge 2018.
1 code implementation • 1 Aug 2017 • Songyou Peng, Bjoern Haefner, Yvain Quéau, Daniel Cremers
A novel depth super-resolution approach for RGB-D sensors is presented.