1 code implementation • 16 May 2024 • Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H. Torr, Marc Pollefeys, Matthias Nießner, Ian D. Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu
Hence, with this paper, we aim to chart a course for future research that explores and expands the capabilities of 3D-LLMs in understanding and interacting with the complex 3D world.
1 code implementation • 2 May 2024 • Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen, Ruotong Liao, Yan Di, Nassir Navab, Federico Tombari, Benjamin Busam
The scheme ensures that the denoising processes are influenced by a holistic understanding of the scene graph, facilitating the generation of globally coherent scenes.
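As a rough illustration of what "denoising influenced by a holistic view of the scene graph" could look like, the sketch below pools node embeddings into a single scene-level context vector and feeds it into a per-object denoiser. All module names and dimensions are hypothetical stand-ins, not the paper's actual architecture.

```python
# Hypothetical sketch: conditioning a per-object denoising step on a pooled,
# scene-level summary of the scene graph. Names and dimensions are illustrative.
import torch
import torch.nn as nn

class GraphConditionedDenoiser(nn.Module):
    def __init__(self, node_dim=128, latent_dim=64):
        super().__init__()
        # Summarize all node embeddings into one holistic scene context vector.
        self.graph_pool = nn.Sequential(nn.Linear(node_dim, node_dim), nn.ReLU())
        # Predict the noise for each object latent, conditioned on that context.
        self.denoise = nn.Sequential(
            nn.Linear(latent_dim + node_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, noisy_latents, node_embeddings):
        # noisy_latents: (num_objects, latent_dim); node_embeddings: (num_nodes, node_dim)
        context = self.graph_pool(node_embeddings).mean(dim=0)   # (node_dim,)
        context = context.expand(noisy_latents.shape[0], -1)     # broadcast to all objects
        return self.denoise(torch.cat([noisy_latents, context], dim=-1))

# Usage: predict noise for 5 object latents given a 12-node scene graph.
model = GraphConditionedDenoiser()
eps = model(torch.randn(5, 64), torch.randn(12, 128))
print(eps.shape)  # torch.Size([5, 64])
```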
1 code implementation • CVPR 2024 • Dave Zhenyu Chen, Haoxuan Li, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner
We propose SceneTex, a novel method for effectively generating high-quality and style-consistent textures for indoor scenes using depth-to-image diffusion priors.
1 code implementation • 30 Oct 2023 • Mohammed Munzer Dwedari, Matthias Nießner, Dave Zhenyu Chen
3D question answering is a nascent area of 3D vision-language research that remains largely unexplored.
no code implementations • ICCV 2023 • Dave Zhenyu Chen, Yawar Siddiqui, Hsin-Ying Lee, Sergey Tulyakov, Matthias Nießner
We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from given text prompts.
no code implementations • ICCV 2023 • Dave Zhenyu Chen, Ronghang Hu, Xinlei Chen, Matthias Nießner, Angel X. Chang
Performing 3D dense captioning and visual grounding requires a shared understanding of the underlying multimodal relationships.
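One way to picture such a shared understanding is a single fusion backbone feeding two task heads, one for grounding and one for captioning. The sketch below is only an illustration of that idea; the module names, layer sizes, and vocabulary are assumptions, not the paper's design.

```python
# Hypothetical sketch: one shared object-and-text encoder feeding two task heads,
# so dense captioning and visual grounding rely on the same multimodal features.
import torch
import torch.nn as nn

class SharedBackbone(nn.Module):
    def __init__(self, hidden=256, vocab=3000):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # shared fusion module
        self.ground_head = nn.Linear(hidden, 1)       # per-object grounding score
        self.caption_head = nn.Linear(hidden, vocab)  # per-object token logits

    def forward(self, obj_feats, txt_feats):
        # obj_feats: (B, num_objects, hidden); txt_feats: (B, num_tokens, hidden)
        fused = self.encoder(torch.cat([obj_feats, txt_feats], dim=1))
        obj_part = fused[:, :obj_feats.shape[1]]
        return self.ground_head(obj_part).squeeze(-1), self.caption_head(obj_part)

model = SharedBackbone()
scores, logits = model(torch.randn(2, 16, 256), torch.randn(2, 20, 256))
print(scores.shape, logits.shape)  # torch.Size([2, 16]) torch.Size([2, 16, 3000])
```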
2 code implementations • 24 Aug 2022 • Rui Song, Dai Liu, Dave Zhenyu Chen, Andreas Festag, Carsten Trinitis, Martin Schulz, Alois Knoll
In federated learning, all networked clients contribute cooperatively to training a shared model.
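For readers unfamiliar with that setting, here is a minimal federated-averaging sketch showing how clients train locally and share only model parameters. It is a generic illustration of the federated setup, not the specific approach of the paper above, and all names in it are illustrative.

```python
# Minimal federated-averaging sketch in NumPy: each client trains locally on its
# own data and only model parameters are shared; the server averages them.
# Purely illustrative of cooperative client contribution, not the paper's method.
import numpy as np

def local_update(w, X, y, lr=0.1, steps=20):
    # One client's local training: a few gradient steps of linear regression.
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_round(global_w, clients):
    # Each client starts from the current global model and trains locally;
    # the server aggregates the results by simple (unweighted) averaging.
    local_weights = [local_update(global_w.copy(), X, y) for X, y in clients]
    return np.mean(local_weights, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(10):
    w = federated_round(w, clients)
print(w)  # approaches [2.0, -1.0] without any client sharing its raw data
```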
no code implementations • 2 Dec 2021 • Dave Zhenyu Chen, Qirui Wu, Matthias Nießner, Angel X. Chang
Our D3Net unifies 3D dense captioning and visual grounding in a self-critical manner.
no code implementations • CVPR 2021 • Dave Zhenyu Chen, Ali Gholami, Matthias Nießner, Angel X. Chang
We introduce the task of dense captioning in 3D scans from commodity RGB-D sensors.
3 code implementations • ECCV 2020 • Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner
We introduce the task of 3D object localization in RGB-D scans using natural language descriptions.
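To make the task concrete, the toy sketch below embeds a referring description and a set of 3D object proposals into a joint space and picks the proposal that best matches the text. Every component here is a hypothetical stand-in for illustration, not the paper's model.

```python
# Toy sketch of language-based 3D object localization: embed the description,
# embed each object proposal's point features, and select the proposal whose
# embedding best matches the text. All modules here are hypothetical stand-ins.
import torch
import torch.nn as nn

class ToyGrounder(nn.Module):
    def __init__(self, point_dim=9, txt_dim=300, joint_dim=128):
        super().__init__()
        self.obj_enc = nn.Sequential(nn.Linear(point_dim, joint_dim), nn.ReLU(),
                                     nn.Linear(joint_dim, joint_dim))
        self.txt_enc = nn.Linear(txt_dim, joint_dim)

    def forward(self, proposals, sentence_emb):
        # proposals: (num_proposals, num_points, point_dim) point features per box
        # sentence_emb: (txt_dim,) pooled embedding of the referring description
        obj = self.obj_enc(proposals).max(dim=1).values  # one feature per proposal
        txt = self.txt_enc(sentence_emb)                 # language feature
        scores = obj @ txt                               # similarity per proposal
        return scores.argmax(), scores

model = ToyGrounder()
best, scores = model(torch.randn(8, 256, 9), torch.randn(300))
print(best.item(), scores.shape)  # index of the best-matching box, torch.Size([8])
```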