no code implementations • 28 Apr 2024 • Xiaolong Li, Jiawei Mo, Ying Wang, Chethan Parameshwara, Xiaohan Fei, Ashwin Swaminathan, CJ Taylor, Zhuowen Tu, Paolo Favaro, Stefano Soatto
In this paper, we propose Grounded-Dreamer, an effective two-stage approach that uses a pre-trained multi-view diffusion model to generate high-fidelity 3D assets that accurately follow complex, compositional text prompts.
no code implementations • 29 Feb 2024 • Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, CJ Taylor, Paolo Favaro, Stefano Soatto
However, the Score Distillation Sampling (SDS) method is also the source of several artifacts, such as the Janus problem, misalignment between the text prompt and the generated 3D model, and inaccuracies in the 3D model itself.
no code implementations • 6 Jun 2023 • Chethan Parameshwara, Alessandro Achille, Xiaolong Li, Jiawei Mo, Matthew Trager, Ashwin Swaminathan, CJ Taylor, Dheera Venkatraman, Xiaohan Fei, Stefano Soatto
We describe a first step towards learning general-purpose visual representations of physical scenes using only image prediction as a training criterion.
no code implementations • 12 Mar 2018 • Anton Mitrokhin, Cornelia Fermüller, Chethan Parameshwara, Yiannis Aloimonos
Event-based vision sensors, such as the Dynamic Vision Sensor (DVS), are ideally suited for real-time motion analysis.