no code implementations • 28 Apr 2024 • Xiaolong Li, Jiawei Mo, Ying Wang, Chethan Parameshwara, Xiaohan Fei, Ashwin Swaminathan, Cj Taylor, Zhuowen Tu, Paolo Favaro, Stefano Soatto
In this paper, we propose an effective two-stage approach named Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts while achieving high fidelity by using a pre-trained multi-view diffusion model.
no code implementations • 29 Feb 2024 • Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, Cj Taylor, Paolo Favaro, Stefano Soatto
However, the score distillation sampling (SDS) method is also the source of several artifacts, such as the Janus problem, the misalignment between the text prompt and the generated 3D model, and 3D model inaccuracies.
no code implementations • 6 Jun 2023 • Chethan Parameshwara, Alessandro Achille, Xiaolong Li, Jiawei Mo, Matthew Trager, Ashwin Swaminathan, Cj Taylor, Dheera Venkatraman, Xiaohan Fei, Stefano Soatto
We describe a first step towards learning general-purpose visual representations of physical scenes using only image prediction as a training criterion.
no code implementations • 4 Jul 2019 • Son Tran, Ming Du, Sampath Chanda, R. Manmatha, Cj Taylor
In particular, Instagram and Twitter influencers often provide images of themselves wearing different outfits, and their followers are often inspired to buy similar clothes. We propose a system to automatically find the most visually similar clothes in the online catalog (street-to-shop searching).