1 code implementation • 4 Mar 2024 • Xuweiyi Chen, Tian Xia, Sihan Xu
Video Diffusion Models have been developed for video generation, usually integrating text and image conditioning to enhance control over the generated content.
1 code implementation • 21 Sep 2023 • Jianing Yang, Xuweiyi Chen, Shengyi Qian, Nikhil Madaan, Madhavan Iyengar, David F. Fouhey, Joyce Chai
While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline.