no code implementations • 13 Feb 2025 • Pengsheng Guo, Alexander G. Schwing
We study Variational Rectified Flow Matching, a framework that enhances classic rectified flow matching by modeling multi-modal velocity vector-fields.
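Only the abstract snippet is available here, so the following is a minimal sketch of the classic rectified flow matching objective that the paper builds on, not the variational variant it proposes. The function name, the `(batch, dim)` tensor shape, and the velocity network interface are assumptions for illustration.

```python
# Sketch of classic rectified flow matching: a network v_theta regresses the
# straight-line velocity between a noise sample x0 and a data sample x1.
# Assumes x1 is a (batch, dim) tensor; names are illustrative only.
import torch

def rectified_flow_loss(velocity_net, x1):
    """One training-step loss for (non-variational) rectified flow matching."""
    x0 = torch.randn_like(x1)              # noise endpoint of the path
    t = torch.rand(x1.shape[0], 1)         # random time in [0, 1] per sample
    xt = (1.0 - t) * x0 + t * x1           # linear interpolant at time t
    target_velocity = x1 - x0              # constant ground-truth velocity
    pred_velocity = velocity_net(xt, t)    # model's velocity prediction
    return ((pred_velocity - target_velocity) ** 2).mean()
```

Per the abstract, the paper's contribution is to replace this uni-modal regression target with a model of multi-modal velocity vector fields; that variational objective is not reproduced here.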
1 code implementation • 28 Oct 2024 • Jie An, De Wang, Pengsheng Guo, Jiebo Luo, Alexander Schwing
Furthermore, we empirically find that both the placement and the effective attention size of these local attention windows are crucial factors.
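The snippet above refers to local attention windows without the surrounding context, so the sketch below only illustrates the general idea of restricting attention to a window of nearby positions; the function, the symmetric window parameterization, and the tensor shapes are assumptions, not the paper's implementation.

```python
# Illustrative local-window attention: each query attends only to keys within
# `window` positions. Assumes q, k, v have shape (batch, length, dim).
import torch

def local_attention(q, k, v, window: int):
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)   # (B, L, L)
    idx = torch.arange(q.shape[-2])
    mask = (idx[None, :] - idx[:, None]).abs() > window       # True = blocked
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```

In this framing, "placement" would correspond to which layers or positions get such a mask, and "effective attention size" to the `window` value.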
no code implementations • 2 Dec 2023 • Pengsheng Guo, Hans Hao, Adam Caccavale, Zhongzheng Ren, Edward Zhang, Qi Shan, Aditya Sankar, Alexander G. Schwing, Alex Colburn, Fangchang Ma
Our analysis identifies the core of these challenges as the interaction among noise levels in the 2D diffusion process, the architecture of the diffusion network, and the 3D model representation.
no code implementations • ICCV 2023 • Ziyue Feng, Liang Yang, Pengsheng Guo, Bing Li
Recent advances in neural reconstruction using posed image sequences have made remarkable progress.
1 code implementation • 27 Jul 2022 • Miguel Angel Bautista, Pengsheng Guo, Samira Abnar, Walter Talbott, Alexander Toshev, Zhuoyuan Chen, Laurent Dinh, Shuangfei Zhai, Hanlin Goh, Daniel Ulbricht, Afshin Dehghan, Josh Susskind
We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera.
Ranked #1 on Image Generation on ARKitScenes
no code implementations • 12 Jul 2021 • Pengsheng Guo, Miguel Angel Bautista, Alex Colburn, Liang Yang, Daniel Ulbricht, Joshua M. Susskind, Qi Shan
We study the problem of novel view synthesis from sparse source observations of a scene composed of 3D objects.
no code implementations • CVPR 2021 • Chen Huang, Shuangfei Zhai, Pengsheng Guo, Josh Susskind
This leads to consistent improvements since the value function provides effective metric supervision during finetuning, and helps to correct the potential bias of loss-only supervision.
no code implementations • ICML 2020 • Pengsheng Guo, Chen-Yu Lee, Daniel Ulbricht
Training multiple tasks jointly in one deep network yields reduced latency during inference and better performance than the single-task counterparts by sharing certain layers of the network; a minimal sketch of this layer-sharing setup follows.
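The sketch below is a hypothetical hard-parameter-sharing network with a shared backbone and per-task heads, illustrating the layer-sharing idea from the abstract; the module names, layer sizes, and the choice of two tasks are placeholders, not the paper's architecture.

```python
# Hypothetical multi-task network: one shared backbone, one head per task.
# Sizes and task choices are illustrative placeholders.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, in_dim=128, hidden=256, num_classes=10, regression_dim=1):
        super().__init__()
        # Shared layers: computed once per input, reused by every task head.
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.cls_head = nn.Linear(hidden, num_classes)      # task 1: classification
        self.reg_head = nn.Linear(hidden, regression_dim)   # task 2: regression

    def forward(self, x):
        features = self.backbone(x)          # single shared forward pass
        return self.cls_head(features), self.reg_head(features)
```

Sharing the backbone is what gives the latency benefit at inference: the shared features are computed once and consumed by all task heads.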
no code implementations • 15 Mar 2019 • Xingyu Lin, Pengsheng Guo, Carlos Florensa, David Held
Robots that are trained to perform a task in a fixed environment often fail, due to a lack of exploration, when facing unexpected changes to the environment.