no code implementations • 17 Oct 2024 • Ruoshi Liu, Alper Canberk, Shuran Song, Carl Vondrick
Vision foundation models trained on massive amounts of visual data have shown unprecedented reasoning and planning skills in open-world settings.
no code implementations • 31 Aug 2024 • Alper Canberk, Maksym Bondarenko, Ege Ozguroglu, Ruoshi Liu, Carl Vondrick
With this scalable automatic data generation pipeline, we can create a dataset for learning object insertion, which is used to train our proposed text-conditioned diffusion model.
no code implementations • 13 Aug 2024 • Sruthi Sudhakar, Ruoshi Liu, Basile Van Hoorick, Carl Vondrick, Richard Zemel
Humans naturally build mental models of object interactions and dynamics, allowing them to imagine how their surroundings will change if they take a certain action.
no code implementations • 24 Jun 2024 • Junbang Liang, Ruoshi Liu, Ege Ozguroglu, Sruthi Sudhakar, Achal Dave, Pavel Tokmakov, Shuran Song, Carl Vondrick
A key challenge in manipulation is learning a policy that can robustly generalize to diverse visual environments.
no code implementations • 23 May 2024 • Basile Van Hoorick, Rundi Wu, Ege Ozguroglu, Kyle Sargent, Ruoshi Liu, Pavel Tokmakov, Achal Dave, Changxi Zheng, Carl Vondrick
Accurate reconstruction of complex dynamic scenes from just a single viewpoint continues to be a challenging task in computer vision.
1 code implementation • 15 Feb 2024 • Abdullah Hamdi, Luke Melas-Kyriazi, Jinjie Mai, Guocheng Qian, Ruoshi Liu, Carl Vondrick, Bernard Ghanem, Andrea Vedaldi
With the aid of a frequency-modulated loss, GES achieves competitive performance in novel-view synthesis benchmarks while requiring less than half the memory storage of Gaussian Splatting and increasing the rendering speed by up to 39%.
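GES here refers to Generalized Exponential Splatting, which replaces the Gaussian falloff of each splat with a generalized exponential profile whose shape parameter controls how sharply it decays. A minimal sketch of that profile, with illustrative parameter names of our choosing (setting beta = 2 recovers the Gaussian special case):

```python
import numpy as np

def ges_kernel(x, mu=0.0, alpha=1.0, beta=2.0):
    """Generalized exponential falloff exp(-(|x - mu| / alpha)^beta).

    beta = 2 gives a Gaussian-shaped profile; beta = 1 gives a
    Laplacian-like profile with heavier tails; larger beta flattens
    the top and sharpens the edges.
    """
    return np.exp(-((np.abs(x - mu) / alpha) ** beta))

x = np.linspace(-3, 3, 7)
# With alpha = 1 and beta = 2 this matches exp(-x^2) exactly.
print(np.allclose(ges_kernel(x, beta=2.0), np.exp(-(x ** 2))))  # True
```

Because a single shape parameter spans Gaussian-like and sharper kernels, fewer primitives can cover both smooth regions and hard edges, which is consistent with the memory savings the entry reports.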
1 code implementation • CVPR 2024 • Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick
We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.
1 code implementation • 24 May 2023 • Rundi Wu, Ruoshi Liu, Carl Vondrick, Changxi Zheng
Specifically, we encode the input 3D textured shape into triplane feature maps that represent the signed distance and texture fields of the input.
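As a rough illustration of the triplane idea described above: three axis-aligned 2D feature grids (XY, XZ, YZ) jointly represent a 3D field, and a query point is projected onto each plane, its features gathered and summed, then decoded into signed-distance and texture values. The sketch below is a simplification we wrote for illustration (tiny random planes, nearest-neighbor sampling); real systems use learned planes, bilinear interpolation, and an MLP decoder.

```python
import numpy as np

RES, C = 32, 8  # plane resolution and feature channels (illustrative sizes)
rng = np.random.default_rng(0)
# One 2D feature grid per coordinate plane.
planes = {ax: rng.normal(size=(RES, RES, C)) for ax in ("xy", "xz", "yz")}

def query_triplane(p):
    """Sum plane features for a 3D point p in [0, 1]^3 (nearest-neighbor)."""
    x, y, z = np.clip((np.asarray(p) * (RES - 1)).astype(int), 0, RES - 1)
    feat = planes["xy"][x, y] + planes["xz"][x, z] + planes["yz"][y, z]
    # A decoder network would map `feat` to (SDF value, texture color).
    return feat

f = query_triplane([0.5, 0.25, 0.75])
print(f.shape)  # (8,)
```

Storing three O(N^2) planes instead of an O(N^3) voxel grid is the usual motivation for this factorization.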
no code implementations • CVPR 2023 • Ruoshi Liu, Carl Vondrick
The relatively high temperature of the human body causes people to emit long-wave infrared light, effectively turning them into infrared light sources.
1 code implementation • ICCV 2023 • Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, Carl Vondrick
We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an object given just a single RGB image.
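Zero-1-to-3 conditions a diffusion model on the relative camera transformation between the input view and the desired target view, expressed in spherical coordinates around the object. A hypothetical sketch of computing such a conditioning vector (function names and the exact embedding are our assumptions for illustration):

```python
import numpy as np

def spherical(cam):
    """Camera position -> (polar angle theta, azimuth phi, radius r)."""
    x, y, z = cam
    r = np.linalg.norm(cam)
    theta = np.arccos(z / r)
    phi = np.arctan2(y, x)
    return np.array([theta, phi, r])

def relative_pose(cam_src, cam_tgt):
    """Relative viewpoint change used as diffusion conditioning (sketch)."""
    d = spherical(cam_tgt) - spherical(cam_src)
    # Embed the azimuth difference with sin/cos to avoid the wrap-around at +/-pi.
    return np.array([d[0], np.sin(d[1]), np.cos(d[1]), d[2]])

cond = relative_pose(np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0]))
print(cond.shape)  # (4,)
```

Conditioning on the *relative* pose, rather than absolute camera extrinsics, is what lets a single image suffice at test time: no canonical object frame is needed.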
no code implementations • CVPR 2023 • Ruoshi Liu, Sachit Menon, Chengzhi Mao, Dennis Park, Simon Stent, Carl Vondrick
Experiments and visualizations show that the method generates multiple plausible solutions, each consistent with the observed shadow.
no code implementations • ICCV 2023 • Ruoshi Liu, Chengzhi Mao, Purva Tendulkar, Hao Wang, Carl Vondrick
Many machine learning methods operate by inverting a neural network at inference time, which has become a popular technique for solving inverse problems in computer vision, robotics, and graphics.
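As a toy illustration of inference-time inversion (not this paper's method): fix a forward model and optimize its *input* by gradient descent until the output matches an observation. Here the forward model is a small linear layer so the gradient is analytic; real inverse problems use a trained network and automatic differentiation.

```python
import numpy as np

# Fixed, well-conditioned "network" weights (chosen so gradient descent converges).
W = np.array([[1.0, 0.2, 0.0],
              [0.0, 1.0, 0.2],
              [0.2, 0.0, 1.0]])
z_true = np.array([0.5, -1.0, 2.0])
y = W @ z_true  # observation we want to invert

z = np.zeros(3)  # initial guess for the latent input
for _ in range(200):
    residual = W @ z - y
    z -= 0.5 * (W.T @ residual)  # gradient step on 0.5 * ||W z - y||^2

print(np.allclose(W @ z, y))  # True: found an input consistent with y
```

The same loop structure appears whenever a network is inverted at test time; the hard parts in practice are non-convexity of the loss and regularizing which of many consistent inputs to prefer.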
1 code implementation • CVPR 2021 • Dídac Surís, Ruoshi Liu, Carl Vondrick
We introduce a framework for learning from unlabeled video what is predictable in the future.
no code implementations • ICLR 2019 • Ruoshi Liu, Michael M. Norton, Seth Fraden, Pengyu Hong
Active matter consists of active agents that transform energy extracted from their surroundings into momentum, producing a variety of collective phenomena.