no code implementations • 24 Jul 2024 • Yiming Xie, Chun-Han Yao, Vikram Voleti, Huaizu Jiang, Varun Jampani
We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-view consistent dynamic 3D content generation.
no code implementations • 28 Jun 2024 • Hieu T. Nguyen, YiWen Chen, Vikram Voleti, Varun Jampani, Huaizu Jiang
The global floorplan and attention design in the diffusion model ensure the consistency of the generated images, from which a 3D scene can be reconstructed.
no code implementations • 18 Mar 2024 • Vikram Voleti, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitry Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani
In this work, we propose SV3D, which adapts an image-to-video diffusion model for novel multi-view synthesis and 3D generation, thereby leveraging the generalization and multi-view consistency of video models, while further adding explicit camera control for NVS.
2 code implementations • 2023 • Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, Varun Jampani, Robin Rombach
We then explore the impact of finetuning our base model on high-quality data and train a text-to-video model that is competitive with closed-source video generation.
no code implementations • 19 Oct 2023 • Vikram Voleti
Overall, our research aims to make a meaningful contribution to the pursuit of more efficient and flexible generative models, with the potential to shape the future of computer vision.
1 code implementation • NeurIPS 2023 • Benno Krojer, Elinor Poole-Dayan, Vikram Voleti, Christopher Pal, Siva Reddy
We also measure the stereotypical bias in diffusion models, and find that Stable Diffusion 2.1 is, for the most part, less biased than Stable Diffusion 1.5.
no code implementations • 14 Feb 2023 • Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar
They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.
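As a rough illustration of the forward and reverse processes described above, the following is a minimal finite-dimensional sketch. The variance-preserving mixing rule, the `alpha_bar` parameter, and the function names `perturb` and `kernel_score` are assumptions made for illustration and are not taken from the paper; the sketch only shows how Gaussian white noise perturbs a sample and what score target a denoising model would regress toward.

```python
import numpy as np

def perturb(x0, alpha_bar, rng):
    """Forward step (assumed variance-preserving form):
    mix the clean sample with Gaussian white noise."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps

def kernel_score(x_t, x0, alpha_bar):
    """Score of the Gaussian perturbation kernel p(x_t | x0),
    i.e. the gradient of its log-density with respect to x_t.
    A learned score function is trained to match this target."""
    return -(x_t - np.sqrt(alpha_bar) * x0) / (1.0 - alpha_bar)

rng = np.random.default_rng(0)
x0 = np.ones(4)                       # toy "clean" sample
x_t, eps = perturb(x0, alpha_bar=0.5, rng=rng)
score = kernel_score(x_t, x0, alpha_bar=0.5)
```

Under these assumptions the score target equals the negated noise divided by its standard deviation, which is why denoising and score estimation are two views of the same regression problem.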
no code implementations • 18 Dec 2022 • Daniel Zhang, Vikram Voleti, Alexander Wong, Jason Deglint
In this study, we explore the feasibility of leveraging federated learning for privacy-preserving training of deep neural networks for phytoplankton classification.
no code implementations • 21 Oct 2022 • Vikram Voleti, Christopher Pal, Adam Oberman
Generative models based on denoising diffusion techniques have led to an unprecedented increase in the quality and diversity of imagery that is now possible to create with neural generative models.
1 code implementation • 16 Aug 2022 • Vikram Voleti, Boris N. Oreshkin, Florent Bocquelet, Félix G. Harvey, Louis-Simon Ménard, Christopher Pal
Inverse Kinematics (IK) systems are often rigid with respect to their input character, thus requiring user intervention to be adapted to new skeletons.
no code implementations • 3 Aug 2022 • Nitpreet Bamra, Vikram Voleti, Alexander Wong, Jason Deglint
Thus, this work demonstrates the ability of GANs to create large synthetic datasets of phytoplankton from small training datasets, accomplishing a key step towards sustainable systematic monitoring of harmful algal blooms.
1 code implementation • 19 May 2022 • Vikram Voleti, Alexia Jolicoeur-Martineau, Christopher Pal
We train the model in a manner where we randomly and independently mask all the past frames or all the future frames.
Ranked #4 on Video Generation on BAIR Robot Pushing
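The masking scheme described in the entry above can be sketched roughly as follows. This is a hypothetical illustration, not the authors' code: the function name `mask_conditioning`, the masking probability `p_mask`, the array shapes, and the choice to mask by zeroing are all assumptions.

```python
import numpy as np

def mask_conditioning(past, future, p_mask, rng):
    """Independently decide whether to mask ALL past frames and
    whether to mask ALL future frames (masking = zeroing, assumed)."""
    keep_past = rng.random() >= p_mask      # one draw for the whole past block
    keep_future = rng.random() >= p_mask    # a separate, independent draw
    past_out = past if keep_past else np.zeros_like(past)
    future_out = future if keep_future else np.zeros_like(future)
    return past_out, future_out, keep_past, keep_future

rng = np.random.default_rng(0)
past = np.ones((2, 3, 8, 8))     # (frames, channels, height, width), assumed layout
future = np.ones((2, 3, 8, 8))
past_c, future_c, kept_past, kept_future = mask_conditioning(past, future, 0.5, rng)
```

Because the two draws are independent, a single training run sees all four conditioning patterns: both blocks visible, only past, only future, or neither.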
no code implementations • 22 Dec 2021 • Mahta Ramezanian Panahi, Germán Abrevaya, Jean-Christophe Gagnon-Audet, Vikram Voleti, Irina Rish, Guillaume Dumas
The principled design and discovery of biologically- and physically-informed models of neuronal dynamics has been advancing since the mid-twentieth century.
no code implementations • 7 Sep 2021 • David Kanaa, Vikram Voleti, Samira Ebrahimi Kahou, Christopher Pal
Despite having been studied to a great extent, the task of conditional generation of sequences of frames, or videos, remains extremely challenging.
no code implementations • 24 Jun 2021 • Ju An Park, Vikram Voleti, Kathryn E. Thomas, Alexander Wong, Jason L. Deglint
Warming oceans due to climate change are leading to increased numbers of ectoparasitic copepods, also known as sea lice, which can cause significant ecological loss to wild salmon populations and major economic loss to aquaculture sites.
1 code implementation • 15 Jun 2021 • Vikram Voleti, Chris Finlay, Adam Oberman, Christopher Pal
In this work we introduce a Multi-Resolution variant of such models (MRCNF), by characterizing the conditional distribution over the additional information required to generate a fine image that is consistent with the coarse image.
Ranked #9 on Image Generation on ImageNet 64x64 (Bits per dim metric)
no code implementations • 7 Jun 2021 • Tiago Salvador, Vikram Voleti, Alexander Iannantuono, Adam Oberman
While the primary goal is to improve accuracy under distribution shift, an important secondary goal is uncertainty estimation: evaluating the probability that the prediction of a model is correct.
no code implementations • ICLR 2022 • Tiago Salvador, Stephanie Cairns, Vikram Voleti, Noah Marshall, Adam Oberman
However, they still have drawbacks: they reduce accuracy (AGENDA, PASS, FTC), or require retuning for different false positive rates (FSN).
no code implementations • ICML Workshop INNF 2021 • Vikram Voleti, Chris Finlay, Adam M Oberman, Christopher Pal
Recent work has shown that Continuous Normalizing Flows (CNFs) can serve as generative models of images with exact likelihood calculation and invertible generation/density estimation.
no code implementations • ICLR 2021 • Krishna Murthy Jatavallabhula, Miles Macklin, Florian Golemo, Vikram Voleti, Linda Petrini, Martin Weiss, Breandan Considine, Jerome Parent-Levesque, Kevin Xie, Kenny Erleben, Liam Paull, Florian Shkurti, Derek Nowrouzezahrai, Sanja Fidler
We consider the problem of estimating an object's physical properties such as mass, friction, and elasticity directly from video sequences.
no code implementations • 1 Mar 2021 • Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal Arbel, Chris Pal, Gaël Varoquaux, Pascal Vincent
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices.
1 code implementation • ICML 2020 • Sarthak Mittal, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, Yoshua Bengio
To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow.
no code implementations • 31 Jul 2019 • Vincent Michalski, Vikram Voleti, Samira Ebrahimi Kahou, Anthony Ortiz, Pascal Vincent, Chris Pal, Doina Precup
Batch normalization has been widely used to improve optimization in deep neural networks.