no code implementations • NAACL (TextGraphs) 2021 • Jinman Zhao, Gerald Penn, Huan Ling
In this paper, we define an abstract task called structural realization that generates words given a prefix of words and a partial representation of a parse tree.
no code implementations • 14 Jun 2024 • Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling
We present L4GM, the first 4D Large Reconstruction Model that produces animated objects from a single-view video input -- in a single feed-forward pass that takes only a second.
no code implementations • CVPR 2024 • Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis
We also propose a motion amplification mechanism as well as a new autoregressive synthesis scheme to generate and combine multiple 4D sequences for longer generation.
no code implementations • CVPR 2024 • Chenfeng Xu, Huan Ling, Sanja Fidler, Or Litany
However, these features are initially trained on paired text and image data, are not optimized for 3D tasks, and often exhibit a domain gap when applied to the target data.
no code implementations • ICCV 2023 • Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler
In this work, we introduce a self-supervised feature representation learning framework DreamTeacher that utilizes generative networks for pre-training downstream image backbones.
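A minimal sketch of the kind of feature distillation this describes, with a frozen generative teacher and an image backbone as student; the module names, 1x1 regressor, and MSE objective are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRegressor(nn.Module):
    """Maps backbone (student) features into the generative teacher's feature space."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.proj(x)

def distillation_loss(backbone_feats, teacher_feats, regressor):
    # Regress student features onto the frozen generative features,
    # matching spatial resolution before penalizing the difference.
    pred = regressor(backbone_feats)
    pred = F.interpolate(pred, size=teacher_feats.shape[-2:],
                         mode="bilinear", align_corners=False)
    return F.mse_loss(pred, teacher_feats.detach())
```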
3 code implementations • CVPR 2023 • Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis
We first pre-train an LDM on images only; then, we turn the image generator into a video generator by introducing a temporal dimension to the latent space diffusion model and fine-tuning on encoded image sequences, i.e., videos (see the sketch below).
Ranked #5 on Text-to-Video Generation on MSR-VTT (CLIP-FID metric)
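A minimal sketch of the temporal-dimension idea referenced above: a temporal self-attention layer that can be interleaved with frozen image-model layers so only the time axis is newly modeled. The layer name, shapes, and head count are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Self-attention over the time axis, applied independently at each spatial location."""
    def __init__(self, channels, num_heads=8):
        super().__init__()
        # channels must be divisible by num_heads for MultiheadAttention.
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x, num_frames):
        # x: (B*T, C, H, W), as produced by the frozen per-frame spatial layers.
        bt, c, h, w = x.shape
        b = bt // num_frames
        x = x.view(b, num_frames, c, h, w).permute(0, 3, 4, 1, 2)  # (B, H, W, T, C)
        x = x.reshape(b * h * w, num_frames, c)
        out, _ = self.attn(x, x, x)                                 # attend across frames
        out = out.view(b, h, w, num_frames, c).permute(0, 3, 4, 1, 2)
        return out.reshape(bt, c, h, w)
```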
no code implementations • CVPR 2022 • Daiqing Li, Huan Ling, Seung Wook Kim, Karsten Kreis, Adela Barriuso, Sanja Fidler, Antonio Torralba
By training an effective feature segmentation architecture on top of BigGAN, we turn BigGAN into a labeled dataset generator.
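A minimal sketch of using such a GAN-plus-segmentation-head pair as a labeled-dataset generator: sample a latent, generate an image together with intermediate features, and let the segmentation head produce the synthetic label. `generator` returning features alongside the image and the unconditional latent are simplifying assumptions (BigGAN itself is class-conditional).

```python
import torch

@torch.no_grad()
def sample_labeled_pair(generator, seg_head, latent_dim=128, device="cpu"):
    z = torch.randn(1, latent_dim, device=device)
    image, features = generator(z)        # assumed to also expose intermediate features
    mask_logits = seg_head(features)      # per-pixel class scores from GAN features
    mask = mask_logits.argmax(dim=1)      # synthetic segmentation label
    return image, mask
```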
1 code implementation • NeurIPS 2021 • Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler
EditGAN builds on a GAN framework that jointly models images and their semantic segmentations, requiring only a handful of labeled examples, making it a scalable tool for editing.
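A minimal sketch of segmentation-driven latent editing in this spirit: optimize a latent offset so the jointly generated segmentation matches an edited target mask while pixels outside the edit region stay close to the original. The joint generator `g`, the losses, and their weighting are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def edit_latent(g, w, target_mask, edit_region, steps=100, lr=0.05):
    delta = torch.zeros_like(w, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        image_ref, _ = g(w)                        # original image, kept for preservation loss
    for _ in range(steps):
        image, seg_logits = g(w + delta)           # jointly generated image and segmentation
        loss_seg = F.cross_entropy(seg_logits, target_mask)          # match edited mask
        loss_keep = F.l1_loss(image * (1 - edit_region),
                              image_ref * (1 - edit_region))         # keep unedited pixels
        loss = loss_seg + loss_keep
        opt.zero_grad(); loss.backward(); opt.step()
    return w + delta.detach()
```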
2 code implementations • CVPR 2021 • Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler
To showcase the power of our approach, we generated datasets for 7 image segmentation tasks, which include pixel-level labels for 34 human face parts and 32 car parts.
no code implementations • NeurIPS 2020 • Huan Ling, David Acuna, Karsten Kreis, Seung Wook Kim, Sanja Fidler
In images of complex scenes, objects are often occluding each other which makes perception tasks such as object detection and tracking, or robotic control tasks such as planning, challenging.
no code implementations • ICLR 2021 • Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler
Key to our approach is to exploit GANs as a multi-view data generator to train an inverse graphics network using an off-the-shelf differentiable renderer, and the trained inverse graphics network as a teacher to disentangle the GAN's latent code into interpretable 3D properties.
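A minimal sketch of the training signal described here: GAN-generated views of one object supervise an inverse-graphics network through a differentiable renderer via multi-view reconstruction. All module names are assumed placeholders rather than the paper's components.

```python
import torch
import torch.nn.functional as F

def inverse_graphics_step(gan_views, cameras, inverse_net, renderer, optimizer):
    # gan_views: images of the same object generated at several viewpoints.
    mesh, texture, light = inverse_net(gan_views[0])           # predict 3D from one view
    loss = 0.0
    for view, cam in zip(gan_views, cameras):
        rendered = renderer(mesh, texture, light, cam)          # differentiable re-rendering
        loss = loss + F.l1_loss(rendered, view)                 # multi-view consistency
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```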
no code implementations • ECCV 2020 • Bo-Wen Chen, Huan Ling, Xiaohui Zeng, Jun Gao, Ziyue Xu, Sanja Fidler
Our approach tolerates a modest amount of noise in the box placements, thus typically only a few clicks are needed to annotate tracked boxes to a sufficient accuracy.
1 code implementation • NeurIPS 2019 • Wenzheng Chen, Jun Gao, Huan Ling, Edward J. Smith, Jaakko Lehtinen, Alec Jacobson, Sanja Fidler
Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering.
Ranked #4 on Single-View 3D Reconstruction on ShapeNet
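A minimal sketch of the projection step that the rendering process above builds on: 3D vertices in camera coordinates are mapped to 2D image coordinates by a pinhole camera. This is generic geometry for illustration, not the paper's differentiable rasterizer.

```python
import torch

def project_vertices(vertices, focal, principal_point):
    # vertices: (N, 3) points in camera coordinates with z > 0.
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    u = focal * x / z + principal_point[0]
    v = focal * y / z + principal_point[1]
    return torch.stack([u, v], dim=-1)   # (N, 2) pixel coordinates
```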
2 code implementations • CVPR 2019 • Huan Ling, Jun Gao, Amlan Kar, Wenzheng Chen, Sanja Fidler
Our model runs at 29.3ms in automatic and 2.6ms in interactive mode, making it 10x and 100x faster than Polygon-RNN++.
no code implementations • ACL 2018 • Avishek Joey Bose, Huan Ling, Yanshuai Cao
Learning by contrasting positive and negative samples is a general strategy adopted by many methods.
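A minimal sketch of the generic positive/negative contrastive objective this refers to, in an NCE-style logistic form; the paper's contribution is an adversarial sampler for harder negatives, which is not shown here, and the scoring function is an assumption.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, negatives):
    # anchor, positive: (D,) embeddings; negatives: (K, D) embeddings.
    pos_score = (anchor * positive).sum()
    neg_scores = negatives @ anchor
    loss_pos = F.softplus(-pos_score)           # pull the positive pair together
    loss_neg = F.softplus(neg_scores).mean()    # push negative samples apart
    return loss_pos + loss_neg
```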
3 code implementations • CVPR 2018 • David Acuna, Huan Ling, Amlan Kar, Sanja Fidler
Manually labeling datasets with object masks is extremely time consuming.
no code implementations • NeurIPS 2017 • Huan Ling, Sanja Fidler
Robots will eventually be part of every household.