Search Results for author: Huan Ling

Found 16 papers, 6 papers with code

Structural Realization with GGNNs

no code implementations NAACL (TextGraphs) 2021 Jinman Zhao, Gerald Penn, Huan Ling

In this paper, we define an abstract task called structural realization that generates words given a prefix of words and a partial representation of a parse tree.

Language Modelling

3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features

no code implementations 7 Nov 2023 Chenfeng Xu, Huan Ling, Sanja Fidler, Or Litany

However, these features are initially trained on paired text and image data, which are not optimized for 3D tasks, and often exhibit a domain gap when applied to the target data.

3D Object Detection Novel View Synthesis +1

DreamTeacher: Pretraining Image Backbones with Deep Generative Models

no code implementations ICCV 2023 Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler

In this work, we introduce a self-supervised feature representation learning framework DreamTeacher that utilizes generative networks for pre-training downstream image backbones.

Knowledge Distillation Representation Learning

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

2 code implementations CVPR 2023 Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis

We first pre-train an LDM on images only; then, we turn the image generator into a video generator by introducing a temporal dimension to the latent space diffusion model and fine-tuning on encoded image sequences, i.e., videos.

Ranked #5 on Text-to-Video Generation on MSR-VTT (CLIPSIM metric)

Image Generation Text-to-Video Generation +3

BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations

no code implementations CVPR 2022 Daiqing Li, Huan Ling, Seung Wook Kim, Karsten Kreis, Adela Barriuso, Sanja Fidler, Antonio Torralba

By training an effective feature segmentation architecture on top of BigGAN, we turn BigGAN into a labeled dataset generator.


EditGAN: High-Precision Semantic Image Editing

1 code implementation NeurIPS 2021 Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler

EditGAN builds on a GAN framework that jointly models images and their semantic segmentations, requiring only a handful of labeled examples, making it a scalable tool for editing.

Segmentation Semantic Segmentation +1

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

2 code implementations CVPR 2021 Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler

To showcase the power of our approach, we generated datasets for 7 image segmentation tasks which include pixel-level labels for 34 human face parts, and 32 car parts.

Image Segmentation Semantic Segmentation

Variational Amodal Object Completion

no code implementations NeurIPS 2020 Huan Ling, David Acuna, Karsten Kreis, Seung Wook Kim, Sanja Fidler

In images of complex scenes, objects are often occluding each other which makes perception tasks such as object detection and tracking, or robotic control tasks such as planning, challenging.

Object Detection

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering

no code implementations ICLR 2021 Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler

Key to our approach is to exploit GANs as a multi-view data generator to train an inverse graphics network using an off-the-shelf differentiable renderer, and the trained inverse graphics network as a teacher to disentangle the GAN's latent code into interpretable 3D properties.

Neural Rendering

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation

no code implementations ECCV 2020 Bo-Wen Chen, Huan Ling, Xiaohui Zeng, Gao Jun, Ziyue Xu, Sanja Fidler

Our approach tolerates a modest amount of noise in the box placements, thus typically only a few clicks are needed to annotate tracked boxes to a sufficient accuracy.

Segmentation Semantic Segmentation +2

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

1 code implementation NeurIPS 2019 Wenzheng Chen, Jun Gao, Huan Ling, Edward J. Smith, Jaakko Lehtinen, Alec Jacobson, Sanja Fidler

Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering.

Single-View 3D Reconstruction

Fast Interactive Object Annotation with Curve-GCN

2 code implementations CVPR 2019 Huan Ling, Jun Gao, Amlan Kar, Wenzheng Chen, Sanja Fidler

Our model runs at 29.3ms in automatic, and 2.6ms in interactive mode, making it 10x and 100x faster than Polygon-RNN++.
