Search Results for author: Jia-Bin Huang

Found 100 papers, 47 papers with code

Shuffle and Attend: Video Domain Adaptation

no code implementations • ECCV 2020 • Jinwoo Choi, Gaurav Sharma, Samuel Schulter, Jia-Bin Huang

As the first novelty, we propose an attention mechanism which focuses on more discriminative clips and directly optimizes for video-level (cf.

Ranked #3 on Unsupervised Domain Adaptation on UCF-HMDB

Action Recognition Temporal Action Localization +1
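
The clip-level attention mentioned in the abstract above can be pictured as softmax-weighted pooling over per-clip features. The sketch below is a generic illustration under that assumption, not the paper's architecture: `attention_pool`, `clip_features`, and `clip_scores` are hypothetical names, and the scores are passed in by hand where a real model would learn them.

```python
import math


def attention_pool(clip_features, clip_scores):
    """Pool per-clip feature vectors into one video-level vector.

    clip_features: list of equal-length lists of floats, one per clip.
    clip_scores:   list of floats, one unnormalized score per clip
                   (hypothetical; a trained model would predict these).
    Returns (weights, pooled), where weights sum to 1.
    """
    # Numerically stable softmax over the clip scores.
    top = max(clip_scores)
    exps = [math.exp(s - top) for s in clip_scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of clip features, dimension by dimension.
    dim = len(clip_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, clip_features))
        for d in range(dim)
    ]
    return weights, pooled
```

Clips with higher scores dominate the pooled vector, which is the sense in which such a mechanism "focuses on" some clips more than others.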


Taming Latent Diffusion Model for Neural Radiance Field Inpainting

no code implementations • 15 Apr 2024 • Chieh Hubert Lin, Changil Kim, Jia-Bin Huang, Qinbo Li, Chih-Yao Ma, Johannes Kopf, Ming-Hsuan Yang, Hung-Yu Tseng

These two problems are further reinforced with the use of pixel-distance losses.


Recent Trends in 3D Reconstruction of General Non-Rigid Scenes

no code implementations • 22 Mar 2024 • Raza Yunus, Jan Eric Lenssen, Michael Niemeyer, Yiyi Liao, Christian Rupprecht, Christian Theobalt, Gerard Pons-Moll, Jia-Bin Huang, Vladislav Golyanik, Eddy Ilg

Reconstructing models of the real world, including 3D geometry, appearance, and motion of real scenes, is essential for computer graphics and computer vision.

3D Reconstruction Navigate


Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos

no code implementations • 19 Mar 2024 • Hadi AlZayer, Zhihao Xia, Xuaner Zhang, Eli Shechtman, Jia-Bin Huang, Michael Gharbi

We show that by using simple segmentations and coarse 2D manipulations, we can synthesize a photorealistic edit faithful to the user's input while addressing second-order effects like harmonizing the lighting and physical interactions between edited objects.


CTGAN: Semantic-guided Conditional Texture Generator for 3D Shapes

no code implementations • 8 Feb 2024 • Yi-Ting Pan, Chai-Rong Lee, Shu-Ho Fan, Jheng-Wei Su, Jia-Bin Huang, Yung-Yu Chuang, Hung-Kuo Chu

The entertainment industry relies on 3D visual content to create immersive experiences, but traditional methods for creating textured 3D models can be time-consuming and subjective.

Image Generation Texture Synthesis


IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images

no code implementations • 23 Jan 2024 • Zhi-Hao Lin, Jia-Bin Huang, Zhengqin Li, Zhao Dong, Christian Richardt, Tuotuo Li, Michael Zollhöfer, Johannes Kopf, Shenlong Wang, Changil Kim

While numerous 3D reconstruction and novel-view synthesis methods allow for photorealistic rendering of a scene from multi-view images easily captured with consumer cameras, they bake illumination in their representations and fall short of supporting advanced applications like material editing, relighting, and virtual object insertion.

3D Reconstruction Inverse Rendering +1


TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

no code implementations • 17 Jan 2024 • Yu-Ying Yeh, Jia-Bin Huang, Changil Kim, Lei Xiao, Thu Nguyen-Phuoc, Numair Khan, Cheng Zhang, Manmohan Chandraker, Carl S Marshall, Zhao Dong, Zhengqin Li

In contrast, TextureDreamer can transfer highly detailed, intricate textures from real-world environments to arbitrary objects with only a few casually captured images, potentially significantly democratizing texture creation.

Texture Synthesis


FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

no code implementations • 29 Dec 2023 • Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda, Diana Marculescu

This lets our model perform video synthesis by editing the first frame with any prevalent I2I model and then propagating the edits to successive frames.

Optical Flow Estimation Video-to-Video Synthesis


Fast View Synthesis of Casual Videos

no code implementations • 4 Dec 2023 • Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu

Specifically, we build a global static scene model using an extended plane-based scene representation to synthesize temporally coherent novel video.

Novel View Synthesis


Single-Image 3D Human Digitization with Shape-Guided Diffusion

no code implementations • 15 Nov 2023 • Badour AlBahar, Shunsuke Saito, Hung-Yu Tseng, Changil Kim, Johannes Kopf, Jia-Bin Huang

We present an approach to generate a 360-degree view of a person with a consistent, high-resolution appearance from a single input image.

Image Generation Inverse Rendering


OmnimatteRF: Robust Omnimatte with 3D Background Modeling

1 code implementation • ICCV 2023 • Geng Lin, Chen Gao, Jia-Bin Huang, Changil Kim, Yipeng Wang, Matthias Zwicker, Ayush Saraf

Video matting has broad applications, from adding interesting effects to casually captured movies to assisting video production professionals.

Image Matting Video Matting

118


Dynamic Mesh-Aware Radiance Fields

1 code implementation • ICCV 2023 • Yi-Ling Qiao, Alexander Gao, Yiran Xu, Yue Feng, Jia-Bin Huang, Ming C. Lin

Embedding polygonal mesh assets within photorealistic Neural Radiance Fields (NeRF) volumes, such that they can be rendered and their dynamics simulated in a physically consistent manner with the NeRF, is under-explored from the system perspective of integrating NeRF into the traditional graphics pipeline.


3D Motion Magnification: Visualizing Subtle Motions with Time Varying Radiance Fields

no code implementations • 7 Aug 2023 • Brandon Y. Feng, Hadi AlZayer, Michael Rubinstein, William T. Freeman, Jia-Bin Huang

Motion magnification helps us visualize subtle, imperceptible motion.

Motion Magnification


UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video

no code implementations • 15 Jun 2023 • Zhi-Hao Lin, Bohan Liu, Yi-Ting Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang

UrbanIR uses a novel loss to make very good estimates of shadow volumes in the original scene.

Inverse Rendering


Seeing the World through Your Eyes

no code implementations • 15 Jun 2023 • Hadi AlZayer, Kevin Zhang, Brandon Feng, Christopher Metzler, Jia-Bin Huang

The reflective nature of the human eye is an underappreciated source of information about what the world around us looks like.


Grounded Text-to-Image Synthesis with Attention Refocusing

no code implementations • 8 Jun 2023 • Quynh Phung, Songwei Ge, Jia-Bin Huang

Driven by the scalable diffusion models trained on large-scale datasets, text-to-image synthesis methods have shown compelling results.

Image Generation


Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models

no code implementations • ICCV 2023 • Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji

Despite tremendous progress in generating high-quality images using diffusion models, synthesizing a sequence of animated frames that are both photorealistic and temporally coherent is still in its infancy.

Ranked #8 on Text-to-Video Generation on UCF-101

Image Generation Text-to-Video Generation +1


Neural-PBIR Reconstruction of Shape, Material, and Illumination

no code implementations • ICCV 2023 • Cheng Sun, Guangyan Cai, Zhengqin Li, Kai Yan, Cheng Zhang, Carl Marshall, Jia-Bin Huang, Shuang Zhao, Zhao Dong

In the last stage, initialized by the neural predictions, we perform PBIR to refine the initial results and obtain the final high-quality reconstruction of object shape, material, and illumination.

Ranked #1 on Depth Prediction on Stanford-ORB

Depth Prediction Image Relighting +5


Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation

no code implementations • 17 Apr 2023 • Jie An, Songyang Zhang, Harry Yang, Sonal Gupta, Jia-Bin Huang, Jiebo Luo, Xi Yin

In contrast, we propose a parameter-free temporal shift module that can leverage the spatial U-Net as is for video generation.

Super-Resolution Text-to-Image Generation +2
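
A parameter-free temporal shift, as named in the abstract above, can be sketched as moving a few feature channels one step along the time axis so that each frame sees features from its neighbours without any new weights. The version below is a minimal sketch of one common form of the idea; the function name `temporal_shift`, the `fold` parameter, the zero padding at the sequence ends, and the choice of which channels move are all assumptions for illustration, not details taken from the paper.

```python
def temporal_shift(frames, fold):
    """Shift channels along time without any learnable parameters.

    frames: list over time of per-frame channel lists, frames[t][c].
    fold:   number of channels shifted in each direction (hypothetical
            knob; requires 2 * fold <= number of channels).
    Channels [0, fold) take their value from the next frame,
    channels [fold, 2*fold) take it from the previous frame,
    and the remaining channels are left untouched.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    if 2 * fold > num_channels:
        raise ValueError("fold is too large for the channel count")

    shifted = [list(frame) for frame in frames]
    for t in range(num_frames):
        for c in range(fold):
            # Pull from the future frame; zero-pad past the last frame.
            shifted[t][c] = frames[t + 1][c] if t + 1 < num_frames else 0.0
        for c in range(fold, 2 * fold):
            # Pull from the past frame; zero-pad before the first frame.
            shifted[t][c] = frames[t - 1][c] if t - 1 >= 0 else 0.0
    return shifted
```

Because the operation only rearranges existing values, a per-frame (spatial) network applied afterwards can mix information across time while keeping its parameter count unchanged.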


Expressive Text-to-Image Generation with Rich Text

no code implementations • ICCV 2023 • Songwei Ge, Taesung Park, Jun-Yan Zhu, Jia-Bin Huang

For each region, we enforce its text attributes by creating region-specific detailed prompts and applying region-specific guidance, and maintain its fidelity against plain-text generation through region-based injections.

Text Generation Text-to-Image Generation


$\text{DC}^2$: Dual-Camera Defocus Control by Learning to Refocus

no code implementations • 6 Apr 2023 • Hadi AlZayer, Abdullah Abuolaim, Leung Chun Chan, Yang Yang, Ying Chen Lou, Jia-Bin Huang, Abhishek Kar

Smartphone cameras today are increasingly approaching the versatility and quality of professional cameras through a combination of hardware and software advancements.

Deblurring


Consistent View Synthesis with Pose-Guided Diffusion Models

no code implementations • CVPR 2023 • Hung-Yu Tseng, Qinbo Li, Changil Kim, Suhib Alsisan, Jia-Bin Huang, Johannes Kopf

In this work, we propose a pose-guided diffusion model to generate a consistent long-term video of novel views from a single image.

Novel View Synthesis


Progressively Optimized Local Radiance Fields for Robust View Synthesis

no code implementations • CVPR 2023 • Andreas Meuleman, Yu-Lun Liu, Chen Gao, Jia-Bin Huang, Changil Kim, Min H. Kim, Johannes Kopf

For handling unknown poses, we jointly estimate the camera poses with radiance field in a progressive manner.


DisCO: Portrait Distortion Correction with Perspective-Aware 3D GANs

no code implementations • 23 Feb 2023 • Zhixiang Wang, Yu-Lun Liu, Jia-Bin Huang, Shin'ichi Satoh, Sizhuo Ma, Gurunandan Krishnan, Jian Wang

Close-up facial images captured at short distances often suffer from perspective distortion, resulting in exaggerated facial features and unnatural/unattractive appearances.

Scheduling


Text-driven Visual Synthesis with Latent Diffusion Prior

no code implementations • 16 Feb 2023 • Ting-Hsuan Liao, Songwei Ge, Yiran Xu, Yao-Chih Lee, Badour AlBahar, Jia-Bin Huang

There has been tremendous progress in large-scale text-to-image synthesis driven by diffusion models enabling versatile downstream applications such as 3D object synthesis from texts, image editing, and customized generation.

Image Generation Text to 3D


In-N-Out: Faithful 3D GAN Inversion with Volumetric Decomposition for Face Editing

no code implementations • 9 Feb 2023 • Yiran Xu, Zhixin Shu, Cameron Smith, Seoung Wug Oh, Jia-Bin Huang

3D-aware GANs offer new capabilities for view synthesis while preserving the editing functionalities of their 2D counterparts.


Shape-aware Text-driven Layered Video Editing

no code implementations • CVPR 2023 • Yao-Chih Lee, Ji-Ze Genevieve Jang, Yi-Ting Chen, Elizabeth Qiu, Jia-Bin Huang

Temporal consistency is essential for video editing applications.

Video Editing


HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling

1 code implementation • CVPR 2023 • Benjamin Attal, Jia-Bin Huang, Christian Richardt, Michael Zollhoefer, Johannes Kopf, Matthew O'Toole, Changil Kim

Volumetric scene representations enable photorealistic view synthesis for static scenes and form the basis of several existing 6-DoF video techniques.

Ranked #1 on Novel View Synthesis on DONeRF: Evaluation Dataset

Novel View Synthesis Vocal Bursts Intensity Prediction

477


Robust Dynamic Radiance Fields

1 code implementation • CVPR 2023 • Yu-Lun Liu, Chen Gao, Andreas Meuleman, Hung-Yu Tseng, Ayush Saraf, Changil Kim, Yung-Yu Chuang, Johannes Kopf, Jia-Bin Huang

Dynamic radiance field reconstruction methods aim to model the time-varying structure and appearance of a dynamic scene.

195


3D Motion Magnification: Visualizing Subtle Motions from Time-Varying Radiance Fields

no code implementations • ICCV 2023 • Brandon Y. Feng, Hadi AlZayer, Michael Rubinstein, William T. Freeman, Jia-Bin Huang

Motion magnification helps us visualize subtle, imperceptible motion.

Motion Magnification


DC2: Dual-Camera Defocus Control by Learning To Refocus

no code implementations • CVPR 2023 • Hadi AlZayer, Abdullah Abuolaim, Leung Chun Chan, Yang Yang, Ying Chen Lou, Jia-Bin Huang, Abhishek Kar

Smartphone cameras today are increasingly approaching the versatility and quality of professional cameras through a combination of hardware and software advancements.

Deblurring


ClimateNeRF: Extreme Weather Synthesis in Neural Radiance Field

no code implementations • ICCV 2023 • Yuan Li, Zhi-Hao Lin, David Forsyth, Jia-Bin Huang, Shenlong Wang

Physical simulations produce excellent predictions of weather effects.

Neural Rendering Physical Simulations


AMICO: Amodal Instance Composition

no code implementations • 11 Oct 2022 • Peiye Zhuang, Jia-Bin Huang, Ayush Saraf, Xuejian Rong, Changil Kim, Denis Demandolx

Image composition aims to blend multiple objects to form a harmonized image.

Object


Temporally Consistent Semantic Video Editing

no code implementations • 21 Jun 2022 • Yiran Xu, Badour AlBahar, Jia-Bin Huang

Generative adversarial networks (GANs) have demonstrated impressive image generation quality and semantic editing capability of real images, e.g., changing object classes, modifying attributes, or transferring styles.

Image Generation Video Editing


Learning Dynamic View Synthesis With Few RGBD Cameras

no code implementations • 22 Apr 2022 • Shengze Wang, Youngjoong Kwon, Yuan Shen, Qian Zhang, Andrei State, Jia-Bin Huang, Henry Fuchs

Experiments on the HTI dataset show that our method outperforms the baseline in per-frame image fidelity and spatial-temporal consistency.

Novel View Synthesis


Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer

1 code implementation • 7 Apr 2022 • Songwei Ge, Thomas Hayes, Harry Yang, Xi Yin, Guan Pang, David Jacobs, Jia-Bin Huang, Devi Parikh

Videos are created to express emotion, exchange information, and share experiences.

Ranked #15 on Video Generation on UCF-101

Video Generation

237


Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature

1 code implementation • CVPR 2022 • Zhixiang Wang, Xiang Ji, Jia-Bin Huang, Shin'ichi Satoh, Xiao Zhou, Yinqiang Zheng

In this paper, we investigate using rolling shutter with a global reset feature (RSGR) to restore clean global shutter (GS) videos.

Image-to-Image Translation Motion Estimation


Learning Instance-Specific Adaptation for Cross-Domain Segmentation

no code implementations • 30 Mar 2022 • Yuliang Zou, Zizhao Zhang, Chun-Liang Li, Han Zhang, Tomas Pfister, Jia-Bin Huang

We propose a test-time adaptation method for cross-domain image segmentation.

Data Augmentation Domain Generalization +5


Learning Neural Light Fields With Ray-Space Embedding

no code implementations • CVPR 2022 • Benjamin Attal, Jia-Bin Huang, Michael Zollhöfer, Johannes Kopf, Changil Kim

Our method supports rendering with a single network evaluation per pixel for small baseline light fields and with only a few evaluations per pixel for light fields with larger baselines.


Boosting View Synthesis With Residual Transfer

no code implementations • CVPR 2022 • Xuejian Rong, Jia-Bin Huang, Ayush Saraf, Changil Kim, Johannes Kopf

We present a simple but effective technique to boost the rendering quality, which can be easily integrated with most view synthesis methods.

Novel View Synthesis


Learning Neural Light Fields with Ray-Space Embedding Networks

1 code implementation • 2 Dec 2021 • Benjamin Attal, Jia-Bin Huang, Michael Zollhoefer, Johannes Kopf, Changil Kim

Our method supports rendering with a single network evaluation per pixel for small baseline light field datasets and can also be applied to larger baselines with only a few evaluations per pixel.

180


Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

no code implementations • 13 Sep 2021 • Badour AlBahar, Jingwan Lu, Jimei Yang, Zhixin Shu, Eli Shechtman, Jia-Bin Huang

We present an algorithm for re-rendering a person from a single image under arbitrary poses.

Image Generation


Dynamic View Synthesis from Dynamic Monocular Video

1 code implementation • ICCV 2021 • Chen Gao, Ayush Saraf, Johannes Kopf, Jia-Bin Huang

We present an algorithm for generating novel views at arbitrary viewpoints and any input time step given a monocular video of a dynamic scene.

208


DropLoss for Long-Tail Instance Segmentation

1 code implementation • 13 Apr 2021 • Ting-I Hsieh, Esther Robb, Hwann-Tzong Chen, Jia-Bin Huang

Based on this insight, we develop DropLoss -- a novel adaptive loss to compensate for this imbalance without a trade-off between rare and frequent categories.

Instance Segmentation object-detection +3


Learning Representational Invariances for Data-Efficient Action Recognition

1 code implementation • 30 Mar 2021 • Yuliang Zou, Jinwoo Choi, Qitong Wang, Jia-Bin Huang

Data augmentation is a ubiquitous technique for improving image classification when labeled data is scarce.

Action Recognition Data Augmentation +1


Hybrid Neural Fusion for Full-frame Video Stabilization

2 code implementations • ICCV 2021 • Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

Existing video stabilization methods often generate visible distortion or require aggressive cropping of frame boundaries, resulting in a smaller field of view.

Video Stabilization

519


Robust Consistent Video Depth Estimation

1 code implementation • CVPR 2021 • Johannes Kopf, Xuejian Rong, Jia-Bin Huang

We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.

Depth Estimation


Portrait Neural Radiance Fields from a Single Image

no code implementations • 10 Dec 2020 • Chen Gao, YiChang Shih, Wei-Sheng Lai, Chia-Kai Liang, Jia-Bin Huang

We present a method for estimating Neural Radiance Fields (NeRF) from a single headshot portrait.

Meta-Learning


Space-time Neural Irradiance Fields for Free-Viewpoint Video

no code implementations • CVPR 2021 • Wenqi Xian, Jia-Bin Huang, Johannes Kopf, Changil Kim

We present a method that learns a spatiotemporal neural irradiance field for dynamic scenes from a single video.

Depth Estimation


Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors

1 code implementation • 2 Nov 2020 • Qi Mao, Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang, Siwei Ma, Ming-Hsuan Yang

Generating a smooth sequence of intermediate results bridges the gap of two different domains, facilitating the morphing effect across domains.

Attribute Image-to-Image Translation +1


Few-Shot Adaptation of Generative Adversarial Networks

1 code implementation • 22 Oct 2020 • Esther Robb, Wen-Sheng Chu, Abhishek Kumar, Jia-Bin Huang

We validate our method in a challenging few-shot setting of 5-100 images in the target domain.

Image Generation

119


PseudoSeg: Designing Pseudo Labels for Semantic Segmentation

2 code implementations • ICLR 2021 • Yuliang Zou, Zizhao Zhang, Han Zhang, Chun-Liang Li, Xiao Bian, Jia-Bin Huang, Tomas Pfister

We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.

Ranked #5 on Semi-Supervised Semantic Segmentation on COCO 1/32 labeled

Data Augmentation Image Classification +2

160


Flow-edge Guided Video Completion

1 code implementation • ECCV 2020 • Chen Gao, Ayush Saraf, Jia-Bin Huang, Johannes Kopf

We present a new flow-based video completion algorithm.

Ranked #4 on Video Inpainting on DAVIS

Video Inpainting

1,544


NAS-DIP: Learning Deep Image Prior with Neural Architecture Search

1 code implementation • ECCV 2020 • Yun-Chun Chen, Chen Gao, Esther Robb, Jia-Bin Huang

Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior for solving various inverse image restoration tasks.

Image Restoration Image-to-Image Translation +2

126


DRG: Dual Relation Graph for Human-Object Interaction Detection

1 code implementation • ECCV 2020 • Chen Gao, Jiarui Xu, Yuliang Zou, Jia-Bin Huang

We tackle the challenging problem of human-object interaction (HOI) detection.

Ranked #26 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection Object +1


Semantic View Synthesis

1 code implementation • ECCV 2020 • Hsin-Ping Huang, Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang

We tackle a new problem of semantic view synthesis -- generating free-viewpoint rendering of a synthesized scene using a semantic label map as input.

Image Generation


Learning to See Through Obstructions with Layered Decomposition

1 code implementation • 11 Aug 2020 • Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions, or adherent raindrops, from a short sequence of images captured by a moving camera.

Optical Flow Estimation


Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling

no code implementations • ECCV 2020 • Yuliang Zou, Pan Ji, Quoc-Huy Tran, Jia-Bin Huang, Manmohan Chandraker

Monocular visual odometry (VO) suffers severely from error accumulation during frame-to-frame pose estimation.

Monocular Visual Odometry Pose Estimation +2


FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning

2 code implementations • ECCV 2020 • Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira

Recent state-of-the-art semi-supervised learning (SSL) methods use a combination of image-based transformations and consistency regularization as core components.

Ranked #1 on Semi-Supervised Image Classification on Mini-ImageNet, 10000 Labels

Clustering Data Augmentation +1


Instance-aware Image Colorization

2 code implementations • CVPR 2020 • Jheng-Wei Su, Hung-Kuo Chu, Jia-Bin Huang

Previous methods leverage the deep neural network to map input grayscale images to plausible color outputs directly.

Ranked #2 on Point-interactive Image Colorization on CUB-200-2011 (using extra training data)

Image Colorization Object +1

706


Consistent Video Depth Estimation

3 code implementations • 30 Apr 2020 • Xuan Luo, Jia-Bin Huang, Richard Szeliski, Kevin Matzen, Johannes Kopf

We present an algorithm for reconstructing dense, geometrically consistent depth for all pixels in a monocular video.

Depth Estimation Monocular Reconstruction

1,579


3D Photography using Context-aware Layered Depth Inpainting

1 code implementation • CVPR 2020 • Meng-Li Shih, Shih-Yang Su, Johannes Kopf, Jia-Bin Huang

We propose a method for converting a single RGB-D input image into a 3D photo - a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view.

Novel View Synthesis

6,825


Learning to See Through Obstructions

1 code implementation • CVPR 2020 • Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions or raindrops, from a short sequence of images captured by a moving camera.

Optical Flow Estimation Reflection Removal

996


Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline

1 code implementation • CVPR 2020 • Yu-Lun Liu, Wei-Sheng Lai, Yu-Sheng Chen, Yi-Lung Kao, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

We model the HDR-to-LDR image formation pipeline as (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization.

Ranked #3 on Inverse-Tone-Mapping on MSU HDR Video Reconstruction Benchmark

HDR Reconstruction Inverse-Tone-Mapping +2

521
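
The three-stage HDR-to-LDR formation model listed in the abstract above (clipping, camera response, quantization) can be written as a few lines of arithmetic. This is a minimal sketch under stated assumptions: the name `hdr_to_ldr`, the `exposure` scale, and the use of a simple gamma curve as a stand-in for the camera response function are illustrative choices, since real response curves vary by camera and the paper's own formulation is not reproduced here.

```python
def hdr_to_ldr(radiance, exposure=1.0, gamma=2.2, bits=8):
    """Simulate LDR pixel values from linear HDR radiance values.

    radiance: list of non-negative floats (linear scene radiance).
    exposure: scale applied before clipping (assumed parameter).
    gamma:    exponent of the stand-in camera response curve (assumption;
              real cameras use device-specific response functions).
    bits:     bit depth of the quantized output.
    """
    levels = (1 << bits) - 1
    ldr = []
    for value in radiance:
        # (1) Dynamic range clipping to [0, 1].
        clipped = min(max(value * exposure, 0.0), 1.0)
        # (2) Non-linear mapping by the (assumed) camera response function.
        mapped = clipped ** (1.0 / gamma)
        # (3) Quantization to integer levels.
        ldr.append(round(mapped * levels))
    return ldr
```

For example, `hdr_to_ldr([0.0, 0.5, 1.0, 5.0])` maps both `1.0` and `5.0` to the same top level, which shows the information lost to clipping that an HDR reconstruction method has to recover.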


Deep Semantic Matching with Foreground Detection and Cycle-Consistency

no code implementations • 31 Mar 2020 • Yun-Chun Chen, Po-Hsiang Huang, Li-Yu Yu, Jia-Bin Huang, Ming-Hsuan Yang, Yen-Yu Lin

Establishing dense semantic correspondences between object instances remains a challenging problem due to background clutter, significant scale and pose differences, and large intra-class variations.


Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation

1 code implementation • ICLR 2020 • Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang, Ming-Hsuan Yang

Few-shot classification aims to recognize novel categories with only a few labeled images in each class.

Ranked #6 on Cross-Domain Few-Shot on CUB

Classification Cross-Domain Few-Shot +2

318


CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency

no code implementations • CVPR 2019 • Yun-Chun Chen, Yen-Yu Lin, Ming-Hsuan Yang, Jia-Bin Huang

Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another (e.g., synthetic to real images).

Data Augmentation Image-to-Image Translation +3


Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition

1 code implementation • NeurIPS 2019 • Jinwoo Choi, Chen Gao, Joseph C. E. Messou, Jia-Bin Huang

We validate the effectiveness of our method by transferring our pre-trained model to three different tasks, including action classification, temporal localization, and spatio-temporal action detection.

Action Classification Action Detection +4


Guided Image-to-Image Translation with Bi-Directional Feature Transformation

1 code implementation • ICCV 2019 • Badour AlBahar, Jia-Bin Huang

We address the problem of guided image-to-image translation where we translate an input image into another while respecting the constraints provided by an external, user-provided guidance image.

Ranked #1 on Image Reconstruction on Edge-to-Clothes

Image-to-Image Translation Pose Transfer +1

195


Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation

1 code implementation • 13 Jun 2019 • Yun-Chun Chen, Yen-Yu Lin, Ming-Hsuan Yang, Jia-Bin Huang

In contrast to existing algorithms that tackle the tasks of semantic matching and object co-segmentation in isolation, our method exploits the complementary nature of the two tasks.

Object Segmentation +1


Manifold Graph with Learned Prototypes for Semi-Supervised Image Classification

no code implementations • 12 Jun 2019 • Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira

We then show that when combined with these regularizers, the proposed method facilitates the propagation of information from generated prototypes to image data to further improve results.

Classification General Classification +1


DRIT++: Diverse Image-to-Image Translation via Disentangled Representations

4 code implementations • 2 May 2019 • Hsin-Ying Lee, Hung-Yu Tseng, Qi Mao, Jia-Bin Huang, Yu-Ding Lu, Maneesh Singh, Ming-Hsuan Yang

In this work, we present an approach based on disentangled representation for generating diverse outputs without paired training images.

Attribute Image-to-Image Translation +2

832


A Closer Look at Few-shot Classification

13 code implementations • ICLR 2019 • Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang

Few-shot classification aims to learn a classifier to recognize unseen classes during training with limited labeled examples.

Ranked #4 on Few-Shot Image Classification on Dirichlet CUB-200 (5-way, 5-shot)

Domain Generalization Few-Shot Image Classification +2

1,107


Deep Paper Gestalt

2 code implementations • 20 Dec 2018 • Jia-Bin Huang

Recent years have witnessed a significant increase in the number of paper submissions to computer vision conferences.

437


DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency

1 code implementation • ECCV 2018 • Yuliang Zou, Zelun Luo, Jia-Bin Huang

We present an unsupervised learning framework for simultaneously training single-view depth prediction and optical flow estimation models using unlabeled video sequences.

Depth And Camera Motion Depth Prediction +1

210


Unsupervised Video Object Segmentation using Motion Saliency-Guided Spatio-Temporal Propagation

no code implementations • ECCV 2018 • Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing

We even demonstrate competitive results comparable to deep learning based methods in the semi-supervised setting on the DAVIS dataset.

Ranked #3 on Video Salient Object Detection on DAVSOD-Difficult20 (using extra training data)

Optical Flow Estimation Saliency Prediction +6


VideoMatch: Matching based Video Object Segmentation

no code implementations • ECCV 2018 • Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing

Due to the formulation as a prediction task, most of these methods require fine-tuning during test time, such that the deep nets memorize the appearance of the objects of interest in the given video.

Ranked #68 on Semi-Supervised Video Object Segmentation on DAVIS 2017 (val)

Memorization Object +4


iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection

4 code implementations • 30 Aug 2018 • Chen Gao, Yuliang Zou, Jia-Bin Huang

Our core idea is that the appearance of a person or an object instance contains informative cues on which relevant parts of an image to attend to for facilitating interaction prediction.

Ranked #2 on Human-Object Interaction Detection on Ambiguious-HOI

Human-Object Interaction Detection Object

259


Diverse Image-to-Image Translation via Disentangled Representations

7 code implementations • ECCV 2018 • Hsin-Ying Lee, Hung-Yu Tseng, Jia-Bin Huang, Maneesh Kumar Singh, Ming-Hsuan Yang

Our model takes the encoded content features extracted from a given input and the attribute vectors sampled from the attribute space to produce diverse outputs at test time.

Ranked #4 on Multimodal Unsupervised Image-To-Image Translation on CelebA-HQ

Attribute Domain Adaptation +4

832


Learning Blind Video Temporal Consistency

1 code implementation • ECCV 2018 • Wei-Sheng Lai, Jia-Bin Huang, Oliver Wang, Eli Shechtman, Ersin Yumer, Ming-Hsuan Yang

Our method takes the original unprocessed and per-frame processed videos as inputs to produce a temporally consistent video.

Colorization Image-to-Image Translation +4

403


DeepMVS: Learning Multi-view Stereopsis

1 code implementation • CVPR 2018 • Po-Han Huang, Kevin Matzen, Johannes Kopf, Narendra Ahuja, Jia-Bin Huang

We present DeepMVS, a deep convolutional neural network (ConvNet) for multi-view stereo reconstruction.

328


MaskRNN: Instance Level Video Object Segmentation

no code implementations • NeurIPS 2017 • Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing

Instance level video object segmentation is an important technique for video editing and compression.

Object Segmentation +4


Semi-Supervised Learning for Optical Flow with Generative Adversarial Networks

no code implementations • NeurIPS 2017 • Wei-Sheng Lai, Jia-Bin Huang, Ming-Hsuan Yang

Convolutional neural networks (CNNs) have recently been applied to the optical flow estimation problem.

Generative Adversarial Network Optical Flow Estimation

Paper

Progressive Representation Adaptation for Weakly Supervised Object Localization

1 code implementation • 12 Oct 2017 • Dong Li, Jia-Bin Huang, Ya-Li Li, Shengjin Wang, Ming-Hsuan Yang

In classification adaptation, we transfer a pre-trained network to a multi-label classification task for recognizing the presence of a certain object in an image.

Classification General Classification +4

Paper
Code

Joint Image Filtering with Deep Convolutional Networks

no code implementations • 11 Oct 2017 • Yijun Li, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang

In contrast to existing methods that consider only the guidance image, the proposed algorithm can selectively transfer salient structures that are consistent with both guidance and target images.

Paper

Tracking Persons-of-Interest via Unsupervised Representation Adaptation

2 code implementations • 5 Oct 2017 • Shun Zhang, Jia-Bin Huang, Jongwoo Lim, Yihong Gong, Jinjun Wang, Narendra Ahuja, Ming-Hsuan Yang

Multi-face tracking in unconstrained videos is a challenging problem as faces of one person often appear drastically different in multiple shots due to significant variations in scale, pose, expression, illumination, and make-up.

Clustering

Paper
Code

Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks

7 code implementations • 4 Oct 2017 • Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang

However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results.

Image Reconstruction Image Super-Resolution

213

Paper
Code

Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight

2 code implementations • 2 Oct 2017 • Yen-Chen Lin, Ming-Yu Liu, Min Sun, Jia-Bin Huang

Our core idea is that adversarial examples targeting a neural network-based policy are not effective against the frame prediction model.

Autonomous Vehicles Decision Making +2

Paper
Code

Unsupervised Representation Learning by Sorting Sequences

1 code implementation • ICCV 2017 • Hsin-Ying Lee, Jia-Bin Huang, Maneesh Singh, Ming-Hsuan Yang

We present an unsupervised representation learning approach using videos without semantic labels.

Ranked #46 on Self-Supervised Action Recognition on HMDB51

Image Classification Object Detection +4

Paper
Code

Robust Visual Tracking via Hierarchical Convolutional Features

1 code implementation • 12 Jul 2017 • Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

Specifically, we learn adaptive correlation filters on the outputs from each convolutional layer to encode the target appearance.

Object Recognition Visual Tracking

Paper
Code

Adaptive Correlation Filters with Long-Term and Short-Term Memory for Object Tracking

1 code implementation • 7 Jul 2017 • Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

Second, we learn a correlation filter over a feature pyramid centered at the estimated target position for predicting scale changes.

Object Tracking Position

154

Paper
Code

Removing Rain From Single Images via a Deep Detail Network

no code implementations • CVPR 2017 • Xueyang Fu, Jia-Bin Huang, Delu Zeng, Yue Huang, Xinghao Ding, John Paisley

We propose a new deep network architecture for removing rain streaks from individual images based on the deep convolutional neural network (CNN).

Denoising Rain Removal

Paper

Learning Structured Semantic Embeddings for Visual Recognition

no code implementations • 5 Jun 2017 • Dong Li, Hsin-Ying Lee, Jia-Bin Huang, Shengjin Wang, Ming-Hsuan Yang

First, we exploit the discriminative constraints to capture the intra- and inter-class relationships of image embeddings.

General Classification Multi-Label Classification +2

Paper

Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution

1 code implementation • CVPR 2017 • Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang

Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image super-resolution.

Ranked #40 on Image Super-Resolution on BSD100 - 4x upscaling

Image Super-Resolution

Paper
Code

Clearing the Skies: A deep network architecture for single-image rain removal

2 code implementations • 7 Sep 2016 • Xueyang Fu, Jia-Bin Huang, Xinghao Ding, Yinghao Liao, John Paisley

We introduce a deep network architecture called DerainNet for removing rain streaks from an image.

Ranked #11 on Single Image Deraining on Test100 (SSIM metric)

Image Enhancement Single Image Deraining

Paper
Code

A Comparative Study for Single Image Blind Deblurring

no code implementations • CVPR 2016 • Wei-Sheng Lai, Jia-Bin Huang, Zhe Hu, Narendra Ahuja, Ming-Hsuan Yang

Using these datasets, we conduct a large-scale user study to quantify the performance of several representative state-of-the-art blind deblurring algorithms.

Single-Image Blind Deblurring

Paper

Detecting Migrating Birds at Night

no code implementations • CVPR 2016 • Jia-Bin Huang, Rich Caruana, Andrew Farnsworth, Steve Kelling, Narendra Ahuja

In this paper, we present a vision-based system for detecting migrating birds in flight at night.

Paper

Weakly Supervised Object Localization With Progressive Domain Adaptation

no code implementations • CVPR 2016 • Dong Li, Jia-Bin Huang, Ya-Li Li, Shengjin Wang, Ming-Hsuan Yang

In this paper, we address this problem by progressive domain adaptation with two main steps: classification adaptation and detection adaptation.

Classification Domain Adaptation +5

Paper

Hierarchical Convolutional Features for Visual Tracking

no code implementations • ICCV 2015 • Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

The outputs of the last convolutional layers encode the semantic information of targets, and such representations are robust to significant appearance variations.

Object Recognition Visual Object Tracking +1

Paper

Single Image Super-Resolution From Transformed Self-Exemplars

no code implementations • CVPR 2015 • Jia-Bin Huang, Abhishek Singh, Narendra Ahuja

However, the internal dictionary obtained from the given image may not always be sufficiently expressive to cover the textural appearance variations in the scene.

Image Super-Resolution

Paper
