Search Results for author: Yi-Hsuan Tsai

Found 53 papers, 29 papers with code

Colorization of Depth Map via Disentanglement

1 code implementation ECCV 2020 Chung-Sheng Lai, Zunzhi You, Ching-Chun Huang, Yi-Hsuan Tsai, Wei-Chen Chiu

Vision perception is one of the most important components for a computer or robot to understand the surrounding scene and achieve autonomous applications.

Colorization Disentanglement

360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning

no code implementations 24 Oct 2022 Bolivar Solarte, Chin-Hsuan Wu, Yueh-Cheng Liu, Yi-Hsuan Tsai, Min Sun

In addition, since ground truth annotations are available neither during training nor in testing, we leverage the entropy information in multiple layout estimations as a quantitative metric to measure the geometry consistency of the scene, allowing us to evaluate any layout estimator for hyper-parameter tuning, including model selection without ground truth annotations.

Model Selection Pseudo Label

3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling

1 code implementation 19 Sep 2022 Yu-Ting Yen, Chia-Ni Lu, Wei-Chen Chiu, Yi-Hsuan Tsai

In this paper, we develop a domain adaptation framework via generating reliable pseudo ground truths of depth from real data to provide direct supervisions.

Monocular Depth Estimation Point Cloud Completion +1

BiFuse++: Self-supervised and Efficient Bi-projection Fusion for 360 Depth Estimation

1 code implementation 7 Sep 2022 Fu-En Wang, Yu-Hsuan Yeh, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

Thus, state-of-the-art frameworks for monocular 360 depth estimation such as bi-projection fusion in BiFuse are proposed.

Monocular Depth Estimation

On Generalizing Beyond Domains in Cross-Domain Continual Learning

no code implementations CVPR 2022 Christian Simon, Masoud Faraki, Yi-Hsuan Tsai, Xiang Yu, Samuel Schulter, Yumin Suh, Mehrtash Harandi, Manmohan Chandraker

Humans have the ability to accumulate knowledge of new tasks in varying conditions, but deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.

Continual Learning Knowledge Distillation

Learning Semantic Segmentation from Multiple Datasets with Label Shifts

no code implementations 28 Feb 2022 Dongwan Kim, Yi-Hsuan Tsai, Yumin Suh, Masoud Faraki, Sparsh Garg, Manmohan Chandraker, Bohyung Han

First, a gradient conflict in training due to mismatched label spaces is identified and a class-independent binary cross-entropy loss is proposed to alleviate such label conflicts.

Semantic Segmentation

Self-Supervised Feature Learning from Partial Point Clouds via Pose Disentanglement

no code implementations 9 Jan 2022 Meng-Shiun Tsai, Pei-Ze Chiang, Yi-Hsuan Tsai, Wei-Chen Chiu

Self-supervised learning on point clouds has gained a lot of attention recently, since it addresses the label-efficiency and domain-gap problems on point cloud tasks.

Disentanglement Self-Supervised Learning

360-DFPE: Leveraging Monocular 360-Layouts for Direct Floor Plan Estimation

1 code implementation 12 Dec 2021 Bolivar Solarte, Yueh-Cheng Liu, Chin-Hsuan Wu, Yi-Hsuan Tsai, Min Sun

We present 360-DFPE, a sequential floor plan estimation method that directly takes 360-images as input without relying on active sensors or 3D information.

Visual Odometry

Semi-supervised Multi-task Learning for Semantics and Depth

no code implementations 14 Oct 2021 Yufeng Wang, Yi-Hsuan Tsai, Wei-Chih Hung, Wenrui Ding, Shuo Liu, Ming-Hsuan Yang

Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance.

Depth Estimation Multi-Task Learning +1

Towards Interpretable Deep Networks for Monocular Depth Estimation

1 code implementation ICCV 2021 Zunzhi You, Yi-Hsuan Tsai, Wei-Chen Chiu, Guanbin Li

Based on our observations, we quantify the interpretability of a deep MDE network by the depth selectivity of its hidden units.

Monocular Depth Estimation

End-to-end Multi-modal Video Temporal Grounding

1 code implementation NeurIPS 2021 Yi-Wen Chen, Yi-Hsuan Tsai, Ming-Hsuan Yang

Specifically, we adopt RGB images for appearance, optical flow for motion, and depth maps for image structure.

Optical Flow Estimation Self-Supervised Learning

LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering

no code implementations CVPR 2021 Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space.

Depth Estimation Depth Prediction +1

Robust 360-8PA: Redesigning The Normalized 8-point Algorithm for 360-FoV Images

1 code implementation 22 Apr 2021 Bolivar Solarte, Chin-Hsuan Wu, Kuan-Wei Lu, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

This paper presents a novel preconditioning strategy for the classic 8-point algorithm (8-PA) for estimating an essential matrix from 360-FoV images (i.e., equirectangular images) in spherical projection.

Understanding Synonymous Referring Expressions via Contrastive Features

1 code implementation 20 Apr 2021 Yi-Wen Chen, Yi-Hsuan Tsai, Ming-Hsuan Yang

While prior work usually treats each sentence and attends it to an object separately, we focus on learning a referring expression comprehension model that considers the property in synonymous sentences.

Referring Expression Referring Expression Comprehension +1

LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering

1 code implementation 1 Apr 2021 Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space.

Depth Estimation Depth Prediction +1

Cross-Domain Similarity Learning for Face Recognition in Unseen Domains

no code implementations CVPR 2021 Masoud Faraki, Xiang Yu, Yi-Hsuan Tsai, Yumin Suh, Manmohan Chandraker

Intuitively, it discriminatively correlates explicit metrics derived from one domain, with triplet samples from another domain in a unified loss function to be minimized within a network, which leads to better alignment of the training domains.

Face Recognition Metric Learning

Dual-Stream Fusion Network for Spatiotemporal Video Super-Resolution

1 code implementation Winter Conference on Applications of Computer Vision (WACV) 2021 Min-Yuan Tseng, Yen-Chung Chen, Yi-Lun Lee, Wei-Sheng Lai, Yi-Hsuan Tsai, Wei-Chen Chiu

Our method is based on an important observation: although a direct cascade of prior research in spatial and temporal super-resolution can achieve spatiotemporal upsampling, changing the order in which they are combined leads to results with a complementary property.

Image Super-Resolution Video Super-Resolution

Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector

1 code implementation ECCV 2020 Cheng-Chun Hsu, Yi-Hsuan Tsai, Yen-Yu Lin, Ming-Hsuan Yang

A domain adaptive object detector aims to adapt itself to unseen domains that may contain variations of object appearance, viewpoints or backgrounds.

Domain Adaptation

Object Detection with a Unified Label Space from Multiple Datasets

no code implementations ECCV 2020 Xiangyun Zhao, Samuel Schulter, Gaurav Sharma, Yi-Hsuan Tsai, Manmohan Chandraker, Ying Wu

To address this challenge, we design a framework which works with such partial annotations, and we exploit a pseudo labeling approach that we adapt for our specific case.

Object Detection

Learning to Caricature via Semantic Shape Transform

1 code implementation 12 Aug 2020 Wenqing Chu, Wei-Chih Hung, Yi-Hsuan Tsai, Yu-Ting Chang, Yijun Li, Deng Cai, Ming-Hsuan Yang

Caricature is an artistic drawing created to abstract or exaggerate facial features of a person.


Regularizing Meta-Learning via Gradient Dropout

1 code implementation 13 Apr 2020 Hung-Yu Tseng, Yi-Wen Chen, Yi-Hsuan Tsai, Sifei Liu, Yen-Yu Lin, Ming-Hsuan Yang

With the growing attention on learning-to-learn new tasks using only a few examples, meta-learning has been widely used in numerous problems such as few-shot classification, reinforcement learning, and domain generalization.

Domain Generalization Meta-Learning

LayoutMP3D: Layout Annotation of Matterport3D

1 code implementation 30 Mar 2020 Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

Inferring the information of 3D layout from a single equirectangular panorama is crucial for numerous applications of virtual reality or robotics (e.g., scene understanding and navigation).

Scene Understanding

Adversarial Learning of Privacy-Preserving and Task-Oriented Representations

no code implementations 22 Nov 2019 Taihong Xiao, Yi-Hsuan Tsai, Kihyuk Sohn, Manmohan Chandraker, Ming-Hsuan Yang

For instance, there could be a potential privacy risk of machine learning systems via the model inversion attack, whose goal is to reconstruct the input data from the latent representation of deep networks.

BIG-bench Machine Learning Perceptual Distance +1

360SD-Net: 360° Stereo Depth Estimation with Learnable Cost Volume

1 code implementation 11 Nov 2019 Ning-Hsu Wang, Bolivar Solarte, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

Recently, end-to-end trainable deep neural networks have significantly improved stereo depth estimation for perspective images.

Stereo Depth Estimation

Referring Expression Object Segmentation with Caption-Aware Consistency

1 code implementation 10 Oct 2019 Yi-Wen Chen, Yi-Hsuan Tsai, Tiantian Wang, Yen-Yu Lin, Ming-Hsuan Yang

To this end, we propose an end-to-end trainable comprehension network that consists of the language and visual encoders to extract feature representations from both domains.

Referring Expression Referring Expression Segmentation +1

Adaptation Across Extreme Variations using Unlabeled Domain Bridges

no code implementations 5 Jun 2019 Shuyang Dai, Kihyuk Sohn, Yi-Hsuan Tsai, Lawrence Carin, Manmohan Chandraker

We tackle an unsupervised domain adaptation problem for which the domain discrepancy between labeled source and unlabeled target domains is large, due to many factors of inter and intra-domain variation.

Object Recognition Semantic Segmentation +1

Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence

1 code implementation CVPR 2019 Hsueh-Ying Lai, Yi-Hsuan Tsai, Wei-Chen Chiu

In this paper, we propose a single and principled network to jointly learn spatiotemporal correspondence for stereo matching and flow estimation, with a newly designed geometric connection as the unsupervised signal for temporally adjacent stereo pairs.

Optical Flow Estimation Scene Understanding +2

Weakly-supervised Caricature Face Parsing through Domain Adaptation

1 code implementation 13 May 2019 Wenqing Chu, Wei-Chih Hung, Yi-Hsuan Tsai, Deng Cai, Ming-Hsuan Yang

However, current state-of-the-art face parsing methods require large amounts of pixel-level labeled data, and such a labeling process for caricatures is tedious and labor-intensive.

Caricature Domain Adaptation +2

Domain Adaptation for Structured Output via Disentangled Patch Representations

no code implementations ICLR 2019 Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker

To this end, we propose to learn discriminative feature representations of patches based on label histograms in the source domain, through the construction of a disentangled space.

Domain Adaptation Semantic Segmentation

Active Adversarial Domain Adaptation

no code implementations 16 Apr 2019 Jong-Chyi Su, Yi-Hsuan Tsai, Kihyuk Sohn, Buyu Liu, Subhransu Maji, Manmohan Chandraker

Our approach, active adversarial domain adaptation (AADA), explores a duality between two related problems: adversarial domain alignment and importance sampling for adapting models across domains.

Active Learning Domain Adaptation +2

3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization

1 code implementation 5 Apr 2019 Tsun-Hsuan Wang, Hou-Ning Hu, Chieh Hubert Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

The complementary characteristics of active and passive depth sensing techniques motivate the fusion of the LiDAR sensor and stereo camera for improved depth perception.

Depth Completion Stereo-LiDAR Fusion +2

Domain Adaptation for Structured Output via Discriminative Patch Representations

8 code implementations ICCV 2019 Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker

Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn supervised models like convolutional neural networks.

Domain Adaptation Semantic Segmentation +1

Unseen Object Segmentation in Videos via Transferable Representations

no code implementations 8 Jan 2019 Yi-Wen Chen, Yi-Hsuan Tsai, Chu-Ya Yang, Yen-Yu Lin, Ming-Hsuan Yang

The entire process is decomposed into two tasks: 1) solving a submodular function for selecting object-like segments, and 2) learning a CNN model with a transferable module for adapting seen categories in the source domain to the unseen target video.

Semantic Segmentation

Plug-and-Play: Improve Depth Estimation via Sparse Data Propagation

2 code implementations 20 Dec 2018 Tsun-Hsuan Wang, Fu-En Wang, Juan-Ting Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

We propose a novel plug-and-play (PnP) module for improving depth prediction by taking arbitrary patterns of sparse depths as input.

Depth Estimation Depth Prediction

Learning Video-Story Composition via Recurrent Neural Network

no code implementations 31 Jan 2018 Guangyu Zhong, Yi-Hsuan Tsai, Sifei Liu, Zhixun Su, Ming-Hsuan Yang

In this paper, we propose a learning-based method to compose a video-story from a group of video clips that describe an activity or experience.

Learning Binary Residual Representations for Domain-specific Video Streaming

no code implementations 14 Dec 2017 Yi-Hsuan Tsai, Ming-Yu Liu, Deqing Sun, Ming-Hsuan Yang, Jan Kautz

Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and they have to be compressed to a small size for low-latency transmission.

Video Compression

Scene Parsing with Global Context Embedding

1 code implementation ICCV 2017 Wei-Chih Hung, Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang

We present a scene parsing method that utilizes global context information based on both parametric and non-parametric models.

Scene Parsing

Learning to Segment Instances in Videos with Spatial Propagation Network

no code implementations 14 Sep 2017 Jingchun Cheng, Sifei Liu, Yi-Hsuan Tsai, Wei-Chih Hung, Shalini De Mello, Jinwei Gu, Jan Kautz, Shengjin Wang, Ming-Hsuan Yang

In addition, we apply a filter on the refined score map that aims to recognize the best connected region using spatial and temporal consistencies in the video.

Semantic Segmentation

Adaptive Region Pooling for Object Detection

no code implementations CVPR 2015 Yi-Hsuan Tsai, Onur C. Hamsici, Ming-Hsuan Yang

Learning models for object detection is a challenging problem due to the large intra-class variability of objects in appearance, viewpoints, and rigidity.

Object Detection
