Search Results for author: Wei-Chen Chiu

Found 50 papers, 26 papers with code

Colorization of Depth Map via Disentanglement

1 code implementation ECCV 2020 Chung-Sheng Lai, Zunzhi You, Ching-Chun Huang, Yi-Hsuan Tsai, Wei-Chen Chiu

Vision perception is one of the most important components for a computer or robot to understand the surrounding scene and achieve autonomous applications.

Colorization Disentanglement

A Recipe for CAC: Mosaic-based Generalized Loss for Improved Class-Agnostic Counting

no code implementations 15 Apr 2024 Tsung-Han Chou, Brian Wang, Wei-Chen Chiu, Jun-Cheng Chen

Class agnostic counting (CAC) is a vision task that can be used to count the total occurrence number of any given reference objects in the query image.

Benchmarking

MCPNet: An Interpretable Classifier via Multi-Level Concept Prototypes

no code implementations 13 Apr 2024 Bor-Shiun Wang, Chien-Yi Wang, Wei-Chen Chiu

Addressing this gap, we introduce the Multi-Level Concept Prototypes Classifier (MCPNet), an inherently interpretable model.

Classification Decision Making

MENTOR: Multilingual tExt detectioN TOward leaRning by analogy

no code implementations 12 Mar 2024 Hsin-Ju Lin, Tsu-Chun Chung, Ching-Chun Hsiao, Pin-Yu Chen, Wei-Chen Chiu, Ching-Chun Huang

Text detection is frequently used in vision-based mobile robots when they need to interpret texts in their surroundings to perform a given task.

Few-Shot Learning Scene Text Detection +2

Improving Robustness for Joint Optimization of Camera Poses and Decomposed Low-Rank Tensorial Radiance Fields

1 code implementation 20 Feb 2024 Bo-Yu Cheng, Wei-Chen Chiu, Yu-Lun Liu

In this paper, we propose an algorithm that allows joint refinement of camera pose and scene geometry represented by decomposed low-rank tensor, using only 2D images as supervision.

Novel View Synthesis

AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors

1 code implementation 26 Oct 2023 You-Ming Chang, Chen Yeh, Wei-Chen Chiu, Ning Yu

We formulate deepfake detection as a visual question answering problem, and tune soft prompts for InstructBLIP to distinguish whether a query image is real or fake.

DeepFake Detection Face Swapping +4

Skin the sheep not only once: Reusing Various Depth Datasets to Drive the Learning of Optical Flow

no code implementations 3 Oct 2023 Sheng-Chi Huang, Wei-Chen Chiu

Lastly, as the optical flow maps under different geometric augmentations actually exhibit distinct characteristics, an auxiliary classifier which trains to identify the type of augmentation from the appearance of the flow map is utilized to further enhance the learning of the optical flow estimator.

Depth Estimation Optical Flow Estimation +1

Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where

no code implementations 22 Sep 2023 Zhi-Yi Chin, Chieh-Ming Jiang, Ching-Chun Huang, Pin-Yu Chen, Wei-Chen Chiu

While image data starts to enjoy the simple-but-effective self-supervised learning scheme built upon masking and self-reconstruction objective thanks to the introduction of tokenization procedure and vision transformer backbone, convolutional neural networks as another important and widely-adopted architecture for image data, though having contrastive-learning techniques to drive the self-supervised learning, still face the difficulty of leveraging such straightforward and general masking operation to benefit their learning process significantly.

Contrastive Learning Self-Supervised Learning

Transformer-based Image Compression with Variable Image Quality Objectives

no code implementations 22 Sep 2023 Chia-Hao Kao, Yi-Hsin Chen, Cheng Chien, Wei-Chen Chiu, Wen-Hsiao Peng

This paper presents a Transformer-based image compression system that allows for a variable image quality objective according to the user's preference.

Image Compression

Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts

1 code implementation 12 Sep 2023 Zhi-Yi Chin, Chieh-Ming Jiang, Ching-Chun Huang, Pin-Yu Chen, Wei-Chen Chiu

In this work, we propose Prompting4Debugging (P4D) as a debugging and red-teaming tool that automatically finds problematic prompts for diffusion models to test the reliability of a deployed safety mechanism.

Transformer-based Variable-rate Image Compression with Region-of-interest Control

1 code implementation 18 May 2023 Chia-Hao Kao, Ying-Chieh Weng, Yi-Hsin Chen, Wei-Chen Chiu, Wen-Hsiao Peng

Our prompt generation networks generate content-adaptive tokens according to the input image, an ROI mask, and a rate parameter.

Image Compression

Multimodal Prompting with Missing Modalities for Visual Recognition

1 code implementation CVPR 2023 Yi-Lun Lee, Yi-Hsuan Tsai, Wei-Chen Chiu, Chen-Yu Lee

In this paper, we tackle two challenges in multimodal learning for visual recognition: 1) when missing-modality occurs either during training or testing in real-world situations; and 2) when the computation resources are not available to finetune on heavy transformer models.

Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations

no code implementations 10 Nov 2022 Sheng-Feng Yu, Wei-Chen Chiu

Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one, under the constraints of having limited system size and computational cost, in which the main challenge comes from the "catastrophic forgetting" issue -- the inability to well remember the learnt knowledge while learning the new ones.

Continual Learning Contrastive Learning

3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling

1 code implementation 19 Sep 2022 Yu-Ting Yen, Chia-Ni Lu, Wei-Chen Chiu, Yi-Hsuan Tsai

In this paper, we develop a domain adaptation framework via generating reliable pseudo ground truths of depth from real data to provide direct supervisions.

Monocular Depth Estimation Point Cloud Completion +1

Vector Quantized Image-to-Image Translation

no code implementations 27 Jul 2022 Yu-Jie Chen, Shin-I Cheng, Wei-Chen Chiu, Hung-Yu Tseng, Hsin-Ying Lee

For example, it provides style variability for image generation and extension, and equips image-to-image translation with further extension capabilities.

Image-to-Image Translation Quantization +1

Self-Supervised Feature Learning from Partial Point Clouds via Pose Disentanglement

no code implementations 9 Jan 2022 Meng-Shiun Tsai, Pei-Ze Chiang, Yi-Hsuan Tsai, Wei-Chen Chiu

Self-supervised learning on point clouds has gained a lot of attention recently, since it addresses the label-efficiency and domain-gap problems on point cloud tasks.

Disentanglement Self-Supervised Learning

Make an Omelette with Breaking Eggs: Zero-Shot Learning for Novel Attribute Synthesis

no code implementations 28 Nov 2021 Yu-Hsuan Li, Tzu-Yin Chao, Ching-Chun Huang, Pin-Yu Chen, Wei-Chen Chiu

Basically, given only a small set of detectors that are learned to recognize some manually annotated attributes (i.e., the seen attributes), we aim to synthesize the detectors of novel attributes in a zero-shot learning manner.

Attribute Classification +1

An Unsupervised Video Game Playstyle Metric via State Discretization

1 code implementation 3 Oct 2021 Chiu-Chou Lin, Wei-Chen Chiu, I-Chen Wu

In this paper, we propose the first metric for video game playstyles directly from the game observations and actions, without any prior specification on the playstyle in the target game.

Atari Games Car Racing +1

Towards Interpretable Deep Networks for Monocular Depth Estimation

1 code implementation ICCV 2021 Zunzhi You, Yi-Hsuan Tsai, Wei-Chen Chiu, Guanbin Li

Based on our observations, we quantify the interpretability of a deep MDE network by the depth selectivity of its hidden units.

Monocular Depth Estimation

Learning Facial Representations from the Cycle-consistency of Face

1 code implementation ICCV 2021 Jia-Ren Chang, Yong-Sheng Chen, Wei-Chen Chiu

The main idea of the facial motion cycle-consistency is that, given a face with expression, we can perform de-expression to a neutral face via the removal of facial motion and further perform re-expression to reconstruct back to the original face.

Face Reconstruction Facial Expression Recognition +3

MAML is a Noisy Contrastive Learner in Classification

1 code implementation ICLR 2022 Chia-Hsiang Kao, Wei-Chen Chiu, Pin-Yu Chen

Model-agnostic meta-learning (MAML) is one of the most popular and widely adopted meta-learning algorithms, achieving remarkable success in various learning problems.

Classification Few-Shot Learning

LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering

no code implementations CVPR 2021 Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space.

Depth Estimation Depth Prediction +1

RPG: Learning Recursive Point Cloud Generation

no code implementations 29 May 2021 Wei-Jan Ko, Hui-Yu Huang, Yu-Liang Kuo, Chen-Yi Chiu, Li-Heng Wang, Wei-Chen Chiu

In this paper we propose a novel point cloud generator that is able to reconstruct and generate 3D point clouds composed of semantic parts.

Point Cloud Generation Segmentation +1

Stylizing 3D Scene via Implicit Representation and HyperNetwork

no code implementations 27 May 2021 Pei-Ze Chiang, Meng-Shiun Tsai, Hung-Yu Tseng, Wei-Sheng Lai, Wei-Chen Chiu

Our framework consists of two components: an implicit representation of the 3D scene with the neural radiance fields model, and a hypernetwork to transfer the style information into the scene representation.

Novel View Synthesis Style Transfer +1

Robust 360-8PA: Redesigning The Normalized 8-point Algorithm for 360-FoV Images

1 code implementation 22 Apr 2021 Bolivar Solarte, Chin-Hsuan Wu, Kuan-Wei Lu, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

This paper presents a novel preconditioning strategy for the classic 8-point algorithm (8-PA) for estimating an essential matrix from 360-FoV images (i.e., equirectangular images) in spherical projection.

LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering

1 code implementation 1 Apr 2021 Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space.

3D Room Layouts From A Single RGB Panorama Depth Estimation +2

Bridging the Visual Gap: Wide-Range Image Blending

1 code implementation CVPR 2021 Chia-Ni Lu, Ya-Chu Chang, Wei-Chen Chiu

In this paper we propose a new problem scenario in image processing, wide-range image blending, which aims to smoothly merge two different input photos into a panorama by generating novel image content for the intermediate region between them.

Image Inpainting Image Outpainting

Domain Adaptation for Learning Generator from Paired Few-Shot Data

no code implementations 25 Feb 2021 Chun-Chih Teng, Pin-Yu Chen, Wei-Chen Chiu

We propose a Paired Few-shot GAN (PFS-GAN) model for learning generators with sufficient source data and a few target data.

Domain Adaptation Few-Shot Learning

Dual-Stream Fusion Network for Spatiotemporal Video Super-Resolution

1 code implementation Winter Conference on Applications of Computer Vision (WACV) 2021 Min-Yuan Tseng, Yen-Chung Chen, Yi-Lun Lee, Wei-Sheng Lai, Yi-Hsuan Tsai, Wei-Chen Chiu

Our method is based on an important observation that: even the direct cascade of prior research in spatial and temporal super-resolution can achieve the spatiotemporal upsampling, changing orders for combining them would lead to results with a complementary property.

Image Super-Resolution Video Super-Resolution

Spectral Analysis for Semantic Segmentation with Applications on Feature Truncation and Weak Annotation

no code implementations 28 Dec 2020 Li-Wei Chen, Wei-Chen Chiu, Chin-Tien Wu

We propose a spectral analysis to investigate the correlations among the resolution of the down sampled grid, the loss function and the accuracy of the SSNNs.

Network Pruning Segmentation +1

Benefiting Deep Latent Variable Models via Learning the Prior and Removing Latent Regularization

no code implementations 7 Jul 2020 Rogan Morrow, Wei-Chen Chiu

There exist many forms of deep latent variable models, such as the variational autoencoder and adversarial autoencoder.

Disentanglement Image-to-Image Translation +1

Variational Autoencoders with Normalizing Flow Decoders

no code implementations 12 Apr 2020 Rogan Morrow, Wei-Chen Chiu

Recently proposed normalizing flow models such as Glow have been shown to be able to generate high quality, high dimensional images with relatively fast sampling speed.

LayoutMP3D: Layout Annotation of Matterport3D

1 code implementation 30 Mar 2020 Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

Inferring the information of 3D layout from a single equirectangular panorama is crucial for numerous applications of virtual reality or robotics (e.g., scene understanding and navigation).

Scene Understanding

360SD-Net: 360° Stereo Depth Estimation with Learnable Cost Volume

1 code implementation 11 Nov 2019 Ning-Hsu Wang, Bolivar Solarte, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

Recently, end-to-end trainable deep neural networks have significantly improved stereo depth estimation for perspective images.

Stereo Depth Estimation

Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence

1 code implementation CVPR 2019 Hsueh-Ying Lai, Yi-Hsuan Tsai, Wei-Chen Chiu

In this paper, we propose a single and principled network to jointly learn spatiotemporal correspondence for stereo matching and flow estimation, with a newly designed geometric connection as the unsupervised signal for temporally adjacent stereo pairs.

Optical Flow Estimation Scene Understanding +2

3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization

1 code implementation 5 Apr 2019 Tsun-Hsuan Wang, Hou-Ning Hu, Chieh Hubert Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

The complementary characteristics of active and passive depth sensing techniques motivate the fusion of the LiDAR sensor and stereo camera for improved depth perception.

Depth Completion Stereo-LiDAR Fusion +2

All about Structure: Adapting Structural Information across Domains for Boosting Semantic Segmentation

1 code implementation CVPR 2019 Wei-Lun Chang, Hui-Po Wang, Wen-Hsiao Peng, Wei-Chen Chiu

In this paper we tackle the problem of unsupervised domain adaptation for the task of semantic segmentation, where we attempt to transfer the knowledge learned upon synthetic datasets with ground-truth labels to real-world images without any annotation.

Segmentation Semantic Segmentation +3

Plug-and-Play: Improve Depth Estimation via Sparse Data Propagation

2 code implementations 20 Dec 2018 Tsun-Hsuan Wang, Fu-En Wang, Juan-Ting Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

We propose a novel plug-and-play (PnP) module for improving depth prediction by taking arbitrary patterns of sparse depths as input.

Depth Estimation Depth Prediction

Summarizing First-Person Videos from Third Persons' Points of View

no code implementations ECCV 2018 Hsuan-I Ho, Wei-Chen Chiu, Yu-Chiang Frank Wang

Video highlight or summarization is among interesting topics in computer vision, which benefits a variety of applications like viewing, searching, or storage.

Summarizing First-Person Videos from Third Persons' Points of Views

no code implementations ECCV 2018 Hsuan-I Ho, Wei-Chen Chiu, Yu-Chiang Frank Wang

Video highlight or summarization is among interesting topics in computer vision, which benefits a variety of applications like viewing, searching, or storage.

Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation

no code implementations CVPR 2018 Yen-Cheng Liu, Yu-Ying Yeh, Tzu-Chien Fu, Sheng-De Wang, Wei-Chen Chiu, Yu-Chiang Frank Wang

While representation learning aims to derive interpretable features for describing visual data, representation disentanglement further results in such features so that particular image attributes can be identified and manipulated.

Attribute Disentanglement +2

See the Difference: Direct Pre-Image Reconstruction and Pose Estimation by Differentiating HOG

no code implementations ICCV 2015 Wei-Chen Chiu, Mario Fritz

The Histogram of Oriented Gradient (HOG) descriptor has led to many advances in computer vision over the last decade and is still part of many state of the art approaches.

Image Reconstruction Pose Estimation

Multi-class Video Co-segmentation with a Generative Multi-video Model

no code implementations CVPR 2013 Wei-Chen Chiu, Mario Fritz

This is a clear mismatch to the challenges that we are facing with videos from online resources or consumer videos.

Segmentation Video Segmentation +1
