Search Results for author: Cheng-Hao Kuo

Found 11 papers, 4 papers with code

No More Ambiguity in 360° Room Layout via Bi-Layout Estimation

no code implementations • 15 Apr 2024 • Yu-Ju Tsai, Jin-Cheng Jhang, Jingjing Zheng, Wei Wang, Albert Y. C. Chen, Min Sun, Cheng-Hao Kuo, Ming-Hsuan Yang

A unique property of our Bi-Layout model is its ability to inherently detect ambiguous regions by comparing the two predictions.

Room Layout Estimation
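
The idea of flagging ambiguous regions by comparing two predictions can be illustrated with a minimal sketch. It assumes, purely for illustration, that each layout prediction is a list of per-column boundary values and that disagreement above a fixed threshold marks ambiguity; the function name, the representation, and the threshold are hypothetical and are not taken from the paper.

```python
def ambiguous_columns(pred_a, pred_b, threshold=0.1):
    """Return the column indices where two layout predictions disagree.

    pred_a, pred_b: per-column boundary values (assumed representation).
    threshold: minimum absolute difference treated as disagreement (assumed).
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must have the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(pred_a, pred_b))
        if abs(a - b) > threshold
    ]


# Two toy predictions that agree everywhere except columns 2 and 3.
pred_a = [2.0, 2.0, 2.1, 2.1, 2.0]
pred_b = [2.0, 2.0, 3.5, 3.4, 2.0]
print(ambiguous_columns(pred_a, pred_b))  # [2, 3]
```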

PoCo: Point Context Cluster for RGBD Indoor Place Recognition

no code implementations • 3 Apr 2024 • Jing Liang, Zhuo Deng, Zheming Zhou, Omid Ghasemalizadeh, Dinesh Manocha, Min Sun, Cheng-Hao Kuo, Arnie Sen

We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database.
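
The retrieval step described above, finding the most likely match for a query frame in a reference database, can be sketched generically as a nearest-neighbor search over per-frame descriptors. This is not PoCo itself: the descriptors, the cosine similarity measure, and the names `cosine_similarity` and `best_match` are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, database):
    """Return the key of the reference descriptor most similar to the query.

    database: dict mapping a place label to its descriptor (hypothetical layout).
    """
    if not database:
        raise ValueError("reference database is empty")
    return max(database, key=lambda name: cosine_similarity(query, database[name]))


# Toy reference database with hand-made 3-D descriptors.
reference = {
    "kitchen": [0.9, 0.1, 0.0],
    "hallway": [0.1, 0.8, 0.3],
    "office": [0.0, 0.2, 0.9],
}
query = [0.2, 0.9, 0.2]
print(best_match(query, reference))  # hallway
```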

GDA: Generalized Diffusion for Robust Test-time Adaptation

no code implementations • 29 Mar 2024 • Yun-Yun Tsai, Fu-Chen Chen, Albert Y. C. Chen, Junfeng Yang, Che-Chun Su, Min Sun, Cheng-Hao Kuo

For vision tasks, recent studies have shown that test-time adaptation employing diffusion models can achieve state-of-the-art accuracy improvements on OOD samples by generating new samples that align with the model's domain without the need to modify the model's weights.

Test-time Adaptation

Tabletop Transparent Scene Reconstruction via Epipolar-Guided Optical Flow with Monocular Depth Completion Prior

no code implementations • 15 Oct 2023 • Xiaotong Chen, Zheming Zhou, Zhuo Deng, Omid Ghasemalizadeh, Min Sun, Cheng-Hao Kuo, Arnie Sen

Reconstructing transparent objects using affordable RGB-D cameras is a persistent challenge in robotic perception due to inconsistent appearances across views in the RGB domain and inaccurate depth readings in each single-view.

3D Reconstruction • Depth Completion • +3

ImGeoNet: Image-induced Geometry-aware Voxel Representation for Multi-view 3D Object Detection

no code implementations • ICCV 2023 • Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun

The results demonstrate that ImGeoNet outperforms the current state-of-the-art multi-view image-based method, ImVoxelNet, on all three datasets in terms of detection accuracy.

Ranked #24 on 3D Object Detection on ScanNetV2

3D Object Detection • object-detection

ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation

1 code implementation • 4 Aug 2023 • Xuefeng Hu, Ke Zhang, Lu Xia, Albert Chen, Jiajia Luo, Yuyin Sun, Ken Wang, Nan Qiao, Xiao Zeng, Min Sun, Cheng-Hao Kuo, Ram Nevatia

Large-scale Pre-Training Vision-Language Model such as CLIP has demonstrated outstanding performance in zero-shot classification, e.g. achieving 76.3% top-1 accuracy on ImageNet without seeing any example, which leads to potential benefits to many tasks that have no labeled data.

Image Classification • Language Modelling • +2

Bidirectional Alignment for Domain Adaptive Detection with Transformers

1 code implementation • ICCV 2023 • Liqiang He, Wei Wang, Albert Chen, Min Sun, Cheng-Hao Kuo, Sinisa Todorovic

We propose a Bidirectional Alignment for domain adaptive Detection with Transformers (BiADT) to improve cross domain object detection performance.

Object • object-detection • +1

SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments

1 code implementation • 22 Dec 2022 • Evin Pınar Örnek, Aravindhan K Krishnan, Shreekant Gayaka, Cheng-Hao Kuo, Arnie Sen, Nassir Navab, Federico Tombari

We introduce a zero-shot split for Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments.

Instance Segmentation • Object • +2

Learning Feature Decomposition for Domain Adaptive Monocular Depth Estimation

no code implementations • 30 Jul 2022 • Shao-Yuan Lo, Wei Wang, Jim Thomas, Jingjing Zheng, Vishal M. Patel, Cheng-Hao Kuo

In this paper, we propose a novel UDA method for MDE, referred to as Learning Feature Decomposition for Adaptation (LFDA), which learns to decompose the feature space into content and style components.

Monocular Depth Estimation • Unsupervised Domain Adaptation

Acted vs. Improvised: Domain Adaptation for Elicitation Approaches in Audio-Visual Emotion Recognition

no code implementations • 5 Apr 2021 • Haoqi Li, Yelin Kim, Cheng-Hao Kuo, Shrikanth Narayanan

Key challenges in developing generalized automatic emotion recognition systems include scarcity of labeled data and lack of gold-standard references.

Domain Adaptation • Emotion Recognition • +1

MEBOW: Monocular Estimation of Body Orientation In the Wild

1 code implementation • CVPR 2020 • Chenyan Wu, Yukun Chen, Jiajia Luo, Che-Chun Su, Anuja Dawane, Bikramjot Hanzra, Zhuo Deng, Bilan Liu, James Wang, Cheng-Hao Kuo

We present COCO-MEBOW (Monocular Estimation of Body Orientation in the Wild), a new large-scale dataset for orientation estimation from a single in-the-wild image.

Autonomous Driving • Pose Estimation
