Search Results for author: Tsung-Han Wu

Found 20 papers, 10 papers with code

See, Say, and Segment: Teaching LMMs to Overcome False Premises

no code implementations • 13 Dec 2023 • Tsung-Han Wu, Giscard Biamby, David Chan, Lisa Dunlap, Ritwik Gupta, Xudong Wang, Joseph E. Gonzalez, Trevor Darrell

Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary language grounding and segmentation but can suffer under false premises when queries imply the existence of something that is not actually present in the image.

Self-correcting LLM-controlled Diffusion Models

no code implementations • 27 Nov 2023 • Tsung-Han Wu, Long Lian, Joseph E. Gonzalez, Boyi Li, Trevor Darrell

Steered by an LLM controller, SLD turns text-to-image generation into an iterative closed-loop process, ensuring correctness in the resulting image.

Attribute Text-to-Image Generation

WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection

1 code implementation • 5 Oct 2023 • Tsung-Lin Tsou, Tsung-Han Wu, Winston H. Hsu

In the field of domain adaptation (DA) on 3D object detection, most of the work is dedicated to unsupervised domain adaptation (UDA).

3D Object Detection object-detection +1

MuRAL: Multi-Scale Region-based Active Learning for Object Detection

no code implementations • 29 Mar 2023 • Yi-Syuan Liou, Tsung-Han Wu, Jia-Fong Yeh, Wen-Chin Chen, Winston H. Hsu

MuRAL identifies informative regions of various scales to reduce annotation costs for well-learned objects and improve training performance.

Active Learning Object +2

Free-form 3D Scene Inpainting with Dual-stream GAN

1 code implementation • 16 Dec 2022 • Ru-Fen Jheng, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu

We present a novel task named free-form 3D scene inpainting.

Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling

1 code implementation • 8 Oct 2022 • Hsin-Ying Lee, Hung-Ting Su, Bing-Chen Tsai, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu

While recent large-scale video-language pre-training made great progress in video question answering, the design of spatial modeling of video-language models is less fine-grained than that of image-language models; existing practices of temporal modeling also suffer from weak and noisy alignment between modalities.

Language Modelling Question Answering +1

CrossDTR: Cross-view and Depth-guided Transformers for 3D Object Detection

1 code implementation • 27 Sep 2022 • Ching-Yu Tseng, Yi-Rong Chen, Hsin-Ying Lee, Tsung-Han Wu, Wen-Chin Chen, Winston H. Hsu

To achieve accurate 3D object detection at a low cost for autonomous driving, many multi-camera methods have been proposed, and they solve the occlusion problem of monocular approaches.

3D Object Detection Autonomous Driving +5

Fair Robust Active Learning by Joint Inconsistency

no code implementations • 22 Sep 2022 • Tsung-Han Wu, Hung-Ting Su, Shang-Tse Chen, Winston H. Hsu

Fairness and robustness play vital roles in trustworthy machine learning.

Active Learning Adversarial Attack +2

MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer

1 code implementation • CVPR 2022 • Kuan-Chih Huang, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu

In contrast to conventional pixel-wise positional encodings, we introduce a novel depth positional encoding (DPE) to inject depth positional hints into transformers.

Autonomous Driving Monocular 3D Object Detection +2

D2ADA: Dynamic Density-aware Active Domain Adaptation for Semantic Segmentation

1 code implementation • 14 Feb 2022 • Tsung-Han Wu, Yi-Syuan Liou, Shao-Ji Yuan, Hsin-Ying Lee, Tung-I Chen, Kuan-Chih Huang, Winston H. Hsu

In the field of domain adaptation, a trade-off exists between the model performance and the number of target domain annotations.

Active Learning Domain Adaptation +2

Anomaly-Aware Semantic Segmentation by Leveraging Synthetic-Unknown Data

no code implementations • 29 Nov 2021 • Guan-Rong Lu, Yueh-Cheng Liu, Tung-I Chen, Hung-Ting Su, Tsung-Han Wu, Winston H. Hsu

We design a new Masked Gradient Update (MGU) module to generate auxiliary data along the boundary of in-distribution data points.

Anomaly Detection Autonomous Driving +3

ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation

1 code implementation • ICCV 2021 • Tsung-Han Wu, Yueh-Cheng Liu, Yu-Kai Huang, Hsin-Ying Lee, Hung-Ting Su, Ping-Chia Huang, Winston H. Hsu

Despite the success of deep learning on supervised point cloud semantic segmentation, obtaining large-scale point-by-point manual annotations is still a significant challenge.

Active Learning Scene Understanding +1

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation

no code implementations • CVPR 2021 • Yu-Kai Huang, Yueh-Cheng Liu, Tsung-Han Wu, Hung-Ting Su, Yu-Cheng Chang, Tsung-Lin Tsou, Yu-An Wang, Winston H. Hsu

Dense depth estimation plays a key role in multiple applications such as robotics, 3D reconstruction, and augmented reality.

3D Reconstruction Depth Estimation

$S^3$: Learnable Sparse Signal Superdensity for Guided Depth Estimation

no code implementations • 3 Mar 2021 • Yu-Kai Huang, Yueh-Cheng Liu, Tsung-Han Wu, Hung-Ting Su, Yu-Cheng Chang, Tsung-Lin Tsou, Yu-An Wang, Winston H. Hsu

Dense depth estimation plays a key role in multiple applications such as robotics, 3D reconstruction, and augmented reality.

3D Reconstruction Depth Estimation

AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization

1 code implementation • 31 Oct 2020 • Yen-Hao Chen, Da-Yi Wu, Tsung-Han Wu, Hung-Yi Lee

With a proper activation as an information bottleneck on content embeddings, the trade-off between the synthesis quality and the speaker similarity of the converted speech is improved drastically.

Audio and Speech Processing Sound

Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers

no code implementations • 9 Jun 2020 • Tsung-Han Wu, Chun-Chen Hsieh, Yen-Hao Chen, Po-Han Chi, Hung-Yi Lee

In this paper, we seek solutions for reducing the computation complexity of transformer-based models for speech representation learning.

General Classification Representation Learning

Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation

3 code implementations • 18 May 2020 • Po-Han Chi, Pei-Hung Chung, Tsung-Han Wu, Chun-Cheng Hsieh, Yen-Hao Chen, Shang-Wen Li, Hung-Yi Lee

We use the representations in two downstream tasks: speaker identification and phoneme classification.

Self-Supervised Learning Speaker Identification

Expanding Sparse Guidance for Stereo Matching

no code implementations • 24 Apr 2020 • Yu-Kai Huang, Yueh-Cheng Liu, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu

The performance of image-based stereo estimation suffers from lighting variations, repetitive patterns, and homogeneous appearance.

Domain Adaptation Stereo Matching

BERT's output layer recognizes all hidden layers? Some Intriguing Phenomena and a simple way to boost BERT

no code implementations • 25 Jan 2020 • Wei-Tsung Kao, Tsung-Han Wu, Po-Han Chi, Chun-Cheng Hsieh, Hung-Yi Lee

Although Bidirectional Encoder Representations from Transformers (BERT) has achieved tremendous success in many natural language processing (NLP) tasks, it remains a black box.

Sentence

Indoor Depth Completion with Boundary Consistency and Self-Attention

3 code implementations • 22 Aug 2019 • Yu-Kai Huang, Tsung-Han Wu, Yueh-Cheng Liu, Winston H. Hsu

We utilize a self-attention mechanism, previously used in image inpainting, to extract more useful information in each convolutional layer so that the complete depth map is enhanced.

Ranked #2 on Depth Completion on Matterport3D

Depth Completion Depth Estimation +1
