Search Results for author: Yu-Chiang Frank Wang

Found 64 papers, 20 papers with code

Propagated Image Filtering

no code implementations • CVPR 2015 • Jen-Hao Rick Chang, Yu-Chiang Frank Wang

In this paper, we propose the propagation filter as a novel image filtering operator, with the goal of smoothing over neighboring image pixels while preserving image context like edges or textural regions.

Image Denoising

Paper
Add Code

Unsupervised Domain Adaptation With Imbalanced Cross-Domain Data

no code implementations • ICCV 2015 • Tzu Ming Harry Hsu, Wei Yu Chen, Cheng-An Hou, Yao-Hung Hubert Tsai, Yi-Ren Yeh, Yu-Chiang Frank Wang

For standard unsupervised domain adaptation, one typically obtains labeled data in the source domain and only observes unlabeled data in the target domain.

General Classification Unsupervised Domain Adaptation

Paper
Add Code

Learning Cross-Domain Landmarks for Heterogeneous Domain Adaptation

no code implementations • CVPR 2016 • Yao-Hung Hubert Tsai, Yi-Ren Yeh, Yu-Chiang Frank Wang

With the goal of deriving a domain-invariant feature subspace for HDA, our CDLS is able to identify representative cross-domain data, including the unlabeled ones in the target domain, for performing adaptation.

Domain Adaptation

Paper
Add Code

No More Discrimination: Cross City Adaptation of Road Scene Segmenters

9 code implementations • ICCV 2017 • Yi-Hsin Chen, Wei-Yu Chen, Yu-Ting Chen, Bo-Cheng Tsai, Yu-Chiang Frank Wang, Min Sun

Despite the recent success of deep-learning based semantic segmentation, deploying a pre-trained road scene segmenter to a city whose images are not presented in the training set would not achieve satisfactory performance due to dataset biases.

Segmentation Semantic Segmentation

839

Paper
Code

Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation

no code implementations • CVPR 2018 • Yen-Cheng Liu, Yu-Ying Yeh, Tzu-Chien Fu, Sheng-De Wang, Wei-Chen Chiu, Yu-Chiang Frank Wang

While representation learning aims to derive interpretable features for describing visual data, representation disentanglement further results in such features so that particular image attributes can be identified and manipulated.

Attribute Disentanglement +2

Paper
Add Code

Generative-Discriminative Variational Model for Visual Recognition

no code implementations • 7 Jun 2017 • Chih-Kuan Yeh, Yao-Hung Hubert Tsai, Yu-Chiang Frank Wang

In other words, our GDVM casts the supervised learning task as a generative learning process, with data discrimination to be jointly exploited for improved classification.

Classification General Classification +3

Paper
Add Code

Learning Deep Latent Spaces for Multi-Label Classification

1 code implementation • 3 Jul 2017 • Chih-Kuan Yeh, Wei-Chieh Wu, Wei-Jen Ko, Yu-Chiang Frank Wang

Multi-label classification is a practical yet challenging task in machine learning related fields, since it requires the prediction of more than one label category for each input instance.

Classification General Classification +1

Paper
Code

Order-Free RNN with Visual Attention for Multi-Label Classification

1 code implementation • 18 Jul 2017 • Shang-Fu Chen, Yi-Chen Chen, Chih-Kuan Yeh, Yu-Chiang Frank Wang

In this paper, we propose the joint learning attention and recurrent neural network (RNN) models for multi-label classification.

Classification General Classification +2

Paper
Code

Multi-Label Zero-Shot Learning with Structured Knowledge Graphs

1 code implementation • CVPR 2018 • Chung-Wei Lee, Wei Fang, Chih-Kuan Yeh, Yu-Chiang Frank Wang

In this paper, we propose a novel deep learning architecture for multi-label zero-shot learning (ML-ZSL), which is able to predict multiple unseen class labels for each input instance.

General Classification Knowledge Graphs +3

Paper
Code

Summarizing First-Person Videos from Third Persons' Points of Views

no code implementations • ECCV 2018 • Hsuan-I Ho, Wei-Chen Chiu, Yu-Chiang Frank Wang

Video highlight or summarization is among interesting topics in computer vision, which benefits a variety of applications like viewing, searching, or storage.

Paper
Add Code

Adaptation and Re-Identification Network: An Unsupervised Deep Transfer Learning Approach to Person Re-Identification

no code implementations • 25 Apr 2018 • Yu-Jhe Li, Fu-En Yang, Yen-Cheng Liu, Yu-Ying Yeh, Xiaofei Du, Yu-Chiang Frank Wang

Person re-identification (Re-ID) aims at recognizing the same person from images taken across different cameras.

Ranked #19 on Unsupervised Domain Adaptation on Duke to Market

Person Re-Identification Transfer Learning +1

Paper
Add Code

Deep Reinforcement Learning for Playing 2.5D Fighting Games

4 code implementations • 5 May 2018 • Yu-Jhe Li, Hsin-Yu Chang, Yu-Jing Lin, Po-Wei Wu, Yu-Chiang Frank Wang

Deep reinforcement learning has shown its success in game playing.

OpenAI Gym reinforcement-learning +1

Paper
Code

Deep Generative Models for Weakly-Supervised Multi-Label Classification

no code implementations • ECCV 2018 • Hong-Min Chu, Chih-Kuan Yeh, Yu-Chiang Frank Wang

In order to train learning models for multi-label classification (MLC), it is typically desirable to have a large amount of fully annotated multi-label data.

Classification General Classification +1

Paper
Add Code

Summarizing First-Person Videos from Third Persons' Points of View

no code implementations • ECCV 2018 • Hsuan-I Ho, Wei-Chen Chiu, Yu-Chiang Frank Wang

Video highlight or summarization is among interesting topics in computer vision, which benefits a variety of applications like viewing, searching, or storage.

Paper
Add Code

A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation

1 code implementation • NeurIPS 2018 • Alexander H. Liu, Yen-Cheng Liu, Yu-Ying Yeh, Yu-Chiang Frank Wang

We present a novel and unified deep learning framework which is capable of learning domain-invariant representation from data across multiple domains.

Translation Unsupervised Domain Adaptation

135

Paper
Code

3D Shape Reconstruction from a Single 2D Image via 2D-3D Self-Consistency

no code implementations • 29 Nov 2018 • Yi-Lun Liao, Yao-Cheng Yang, Yu-Chiang Frank Wang

Aiming at inferring 3D shapes from 2D images, 3D shape reconstruction has drawn huge attention from researchers in computer vision and deep learning communities.

3D Reconstruction 3D Shape Reconstruction From A Single 2D Image +1

Paper
Add Code

Dual-modality seq2seq network for audio-visual event localization

2 code implementations • 20 Feb 2019 • Yan-Bo Lin, Yu-Jhe Li, Yu-Chiang Frank Wang

Audio-visual event localization requires one to identify the event which is both visible and audible in a video (either at a frame or video level).

audio-visual event localization

158

Paper
Code

A Closer Look at Few-shot Classification

13 code implementations • ICLR 2019 • Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang

Few-shot classification aims to learn a classifier to recognize unseen classes during training with limited labeled examples.

Ranked #4 on Few-Shot Image Classification on Dirichlet CUB-200 (5-way, 5-shot)

Domain Generalization Few-Shot Image Classification +2

1,109

Paper
Code

Learning Resolution-Invariant Deep Representations for Person Re-Identification

no code implementations • 25 Jul 2019 • Yun-Chun Chen, Yu-Jhe Li, Xiaofei Du, Yu-Chiang Frank Wang

Moreover, the extension of our model for semi-supervised re-ID further confirms the scalability of our proposed method for real-world scenarios and applications.

Image Super-Resolution Person Re-Identification

Paper
Add Code

Spatially and Temporally Efficient Non-local Attention Network for Video-based Person Re-Identification

1 code implementation • 5 Aug 2019 • Chih-Ting Liu, Chih-Wei Wu, Yu-Chiang Frank Wang, Shao-Yi Chien

Video-based person re-identification (Re-ID) aims at matching video sequences of pedestrians across non-overlapping cameras.

Ranked #11 on Person Re-Identification on MARS

Video-Based Person Re-Identification

140

Paper
Code

Recover and Identify: A Generative Dual Model for Cross-Resolution Person Re-Identification

no code implementations • ICCV 2019 • Yu-Jhe Li, Yun-Chun Chen, Yen-Yu Lin, Xiaofei Du, Yu-Chiang Frank Wang

Person re-identification (re-ID) aims at matching images of the same identity across camera views.

Generative Adversarial Network Person Re-Identification

Paper
Add Code

Cross-Dataset Person Re-Identification via Unsupervised Pose Disentanglement and Adaptation

no code implementations • ICCV 2019 • Yu-Jhe Li, Ci-Siang Lin, Yan-Bo Lin, Yu-Chiang Frank Wang

Person re-identification (re-ID) aims at recognizing the same person from images taken across different cameras.

Ranked #16 on Unsupervised Domain Adaptation on Market to Duke

Disentanglement Person Re-Identification +1

Paper
Add Code

Cross-Resolution Adversarial Dual Network for Person Re-Identification and Beyond

no code implementations • 19 Feb 2020 • Yu-Jhe Li, Yun-Chun Chen, Yen-Yu Lin, Yu-Chiang Frank Wang

Person re-identification (re-ID) aims at matching images of the same person across camera views.

Generative Adversarial Network Person Re-Identification

Paper
Add Code

Transforming Multi-Concept Attention into Video Summarization

no code implementations • 2 Jun 2020 • Yen-Ting Liu, Yu-Jhe Li, Yu-Chiang Frank Wang

Video summarization is among challenging tasks in computer vision, which aims at identifying highlight frames or shots over a lengthy video input.

Video Summarization

Paper
Add Code

Wavelet Channel Attention Module with a Fusion Network for Single Image Deraining

no code implementations • 17 Jul 2020 • Hao-Hsiang Yang, Chao-Han Huck Yang, Yu-Chiang Frank Wang

Wavelet transform and the inverse wavelet transform are substituted for down-sampling and up-sampling so feature maps from the wavelet transform and convolutions contain different frequencies and scales.

Single Image Deraining

Paper
Add Code

Learning to Learn in a Semi-Supervised Fashion

no code implementations • ECCV 2020 • Yun-Chun Chen, Chao-Te Chou, Yu-Chiang Frank Wang

To address semi-supervised learning from both labeled and unlabeled data, we present a novel meta-learning scheme.

Image Retrieval Meta-Learning +3

Paper
Add Code

Semantics-Guided Clustering with Deep Progressive Learning for Semi-Supervised Person Re-identification

no code implementations • 2 Oct 2020 • Chih-Ting Liu, Yu-Jhe Li, Shao-Yi Chien, Yu-Chiang Frank Wang

As a result, our approach is able to augment the labeled training data in the semi-supervised setting.

Clustering Image Retrieval +2

Paper
Add Code

Domain Generalized Person Re-Identification via Cross-Domain Episodic Learning

no code implementations • 19 Oct 2020 • Ci-Siang Lin, Yuan-Chia Cheng, Yu-Chiang Frank Wang

That is, while a number of labeled source-domain datasets are available, we do not have access to any target-domain training data.

Domain Generalization Generalizable Person Re-identification +1

Paper
Add Code

Semantics-Guided Representation Learning with Applications to Visual Synthesis

no code implementations • 21 Oct 2020 • Jia-Wei Yan, Ci-Siang Lin, Fu-En Yang, Yu-Jhe Li, Yu-Chiang Frank Wang

Learning interpretable and interpolatable latent representations has been an emerging research direction, allowing researchers to understand and utilize the derived latent space for further applications such as visual synthesis or recognition.

Representation Learning

Paper
Add Code

Representation Decomposition for Image Manipulation and Beyond

no code implementations • 2 Nov 2020 • Shang-Fu Chen, Jia-Wei Yan, Ya-Fan Su, Yu-Chiang Frank Wang

Representation disentanglement aims at learning interpretable features, so that the output can be recovered or manipulated accordingly.

Attribute Disentanglement +1

Paper
Add Code

LayoutTransformer: Relation-Aware Scene Layout Generation

no code implementations • 1 Jan 2021 • Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang

In the areas of machine learning and computer vision, text-to-image synthesis aims at producing image outputs given the input text.

Image Generation Object +1

Paper
Add Code

Dual-MTGAN: Stochastic and Deterministic Motion Transfer for Image-to-Video Synthesis

no code implementations • 26 Feb 2021 • Fu-En Yang, Jing-Cheng Chang, Yuan-Hao Lee, Yu-Chiang Frank Wang

Generating videos with content and motion variations is a challenging task in computer vision.

Video Generation

Paper
Add Code

Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation

no code implementations • 3 May 2021 • Yan-Bo Lin, Yu-Chiang Frank Wang

Humans perceive a rich auditory experience through the distinct sounds heard by their ears.

Audio Generation Self-Supervised Learning

Paper
Add Code

LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity

1 code implementation • CVPR 2021 • Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang

To better exploit the text input, so that implicit objects or relationships can be properly inferred during layout generation, we propose a LayoutTransformer Network (LT-Net) in this paper.

Paper
Code

Learning Visual-Linguistic Adequacy, Fidelity, and Fluency for Novel Object Captioning

no code implementations • 29 Sep 2021 • Cheng-Fu Yang, Yao-Hung Hubert Tsai, Wan-Cyuan Fan, Yu-Chiang Frank Wang, Louis-Philippe Morency, Ruslan Salakhutdinov

Novel object captioning (NOC) learns image captioning models for describing objects or visual concepts which are unseen (i.e., novel) in the training captions.

Image Captioning

Paper
Add Code

A Pixel-Level Meta-Learner for Weakly Supervised Few-Shot Semantic Segmentation

no code implementations • 2 Nov 2021 • Yuan-Hao Lee, Fu-En Yang, Yu-Chiang Frank Wang

Few-shot semantic segmentation addresses the learning task in which only few images with ground truth pixel-level labels are available for the novel classes of interest.

Few-Shot Semantic Segmentation Meta-Learning +2

Paper
Add Code

Adversarial Teacher-Student Representation Learning for Domain Generalization

1 code implementation • NeurIPS 2021 • Fu-En Yang, Yuan-Chia Cheng, Zu-Yun Shiau, Yu-Chiang Frank Wang

Domain generalization (DG) aims to transfer the learning task from a single or multiple source domains to unseen target domains.

Data Augmentation Domain Generalization +3

Paper
Code

Meta-Learned Feature Critics for Domain Generalized Semantic Segmentation

no code implementations • 27 Dec 2021 • Zu-Yun Shiau, Wei-Wei Lin, Ci-Siang Lin, Yu-Chiang Frank Wang

How to handle domain shifts when recognizing or segmenting visual data across domains has been studied by learning and vision communities.

Disentanglement Domain Generalization +3

Paper
Add Code

Few-Shot Classification in Unseen Domains by Episodic Meta-Learning Across Visual Domains

no code implementations • 27 Dec 2021 • Yuan-Chia Cheng, Ci-Siang Lin, Fu-En Yang, Yu-Chiang Frank Wang

Few-shot classification aims to carry out classification given only few labeled examples for the categories of interest.

Classification Few-Shot Learning +2

Paper
Add Code

Domain-Generalized Textured Surface Anomaly Detection

no code implementations • 23 Mar 2022 • Shang-Fu Chen, Yu-Min Liu, Chia-Ching Lin, Trista Pei-Chun Chen, Yu-Chiang Frank Wang

By observing normal and abnormal surface data across multiple source domains, our model is expected to be generalized to an unseen textured surface of interest, in which only a small number of normal data can be observed during testing.

Ranked #2 on Anomaly Detection on MVTec AD Textures Domain Generalization

Anomaly Detection Domain Generalization +1

Paper
Add Code

NeurMiPs: Neural Mixture of Planar Experts for View Synthesis

1 code implementation • CVPR 2022 • Zhi-Hao Lin, Wei-Chiu Ma, Hao-Yu Hsu, Yu-Chiang Frank Wang, Shenlong Wang

We present Neural Mixtures of Planar Experts (NeurMiPs), a novel planar-based scene representation for modeling geometry and appearance.

Novel View Synthesis

113

Paper
Code

Scene Graph Expansion for Semantics-Guided Image Outpainting

no code implementations • CVPR 2022 • Chiao-An Yang, Cheng-Yo Tan, Wan-Cyuan Fan, Cheng-Fu Yang, Meng-Lin Wu, Yu-Chiang Frank Wang

In particular, we propose a novel network of Scene Graph Transformer (SGT), which is designed to take node and edge features as inputs for modeling the associated structural information.

Image Outpainting

Paper
Add Code

Learning Facial Liveness Representation for Domain Generalized Face Anti-spoofing

no code implementations • 16 Aug 2022 • Zih-Ching Chen, Lin-Hsi Tsao, Chin-Lun Fu, Shang-Fu Chen, Yu-Chiang Frank Wang

Face anti-spoofing (FAS) aims at distinguishing face spoof attacks from the authentic ones, which is typically approached by learning proper models for performing the associated classification task.

Face Anti-Spoofing

Paper
Add Code

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

1 code implementation • 29 Aug 2022 • Wan-Cyuan Fan, Yen-Chun Chen, Dongdong Chen, Yu Cheng, Lu Yuan, Yu-Chiang Frank Wang

Diffusion models (DMs) have shown great potential for high-quality image synthesis.

Conditional Image Generation Denoising +1

111

Paper
Code

Self-Supervised Pyramid Representation Learning for Multi-Label Visual Analysis and Beyond

1 code implementation • 30 Aug 2022 • Cheng-Yen Hsieh, Chih-Jung Chang, Fu-En Yang, Yu-Chiang Frank Wang

In particular, we present a cross-scale patch-level correlation learning in SS-PRL, which allows the model to aggregate and associate information learned across patch scales.

Instance Segmentation Multi-Label Classification +5

Paper
Code

Paraphrasing Is All You Need for Novel Object Captioning

no code implementations • 25 Sep 2022 • Cheng-Fu Yang, Yao-Hung Hubert Tsai, Wan-Cyuan Fan, Ruslan Salakhutdinov, Louis-Philippe Morency, Yu-Chiang Frank Wang

Since no ground truth captions are available for novel object images during training, our P2C leverages cross-modality (image-text) association modules to ensure the above caption characteristics can be properly preserved.

Language Modelling Object

Paper
Add Code

Target-Free Text-guided Image Manipulation

no code implementations • 26 Nov 2022 • Wan-Cyuan Fan, Cheng-Fu Yang, Chiao-An Yang, Yu-Chiang Frank Wang

We tackle the problem of target-free text-guided image manipulation, which requires one to modify the input reference image based on the given text instruction, while no ground truth target image is observed during training.

counterfactual Image Manipulation

Paper
Add Code

Bias-Eliminating Augmentation Learning for Debiased Federated Learning

no code implementations • CVPR 2023 • Yuan-Yi Xu, Ci-Siang Lin, Yu-Chiang Frank Wang

Learning models trained on biased datasets tend to observe correlations between categorical and undesirable features, which result in degraded performances.

Federated Learning Image Classification

Paper
Add Code

TAX: Tendency-and-Assignment Explainer for Semantic Segmentation with Multi-Annotators

no code implementations • 19 Feb 2023 • Yuan-Chia Cheng, Zu-Yun Shiau, Fu-En Yang, Yu-Chiang Frank Wang

In this paper, we present a learning framework of Tendency-and-Assignment Explainer (TAX), designed to offer interpretability at the annotator and assignment levels.

Segmentation Semantic Segmentation

Paper
Add Code

QuAVF: Quality-aware Audio-Visual Fusion for Ego4D Talking to Me Challenge

1 code implementation • 30 Jun 2023 • Hsi-Che Lin, Chien-Yi Wang, Min-Hung Chen, Szu-Wei Fu, Yu-Chiang Frank Wang

This technical report describes our QuAVF@NTU-NVIDIA submission to the Ego4D Talking to Me (TTM) Challenge 2023.

Paper
Code

FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning

1 code implementation • 19 Jul 2023 • Chia-Hsiang Kao, Yu-Chiang Frank Wang

In this paper, we propose FedBug (Federated Learning with Bottom-Up Gradual Unfreezing), a novel FL framework designed to effectively mitigate client drift.

Federated Learning

Paper
Code

Efficient Model Personalization in Federated Learning via Client-Specific Prompt Generation

no code implementations • ICCV 2023 • Fu-En Yang, Chien-Yi Wang, Yu-Chiang Frank Wang

To leverage robust representations from large-scale models while enabling efficient model personalization for heterogeneous clients, we propose a novel personalized FL framework of client-specific Prompt Generation (pFedPG), which learns to deploy a personalized prompt generator at the server for producing client-specific visual prompts that efficiently adapt frozen backbones to local data distributions.

Federated Learning

Paper
Add Code

Frequency-Aware Self-Supervised Long-Tailed Learning

no code implementations • 9 Sep 2023 • Ci-Siang Lin, Min-Hung Chen, Yu-Chiang Frank Wang

Data collected from the real world typically exhibit long-tailed distributions, where frequent classes contain abundant data while rare ones have only a limited number of samples.

Self-Supervised Learning

Paper
Add Code

LACMA: Language-Aligning Contrastive Learning with Meta-Actions for Embodied Instruction Following

1 code implementation • 18 Oct 2023 • Cheng-Fu Yang, Yen-Chun Chen, Jianwei Yang, Xiyang Dai, Lu Yuan, Yu-Chiang Frank Wang, Kai-Wei Chang

Additional analysis shows that the contrastive objective and meta-actions are complementary in achieving the best results, and the resulting agent better aligns its states with corresponding instructions, making it more suitable for real-world embodied agents.

Contrastive Learning Instruction Following

Paper
Code

Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers

no code implementations • 29 Nov 2023 • Chi-Pin Huang, Kai-Po Chang, Chung-Ting Tsai, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang

The former refrains the model from producing images associated with the target concept for any paraphrased or learned prompts, while the latter preserves its ability in generating images with non-target concepts.

Paper
Add Code

Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction

no code implementations • 30 Nov 2023 • Cheng Sun, Wei-En Tai, Yu-Lin Shih, Kuan-Wei Chen, Yong-Jing Syu, Kent Selwyn The, Yu-Chiang Frank Wang, Hwann-Tzong Chen

State-of-the-art single-view 360-degree room layout reconstruction methods formulate the problem as a high-level 1D (per-column) regression task.

Benchmarking regression

Paper
Add Code

TPA3D: Triplane Attention for Fast Text-to-3D Generation

no code implementations • 5 Dec 2023 • Hong-En Chen, Bin-Shih Wu, Sheng-Yu Huang, Yu-Chiang Frank Wang

With only 3D shape data and their rendered 2D images observed during training, our TPA3D is designed to retrieve detailed visual descriptions for synthesizing the corresponding 3D mesh data.

3D Generation Sentence +1

Paper
Add Code

Language-Guided Transformer for Federated Multi-Label Classification

1 code implementation • 12 Dec 2023 • I-Jieh Liu, Ci-Siang Lin, Fu-En Yang, Yu-Chiang Frank Wang

Nevertheless, it is still challenging for FL to deal with user heterogeneity in their local data distribution in the real-world FL scenario, and this issue becomes even more severe in multi-label image classification.

Classification Federated Learning +3

Paper
Code

SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation

no code implementations • 22 Jan 2024 • Ci-Siang Lin, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen

In this way, SemPLeS can perform better semantic alignment between object regions and the associated class labels, resulting in desired pseudo masks for training the segmentation model.

Ranked #1 on Weakly-Supervised Semantic Segmentation on PASCAL VOC 2012 test

Object Segmentation +2

Paper
Add Code

DoRA: Weight-Decomposed Low-Rank Adaptation

4 code implementations • 14 Feb 2024 • Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, Min-Hung Chen

By employing DoRA, we enhance both the learning capacity and training stability of LoRA while avoiding any additional inference overhead.

261

Paper
Code

Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech

1 code implementation • 26 Feb 2024 • Szu-Wei Fu, Kuo-Hsuan Hung, Yu Tsao, Yu-Chiang Frank Wang

To improve the robustness of the encoder for SE, a novel self-distillation mechanism combined with adversarial training is introduced.

Quantization Speech Enhancement

Paper
Code

GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding

no code implementations • 6 Mar 2024 • Zi-Ting Chou, Sheng-Yu Huang, I-Jieh Liu, Yu-Chiang Frank Wang

Utilizing multi-view inputs to synthesize novel-view images, Neural Radiance Fields (NeRF) have emerged as a popular research topic in 3D vision.

Scene Understanding Semantic Segmentation

Paper
Add Code

Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models

no code implementations • 14 Mar 2024 • Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen, Kai-Po Chang, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang

Large-scale vision-language models (VLMs) have shown a strong zero-shot generalization capability on unseen-domain data.

Continual Learning Knowledge Distillation +3

Paper
Add Code

DOrA: 3D Visual Grounding with Order-Aware Referring

no code implementations • 25 Mar 2024 • Tung-Yu Wu, Sheng-Yu Huang, Yu-Chiang Frank Wang

3D visual grounding aims to identify the target object within a 3D point cloud scene referred to by a natural language description.

Visual Grounding

Paper
Add Code
