Search Results for author: Min-Hung Chen

Found 30 papers, 15 papers with code

Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation

no code implementations • 5 Apr 2024 • Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang, Chun-Pei Chen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Yung-Yu Chuang, Yen-Yu Lin

This paper addresses text-supervised semantic segmentation, aiming to learn a model capable of segmenting arbitrary visual concepts within images by using only image-text pairs without dense annotations.

Contrastive Learning, Language Modelling, +3

DoRA: Weight-Decomposed Low-Rank Adaptation

4 code implementations • 14 Feb 2024 • Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, Min-Hung Chen

By employing DoRA, we enhance both the learning capacity and training stability of LoRA while avoiding any additional inference overhead.

259

SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation

no code implementations • 22 Jan 2024 • Ci-Siang Lin, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen

In this way, SemPLeS can perform better semantic alignment between object regions and the associated class labels, resulting in desired pseudo masks for training the segmentation model.

Ranked #1 on Weakly-Supervised Semantic Segmentation on PASCAL VOC 2012 test

Object Segmentation +2

PartDistill: 3D Shape Part Segmentation by Vision-Language Model Distillation

1 code implementation • 7 Dec 2023 • Ardian Umam, Cheng-Kun Yang, Min-Hung Chen, Jen-Hui Chuang, Yen-Yu Lin

This paper proposes a cross-modal distillation framework, PartDistill, which transfers 2D knowledge from vision-language models (VLMs) to facilitate 3D shape part segmentation.

3D Part Segmentation, Language Modelling, +1

Conditional Modeling Based Automatic Video Summarization

no code implementations • 20 Nov 2023 • Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Min-Hung Chen, Marcel Worring

The aim of video summarization is to shorten videos automatically while retaining the key information necessary to convey the overall story.

Video Summarization

2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision

no code implementations • ICCV 2023 • Cheng-Kun Yang, Min-Hung Chen, Yung-Yu Chuang, Yen-Yu Lin

Considering the high annotation cost of point clouds, effective 2D and 3D feature fusion based on weakly supervised learning is in great demand.

Point Cloud Segmentation, Segmentation, +1

Probabilistic 3D Multi-Object Cooperative Tracking for Autonomous Driving via Differentiable Multi-Sensor Kalman Filter

2 code implementations • 26 Sep 2023 • Hsu-kuang Chiu, Chien-Yi Wang, Min-Hung Chen, Stephen F. Smith

However, their proposed methods mainly use cooperative detection results as input to a standard single-sensor Kalman Filter-based tracking algorithm.

3D Multi-Object Tracking, Autonomous Driving

369

Frequency-Aware Self-Supervised Long-Tailed Learning

no code implementations • 9 Sep 2023 • Ci-Siang Lin, Min-Hung Chen, Yu-Chiang Frank Wang

Data collected from the real world typically exhibit long-tailed distributions, where frequent classes contain abundant data while rare ones have only a limited number of samples.

Self-Supervised Learning

Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction

no code implementations • ICCV 2023 • Su-Kai Chen, Hung-Lin Yen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Wen-Hsiao Peng, Yen-Yu Lin

To address this, we propose the continuous exposure value representation (CEVR), which uses an implicit function to generate LDR images with arbitrary EVs, including those unseen during training.

HDR Reconstruction

QuAVF: Quality-aware Audio-Visual Fusion for Ego4D Talking to Me Challenge

1 code implementation • 30 Jun 2023 • Hsi-Che Lin, Chien-Yi Wang, Min-Hung Chen, Szu-Wei Fu, Yu-Chiang Frank Wang

This technical report describes our QuAVF@NTU-NVIDIA submission to the Ego4D Talking to Me (TTM) Challenge 2023.

A Closer Look at Geometric Temporal Dynamics for Face Anti-Spoofing

no code implementations • 25 Jun 2023 • Chih-Jung Chang, Yaw-Chern Lee, Shih-Hsuan Yao, Min-Hung Chen, Chien-Yi Wang, Shang-Hong Lai, Trista Pei-Chun Chen

Face anti-spoofing (FAS) is indispensable for a face recognition system.

Face Anti-Spoofing, Face Recognition

Causalainer: Causal Explainer for Automatic Video Summarization

no code implementations • 30 Apr 2023 • Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Min-Hung Chen, Marcel Worring

In this work, a Causal Explainer, dubbed Causalainer, is proposed to address this issue.

Video Summarization

Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection

1 code implementation • 10 Apr 2023 • Wei-Jhe Huang, Jheng-Hsien Yeh, Min-Hung Chen, Gueter Josmy Faure, Shang-Hong Lai

Finally, we calculate the similarity between the interaction feature and the text feature for each label to determine the action category.

Action Detection, Language Modelling, +1

Kinship Representation Learning with Face Componential Relation

no code implementations • 10 Apr 2023 • Weng-Tai Su, Min-Hung Chen, Chien-Yi Wang, Shang-Hong Lai, Trista Pei-Chun Chen

Kinship recognition aims to determine whether the subjects in two facial images are kin or non-kin, which is an emerging and challenging problem.

Relation, Relation Network, +1

MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

no code implementations • 8 Nov 2022 • Andrey Ignatov, Anastasia Sycheva, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

While neural networks-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity.

PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

1 code implementation • 8 Nov 2022 • Andrey Ignatov, Grigory Malivenko, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc van Gool

The increased importance of mobile photography created a need for fast and performant RAW image processing pipelines capable of producing good visual results in spite of the mobile camera sensor limitations.

Holistic Interaction Transformer Network for Action Detection

1 code implementation • 23 Oct 2022 • Gueter Josmy Faure, Min-Hung Chen, Shang-Hong Lai

Actions are about how we interact with the environment, including other people, objects, and ourselves.

Ranked #1 on Action Detection on MultiSports

Action Recognition, Fine-Grained Action Detection, +1

Self-Supervised Robustifying Guidance for Monocular 3D Face Reconstruction

1 code implementation • 29 Dec 2021 • Hitika Tiwari, Min-Hung Chen, Yi-Min Tsai, Hsien-Kai Kuo, Hung-Jen Chen, Kevin Jou, K. S. Venkatesh, Yong-Sheng Chen

Therefore, we propose a Self-Supervised RObustifying GUidancE (ROGUE) framework to obtain robustness against occlusions and noise in the face images.

3D Face Reconstruction

Learned Smartphone ISP on Mobile NPUs with Deep Learning, Mobile AI 2021 Challenge: Report

2 code implementations • 17 May 2021 • Andrey Ignatov, Cheng-Ming Chiang, Hsien-Kai Kuo, Anastasia Sycheva, Radu Timofte, Min-Hung Chen, Man-Yu Lee, Yu-Syuan Xu, Yu Tseng, Shusong Xu, Jin Guo, Chao-Hung Chen, Ming-Chun Hsyu, Wen-Chia Tsai, Chao-Wei Chen, Grigory Malivenko, Minsu Kwon, Myungje Lee, Jaeyoon Yoo, Changbeom Kang, Shinjo Wang, Zheng Shaolong, Hao Dejun, Xie Fen, Feng Zhuang, Yipeng Ma, Jingyang Peng, Tao Wang, Fenglong Song, Chih-Chung Hsu, Kwan-Lin Chen, Mei-Hsuang Wu, Vishal Chudasama, Kalpesh Prajapati, Heena Patel, Anjali Sarvaiya, Kishor Upla, Kiran Raja, Raghavendra Ramachandra, Christoph Busch, Etienne de Stoutz

As the quality of mobile cameras starts to play a crucial role in modern smartphones, more and more attention is now being paid to ISP algorithms used to improve various perceptual aspects of mobile photos.

283

Network Space Search for Pareto-Efficient Spaces

no code implementations • 22 Apr 2021 • Min-Fong Hong, Hao-Yun Chen, Min-Hung Chen, Yu-Syuan Xu, Hsien-Kai Kuo, Yi-Min Tsai, Hung-Jen Chen, Kevin Jou

We propose an NSS method to directly search for efficient-aware network spaces automatically, reducing the manual effort and immense cost in discovering satisfactory ones.

Neural Architecture Search

Action Segmentation with Mixed Temporal Domain Adaptation

no code implementations • 15 Apr 2021 • Min-Hung Chen, Baopu Li, Yingze Bao, Ghassan AlRegib

The main progress for action segmentation comes from densely-annotated data for fully-supervised learning.

Ranked #14 on Action Segmentation on Breakfast

Action Segmentation, Domain Adaptation

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

1 code implementation • CVPR 2020 • Min-Hung Chen, Baopu Li, Yingze Bao, Ghassan AlRegib, Zsolt Kira

Despite the recent progress of fully-supervised action segmentation techniques, the performance is still not fully satisfactory.

Ranked #11 on Action Segmentation on GTEA

Action Segmentation, Domain Adaptation

154

Interpretable Self-Attention Temporal Reasoning for Driving Behavior Understanding

no code implementations • 6 Nov 2019 • Yi-Chieh Liu, Yung-An Hsieh, Min-Hung Chen, Chao-Han Huck Yang, Jesper Tegner, Yi-Chang James Tsai

Performing driving behaviors based on causal reasoning is essential to ensure driving safety.

Traffic Sign Detection under Challenging Conditions: A Deeper Look Into Performance Variations and Spectral Characteristics

2 code implementations • 29 Aug 2019 • Dogancan Temel, Min-Hung Chen, Ghassan AlRegib

We investigate the effect of challenging conditions through spectral analysis and show that challenging conditions can lead to distinct magnitude spectrum characteristics.

Traffic Sign Detection, Traffic Sign Recognition

Temporal Attentive Alignment for Large-Scale Video Domain Adaptation

5 code implementations • ICCV 2019 • Min-Hung Chen, Zsolt Kira, Ghassan AlRegib, Jaekwon Yoo, Ruxin Chen, Jian Zheng

Finally, we propose Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on four video DA datasets (e.g. 7.9% accuracy gain over "Source only" from 73.9% to 81.8% on "HMDB --> UCF", and 10.3% gain on "Kinetics --> Gameplay").

Ranked #3 on Unsupervised Domain Adaptation on Jester (Gesture Recognition)

Unsupervised Domain Adaptation

257

Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding

no code implementations • 16 Jun 2019 • Jian Zheng, Sudha Krishnamurthy, Ruxin Chen, Min-Hung Chen, Zhenhao Ge, Xiaohua LI

However, little work has been done for game image captioning which has some unique characteristics and requirements.

Caption Generation, Image Captioning, +1

Temporal Attentive Alignment for Video Domain Adaptation

5 code implementations • 26 May 2019 • Min-Hung Chen, Zsolt Kira, Ghassan AlRegib

Ranked #1 on Domain Adaptation on UCF-to-Olympic

Domain Adaptation

257

Challenging Environments for Traffic Sign Detection: Reliability Assessment under Inclement Conditions

2 code implementations • 19 Feb 2019 • Dogancan Temel, Tariq Alshawi, Min-Hung Chen, Ghassan AlRegib

Experimental results show that benchmarked algorithms are highly sensitive to tested challenging conditions, which result in an average performance drop of 0.17 in terms of precision and a performance drop of 0.28 in recall under severe conditions.

Traffic Sign Detection

TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition

4 code implementations • 30 Mar 2017 • Chih-Yao Ma, Min-Hung Chen, Zsolt Kira, Ghassan AlRegib

We demonstrate that both RNNs (using LSTMs) and Temporal-ConvNets applied to spatiotemporal feature matrices are able to exploit spatiotemporal dynamics to improve the overall performance.

Ranked #54 on Action Recognition on UCF101

Action Classification, Action Recognition, +3

844

Depth and Skeleton Associated Action Recognition without Online Accessible RGB-D Cameras

no code implementations • CVPR 2014 • Yen-Yu Lin, Ju-Hsuan Hua, Nick C. Tang, Min-Hung Chen, Hong-Yuan Mark Liao

Our approach aims to enhance action recognition in RGB videos by leveraging the extra database.

Action Recognition, Temporal Action Localization
