1 code implementation • 30 Aug 2024 • Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen, Hung-Ting Su, Shang-Hong Lai, Winston H. Hsu
Existing research often treats long-form videos as extended short videos, leading to several limitations: inadequate capture of long-range dependencies, inefficient processing of redundant information, and failure to extract high-level semantic concepts.
Video Classification, Zero-Shot Long Video Breakpoint-Mode Question Answering
1 code implementation • 28 Aug 2024 • Yu-Hsuan Hsieh, Shang-Hong Lai
To improve logical anomaly detection, some previous works have integrated segmentation techniques with conventional anomaly detection methods.
Ranked #1 on Anomaly Detection on MVTec LOCO AD
no code implementations • 28 Aug 2024 • Wei-Jhe Huang, Min-Hung Chen, Shang-Hong Lai
In this paper, we aim to adapt pretrained image-language models to detect unseen actions.
no code implementations • 15 Dec 2023 • Ho-Weng Lee, Shang-Hong Lai
The proposed anomaly backbone provides a foundation model for more precise anomaly detection and localization.
1 code implementation • 19 Sep 2023 • Jia Luo Peng, Keng Wei Chang, Shang-Hong Lai
Kinship verification is an emerging task in computer vision with multiple potential applications.
1 code implementation • ICCV 2023 • Cheng-Che Cheng, Min-Xuan Qiu, Chen-Kuo Chiang, Shang-Hong Lai
Experimental results show that the proposed graph model extracts more discriminative features for object tracking, and our model achieves state-of-the-art performance on several public datasets.
Ranked #4 on Multi-Object Tracking on Wildtrack
no code implementations • 25 Jun 2023 • Chih-Jung Chang, Yaw-Chern Lee, Shih-Hsuan Yao, Min-Hung Chen, Chien-Yi Wang, Shang-Hong Lai, Trista Pei-Chun Chen
Face anti-spoofing (FAS) is indispensable for a face recognition system.
1 code implementation • 10 Apr 2023 • Wei-Jhe Huang, Jheng-Hsien Yeh, Min-Hung Chen, Gueter Josmy Faure, Shang-Hong Lai
Finally, we calculate the similarity between the interaction feature and the text feature for each label to determine the action category.
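The final step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes cosine similarity as the similarity measure (the excerpt only says "similarity"), and the function names, label names, and toy vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_action(interaction_feature, text_features):
    """Score one interaction feature against the text feature of every label.

    text_features maps a label name to its text feature vector.
    Returns the best-scoring label and the full score dictionary.
    """
    scores = {
        label: cosine_similarity(interaction_feature, vec)
        for label, vec in text_features.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical toy features; real ones would come from the visual and text encoders.
text_features = {
    "run": [1.0, 0.0, 0.0],
    "jump": [0.0, 1.0, 0.0],
    "wave": [0.0, 0.0, 1.0],
}
interaction_feature = [0.2, 0.9, 0.1]

label, scores = classify_action(interaction_feature, text_features)
print(label)
```

Because the label set enters only through its text features, a label unseen during training can in principle be scored just by adding its text feature to the dictionary.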
no code implementations • 10 Apr 2023 • Weng-Tai Su, Min-Hung Chen, Chien-Yi Wang, Shang-Hong Lai, Trista Pei-Chun Chen
Kinship recognition aims to determine whether the subjects in two facial images are kin or non-kin, which is an emerging and challenging problem.
no code implementations • 29 Nov 2022 • Chu-Chun Chuang, Chien-Yi Wang, Shang-Hong Lai
With the increasing variations of face presentation attacks, model generalization becomes an essential challenge for a practical face anti-spoofing system.
1 code implementation • 28 Nov 2022 • Fu-En Wang, Chien-Yi Wang, Min Sun, Shang-Hong Lai
In this paper, we propose the MixFairFace framework to improve fairness in face recognition models.
1 code implementation • 23 Oct 2022 • Gueter Josmy Faure, Min-Hung Chen, Shang-Hong Lai
Actions are about how we interact with the environment, including other people, objects, and ourselves.
Ranked #1 on Action Detection on MultiSports
1 code implementation • 1 Apr 2022 • PoHao Hsu, Che-Tsung Lin, Chun Chet Ng, Jie-Long Kew, Mei Yih Tan, Shang-Hong Lai, Chee Seng Chan, Christopher Zach
Deep learning-based methods have made impressive progress in enhancing extremely low-light images, and the quality of the reconstructed images has generally improved.
no code implementations • CVPR 2022 • Wenbin Zhu, Chien-Yi Wang, Kuan-Lun Tseng, Shang-Hong Lai, Baoyuan Wang
Leveraging the environment-specific local data available after the initial global model is deployed, LaFR aims to achieve optimal performance by training locally adapted models automatically and without supervision, as opposed to keeping the initial global model fixed.
2 code implementations • CVPR 2022 • Chien-Yi Wang, Yu-Ding Lu, Shang-Ta Yang, Shang-Hong Lai
Previous works leverage auxiliary pixel-level supervision and domain generalization approaches to address unseen spoof types.
1 code implementation • 23 Dec 2021 • Chih-Ting Liu, Chien-Yi Wang, Shao-Yi Chien, Shang-Hong Lai
Current state-of-the-art deep-learning-based face recognition (FR) models require a large number of face identities for central training.
no code implementations • 22 Dec 2021 • Meng-Tzu Chiu, Hsun-Ying Cheng, Chien-Yi Wang, Shang-Hong Lai
Our DepthNet is used to augment a large 2D face image dataset into a large RGB-D face dataset, which is then used to train an accurate RGB-D face recognition model.
no code implementations • 18 Oct 2021 • Yu-Chun Wang, Chien-Yi Wang, Shang-Hong Lai
Unlike previous FAS disentanglement works with one-stage architecture, we found that the dual-stage training design can improve the training stability and effectively encode the features to detect unseen attack types.
no code implementations • ECCV 2020 • Yu-Hui Lee, Shang-Hong Lai
In this paper, we propose a novel image-to-image GAN framework for eyeglasses removal, called ByeGlassesGAN, which is used to automatically detect the position of eyeglasses and then remove them from face images.
Multimedia
no code implementations • 11 Aug 2020 • Chien-Yi Wang, Ya-Liang Chang, Shang-Ta Yang, Dong Chen, Shang-Hong Lai
We propose a unified representation learning framework to address the Cross Model Compatibility (CMC) problem in the context of visual search applications.
no code implementations • ECCV 2018 • Sheng-Wei Huang, Che-Tsung Lin, Shu-Ping Chen, Yen-Yi Wu, Po-Hao Hsu, Shang-Hong Lai
Deep-learning-based image-to-image translation methods aim to learn the joint distribution of the two domains and to find transformations between them.
Ranked #31 on Link Prediction on WN18
no code implementations • CVPR 2015 • Hong-Ren Su, Shang-Hong Lai
Registration between images taken with different cameras, from different viewpoints or under different lighting conditions is a challenging problem.