2 code implementations • ECCV 2020 • Yun-Zhu Song, Zhi Rui Tam, Hung-Jen Chen, Huiao-Han Lu, Hong-Han Shuai
Unlike video generation, which focuses on maintaining the continuity of generated images (frames), story visualization emphasizes preserving the global consistency of characters and scenes across different story pictures. This is very challenging because story sentences provide only sparse signals for generating images.
Ranked #4 on Story Visualization on Pororo (using extra training data)
no code implementations • 15 Jan 2025 • Jhe-Hao Lin, Yi Yao, Chan-Feng Hsu, HongXia Xie, Hong-Han Shuai, Wen-Huang Cheng
Knowledge distillation (KD) involves transferring knowledge from a pre-trained heavy teacher model to a lighter student model, thereby reducing the inference cost while maintaining comparable effectiveness.
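As a rough illustration of the teacher-to-student transfer described above, the sketch below computes a temperature-softened distillation loss in the commonly used soft-target form. It is not the method of this paper: the function names, the default temperature, and the example logits are assumptions chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: a generic soft-target loss, not a specific paper's.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]   # made-up logits
    student = [1.5, 1.2, 0.3]   # made-up logits
    print(distillation_loss(teacher, student))
```

In practice this term is usually combined with the ordinary cross-entropy loss on ground-truth labels; the weighting between the two is a design choice that this sketch leaves out.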
1 code implementation • 16 Dec 2024 • Yu-Hsuan Huang, Ling Lo, HongXia Xie, Hong-Han Shuai, Wen-Huang Cheng
Sequential recommendation (SR) systems predict user preferences by analyzing time-ordered interaction sequences.
no code implementations • 20 Oct 2024 • Hao-Tang Tsui, Yu-Rou Tuan, Hong-Han Shuai
This can be solved in two ways: the forward kinematics method and the inverse kinematics method.
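To make the two terms concrete, here is a minimal sketch for a two-link planar arm. It is a textbook toy case, not the setting of this paper: the link lengths, function names, and the choice of the non-negative elbow-angle solution are all assumptions for illustration.

```python
import math


def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Joint angles (radians) -> end-effector position (x, y)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """End-effector position -> one joint-angle solution (theta2 >= 0)."""
    # Law of cosines gives the elbow angle.
    cos_t2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= cos_t2 <= 1.0:
        raise ValueError("target is out of reach")
    theta2 = math.acos(cos_t2)
    # Shoulder angle: direction to the target minus the offset caused by link 2.
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2


if __name__ == "__main__":
    x, y = forward_kinematics(0.3, 0.8)
    print(inverse_kinematics(x, y))  # recovers the angles up to rounding
```

Forward kinematics is a direct evaluation, while inverse kinematics generally has multiple solutions (or none); this sketch returns only one of the two mirror-image configurations.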
1 code implementation • 1 Oct 2024 • Chieh-Yun Chen, Chiang Tseng, Li-Wu Tsao, Hong-Han Shuai
In this paper, we share a comprehensive analysis of text embedding: i) how text embedding contributes to the generated images and ii) why information gets lost and biases towards the first-mentioned object.
1 code implementation • 25 Jul 2024 • Jian-Yu Jiang-Lin, Kang-Yang Huang, Ling Lo, Yi-Ning Huang, Terence Lin, Jhih-Ciang Wu, Hong-Han Shuai, Wen-Huang Cheng
Our model couples Latent Diffusion Models with Visual Language Models to refine the generation process, ensuring precise depictions of HOIs.
no code implementations • 17 Jul 2024 • Yi Yao, Chan-Feng Hsu, Jhe-Hao Lin, HongXia Xie, Terence Lin, Yi-Ning Huang, Hong-Han Shuai, Wen-Huang Cheng
In spite of recent advancements in text-to-image generation, limitations persist in handling complex and imaginative prompts due to the restricted diversity and complexity of training data.
2 code implementations • 9 Jun 2024 • Hou-I Liu, Yu-Wen Tseng, Kai-Cheng Chang, Pin-Jyun Wang, Hong-Han Shuai, Wen-Huang Cheng
Second, based on the two-stage framework, we replace the obsolete R-CNN detector with a novel Trans R-CNN detector to focus on the representation of tiny objects with self-attention.
Ranked #1 on Object Detection on AI-TOD
no code implementations • 31 May 2024 • Chien-Kun Huang, Yi-Ting Chang, Lun-Wei Ku, Cheng-Te Li, Hong-Han Shuai
This paper provides an overview of the Fake-EmoReact 2021 Challenge, held at the 9th SocialNLP Workshop, in conjunction with NAACL 2021.
1 code implementation • 21 May 2024 • Li-Yang Tseng, Tzu-Ling Lin, Hong-Han Shuai, Jen-Wei Huang, Wen-Whei Chang
Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity.
1 code implementation • CVPR 2024 • HongXia Xie, Chu-Jun Peng, Yu-Wen Tseng, Hung-Jen Chen, Chan-Feng Hsu, Hong-Han Shuai, Wen-Huang Cheng
Visual Instruction Tuning represents a novel learning paradigm involving the fine-tuning of pre-trained language models using task-specific instructions.
1 code implementation • 19 Apr 2024 • Teng-Fang Hsiao, Bo-Kai Ruan, Hong-Han Shuai
TF-GPH incorporates a novel "Similarity Disentangle Mask", which disentangles the foreground content and background image by redirecting their attention to corresponding reference images, enhancing the attention mechanism for multi-image inputs.
no code implementations • 8 Apr 2024 • Hou-I Liu, Marco Galindo, HongXia Xie, Lai-Kuan Wong, Hong-Han Shuai, Yung-Hui Li, Wen-Huang Cheng
Over the past decade, the dominance of deep learning has prevailed across various domains of artificial intelligence, including natural language processing, computer vision, and biomedical signal processing.
no code implementations • 7 Apr 2024 • Hou-I Liu, Christine Wu, Jen-Hao Cheng, Wenhao Chai, Shian-Yun Wang, Gaowen Liu, Jenq-Neng Hwang, Hong-Han Shuai, Wen-Huang Cheng
Subsequently, we introduce the cross-modal residual distillation to transfer the 3D spatial cues.
2 code implementations • 4 Apr 2024 • Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai, Wen-Huang Cheng
DQ-DETR uses the prediction and density maps from the categorical counting module to dynamically adjust the number of object queries and improve the positional information of queries.
no code implementations • 4 Mar 2024 • Zhi-Rui Tam, Ya-Ting Pai, Yen-Wei Lee, Jun-Da Chen, Wei-Min Chu, Sega Cheng, Hong-Han Shuai
We present TMMLU+, a new benchmark designed for Traditional Chinese language understanding.
no code implementations • CVPR 2024 • Ling Lo, Cheng Yu Yeo, Hong-Han Shuai, Wen-Huang Cheng
To address the concerns, we propose an image immunization approach, named semantic attack, to protect our images from being manipulated by malicious agents using diffusion models.
no code implementations • ICCV 2023 • Yi-Syuan Chen, Yun-Zhu Song, Cheng Yu Yeo, Bei Liu, Jianlong Fu, Hong-Han Shuai
To this end, we raise a question: "How can we enable in-context learning without relying on the intrinsic in-context ability of large language models?"
no code implementations • 27 Jun 2023 • Hung-Yun Chiang, Yi-Syuan Chen, Yun-Zhu Song, Hong-Han Shuai, Jason S. Chang
Review-Based Recommender Systems (RBRS) have attracted increasing research interest due to their ability to alleviate well-known cold-start problems.
no code implementations • 24 Mar 2023 • Yi-Syuan Chen, Yun-Zhu Song, Hong-Han Shuai
The generated summaries could therefore be constrained by the preference bias in the training set, especially under low-resource settings.
1 code implementation • ICCV 2023 • Chieh-Yun Chen, Yi-Chung Chen, Hong-Han Shuai, Wen-Huang Cheng
COTTON leverages clothing structure with landmarks and segmentation to design a novel landmark-guided transformation for precisely deforming clothes, allowing for size adjustment during try-on.
no code implementations • ICCV 2023 • HongXia Xie, Ming-Xian Lee, Tzu-Jui Chen, Hung-Jen Chen, Hou-I Liu, Hong-Han Shuai, Wen-Huang Cheng
Then, the Cross-Patch Attention module is proposed to fuse the features of MIP and global context together to complement each other.
no code implementations • 7 Nov 2022 • Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Yu-Syuan Xu, Man-Yu Lee, Allen Lu, Chia-Ming Cheng, Chih-Cheng Chen, Jia-Ying Yong, Hong-Han Shuai, Wen-Huang Cheng, Zhuang Jia, Tianyu Xu, Yijian Zhang, Long Bao, Heng Sun, Diankai Zhang, Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang, Xiao Sun, HaoDong Wu, Xuncheng Liu, Weizhan Zhang, Caixia Yan, Haipeng Du, Qinghua Zheng, Qi Wang, Wangdu Chen, Ran Duan, Mengdi Sun, Dan Zhu, Guannan Chen, Hojin Cho, Steve Kim, Shijie Yue, Chenghua Li, Zhengyang Zhuge, Wei Chen, Wenxu Wang, Yufeng Zhou, Xiaochen Cai, Hengxing Cai, Kele Xu, Li Liu, Zehua Cheng, Wenyi Lian, Wenjing Lian
While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices.
no code implementations • 7 Jul 2022 • Bo-Kai Ruan, Hong-Han Shuai, Wen-Huang Cheng
Transformers have achieved great success in natural language processing.
1 code implementation • NAACL 2022 • Yun-Zhu Song, Yi-Syuan Chen, Hong-Han Shuai
A notable challenge in Multi-Document Summarization (MDS) is the extremely long input.
1 code implementation • 2 Dec 2021 • Wei-Yao Wang, Hong-Han Shuai, Kai-Shiang Chang, Wen-Chih Peng
The increasing demand for analyzing the insights in sports has stimulated a line of productive studies from a variety of perspectives, e.g., health state monitoring and outcome prediction.
1 code implementation • ICCV 2021 • Chin-Yuan Yeh, Hsi-Wen Chen, Hong-Han Shuai, De-Nian Yang, Ming-Syan Chen
To improve efficiency, we introduce the limit-aware random gradient-free estimation and the gradient sliding mechanism to estimate the gradient that adheres to the adversarial limit, i.e., the pixel value limitations of the adversarial example.
no code implementations • 5 Oct 2021 • Hsu-Chao Lai, Jui-Yi Tsai, Hong-Han Shuai, Jiun-Long Huang, Wang-Chien Lee, De-Nian Yang
In contrast to traditional online videos, live multi-streaming supports real-time social interactions between multiple streamers and viewers, such as donations.
no code implementations • 1 Oct 2021 • Chun-Wei Yang, Thanh-Hai Phung, Hong-Han Shuai, Wen-Huang Cheng
To automate the monitoring process, one of the promising solutions is to leverage existing object detection models to detect the faces with or without masks.
1 code implementation • ICCV 2021 • Yi-Lun Wu, Hong-Han Shuai, Zhi-Rui Tam, Hong-Yu Chiu
In this paper, we propose a novel normalization method called gradient normalization (GN) to tackle the training instability of Generative Adversarial Networks (GANs) caused by the sharp gradient space.
no code implementations • 8 Jul 2021 • Hong-Xia Xie, I-Hsuan Li, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng
In this work, we describe our method for tackling the valence-arousal estimation challenge from ABAW2 ICCV-2021 Competition.
1 code implementation • 25 May 2021 • Yunshan Ma, Yujuan Ding, Xun Yang, Lizi Liao, Wai Keung Wong, Tat-Seng Chua, Jinyoung Moon, Hong-Han Shuai
This companion paper supports the replication of the fashion trend forecasting experiments with the KERN (Knowledge Enhanced Recurrent Network) method that we presented at ICMR 2020.
no code implementations • 8 Mar 2021 • Lei Chen, Shao-En Weng, Chu-Jun Peng, Hong-Han Shuai, Wen-Huang Cheng
Network security has been an active research topic for a long time.
1 code implementation • 18 Feb 2021 • Yi-Syuan Chen, Hong-Han Shuai
Neural abstractive summarization has been widely studied in the literature and has achieved great success with the aid of large corpora.
no code implementations • 6 Feb 2021 • Chien-Lung Chou, Chieh-Yun Chen, Chia-Wei Hsieh, Hong-Han Shuai, Jiaying Liu, Wen-Huang Cheng
Afterward, given an in-shop clothing image, a user image, and a synthesized pose, we propose a novel model for synthesizing a human try-on image with the target clothing in the best fitting pose.
1 code implementation • 29 Jan 2021 • Yu-Jen Ma, Hong-Han Shuai, Wen-Huang Cheng
In this paper, we propose a novel SpatioTemporal convolutional Dense Network (STDNet) to address the video-based crowd counting problem, which contains the decomposition of 3D convolution and the 3D spatiotemporal dilated dense convolution to alleviate the rapid growth of the model size caused by the Conv3D layer.
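The model-size argument above can be checked with simple arithmetic. The sketch below compares the weight count of a full 3D convolution with that of a spatial-then-temporal factorization. The factorization shown (a 1×k×k convolution followed by a k×1×1 convolution), the channel sizes, and the helper names are assumptions for illustration; the snippet does not specify the exact decomposition STDNet uses.

```python
def conv3d_params(c_in, c_out, k_t, k_h, k_w):
    """Weight count of a 3D convolution (bias terms ignored)."""
    return c_in * c_out * k_t * k_h * k_w


def decomposed_params(c_in, c_out, k_t, k_h, k_w, c_mid=None):
    """Weight count of a spatial (1 x k_h x k_w) conv followed by a
    temporal (k_t x 1 x 1) conv. Hypothetical factorization; c_mid is
    the number of intermediate channels (defaults to c_out)."""
    c_mid = c_out if c_mid is None else c_mid
    spatial = conv3d_params(c_in, c_mid, 1, k_h, k_w)
    temporal = conv3d_params(c_mid, c_out, k_t, 1, 1)
    return spatial + temporal


if __name__ == "__main__":
    full = conv3d_params(64, 64, 3, 3, 3)
    split = decomposed_params(64, 64, 3, 3, 3)
    print(full, split)  # 110592 49152
```

With 64 input and output channels and a 3×3×3 kernel, the factorized form needs fewer than half the weights of the full kernel, and the gap widens as the kernel grows.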
1 code implementation • ICCV 2021 • Chieh-Yun Chen, Ling Lo, Pin-Jui Huang, Hong-Han Shuai, Wen-Huang Cheng
In the second stage, we first remove the clothes on the source human via the removed mask and warp the clothing features conditioning on the try-on clothing mask to fit the next frame human.
no code implementations • 21 Dec 2020 • Hong-Xia Xie, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng
Facial micro-expressions are brief and subtle facial movements that appear during emotional communication.
1 code implementation • 31 Oct 2020 • Dang-Khoa Nguyen, Wei-Lun Tseng, Hong-Han Shuai
Domain adaptation aims to transfer knowledge from the source data with annotations to scarcely-labeled data in the target domain, which has attracted a lot of attention in recent years and facilitated many multimedia applications.
no code implementations • 19 Apr 2020 • Ling Lo, Hong-Xia Xie, Hong-Han Shuai, Wen-Huang Cheng
A micro-expression (ME) is a spontaneous, involuntary facial movement that can reveal a person's true feelings.
1 code implementation • 11 Feb 2020 • Shao-Heng Ko, Hsu-Chao Lai, Hong-Han Shuai, De-Nian Yang, Wang-Chien Lee, Philip S. Yu
Shopping in VR malls has been regarded as a paradigm shift for E-commerce, but most of the conventional VR shopping platforms are designed for a single user.
Data Structures and Algorithms
1 code implementation • 6 Feb 2020 • Yun-Zhu Song, Hong-Han Shuai, Sung-Lin Yeh, Yi-Lun Wu, Lun-Wei Ku, Wen-Chih Peng
To generate inspired headlines, we propose a novel framework called POpularity-Reinforced Learning for inspired Headline Generation (PORL-HG).
no code implementations • 25 Sep 2019 • Li-Chun Wang, Chuan-Chi Lai, Hong-Han Shuai, Hsin-Piao Lin, Chi-Yu Li, Teng-Hu Cheng, Chiun-Hsun Chen
Therefore, we propose to develop an "Artificial Intelligence (AI) Drone-Cruiser" base station that can help 5G mobile communication systems and beyond quickly recover the network after a disaster and handle the instant communications by the flash crowd.