Search Results for author: Hangjie Yuan

Found 17 papers, 13 papers with code

Make Continual Learning Stronger via C-Flat

no code implementations • 1 Apr 2024 • Ang Bian, Wei Li, Hangjie Yuan, Chengrong Yu, Zixiang Zhao, Mang Wang, Aojun Lu, Tao Feng

This paper presents a general framework of C-Flat applicable to all CL categories, together with a thorough comparison against loss-minima optimizers and flat-minima-based CL approaches, showing that our method can boost CL performance in almost all cases.

Continual Learning

LUM-ViT: Learnable Under-sampling Mask Vision Transformer for Bandwidth Limited Optical Signal Acquisition

1 code implementation • 3 Mar 2024 • Lingfeng Liu, Dong Ni, Hangjie Yuan

To tackle this hurdle, we introduce a novel approach leveraging pre-acquisition modulation to reduce the acquisition volume.

Binarization

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos

1 code implementation • 25 Dec 2023 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Zhiwu Qing, Biao Gong, Yingya Zhang, Yujun Shen, Changxin Gao, Nong Sang

Following such a pipeline, we study the effect of doubling the scale of the training set (i.e., video-only WebVid10M) with some randomly collected text-free videos and are encouraged to observe the performance improvement (FID from 9.67 to 8.19 and FVD from 484 to 441), demonstrating the scalability of our approach.

Text-to-Image Generation, Text-to-Video Generation, +2

InstructVideo: Instructing Video Diffusion Models with Human Feedback

1 code implementation • 19 Dec 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Yujie Wei, Tao Feng, Yining Pan, Yingya Zhang, Ziwei Liu, Samuel Albanie, Dong Ni

To tackle this problem, we propose InstructVideo to instruct text-to-video diffusion models with human feedback by reward fine-tuning.

Video Generation

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion

1 code implementation • 7 Dec 2023 • Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan

In motion learning, we architect a motion adapter and fine-tune it on the given videos to effectively model the target motion pattern.

Image Generation, Video Generation

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

3 code implementations • 7 Nov 2023 • Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qin, Xiang Wang, Deli Zhao, Jingren Zhou

By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos.

From Denoising Training to Test-Time Adaptation: Enhancing Domain Generalization for Medical Image Segmentation

1 code implementation • 31 Oct 2023 • Ruxue Wen, Hangjie Yuan, Dong Ni, Wenbo Xiao, Yaoyao Wu

In medical image segmentation, domain generalization poses a significant challenge due to domain shifts caused by variations in data acquisition devices and other factors.

Denoising, Domain Generalization, +5

Few-shot Action Recognition with Captioning Foundation Models

no code implementations • 16 Oct 2023 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Yingya Zhang, Changxin Gao, Deli Zhao, Nong Sang

In this paper, we develop an effective plug-and-play framework called CapFSAR to exploit the knowledge of multimodal models without manually annotating text.

Few-Shot Action Recognition

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

3 code implementations • ICCV 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Samuel Albanie, Yining Pan, Tao Feng, Jianwen Jiang, Dong Ni, Yingya Zhang, Deli Zhao

In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.

Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)

Graph Generation, Human-Object Interaction Detection, +6

ModelScope Text-to-Video Technical Report

3 code implementations • 12 Aug 2023 • Jiuniu Wang, Hangjie Yuan, Dayou Chen, Yingya Zhang, Xiang Wang, Shiwei Zhang

This paper introduces ModelScopeT2V, a text-to-video synthesis model that evolves from a text-to-image synthesis model (i.e., Stable Diffusion).

Denoising, Image Generation, +1

Refined Response Distillation for Class-Incremental Player Detection

no code implementations • 1 May 2023 • Liang Bai, Hangjie Yuan, Tao Feng, Hong Song, Jian Yang

Furthermore, we present the NBA-IOD and Volleyball-IOD datasets as the benchmark and investigate the IOD tasks of the players systematically.

Knowledge Distillation, object-detection, +1

Progressive Learning without Forgetting

no code implementations • 28 Nov 2022 • Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang

Learning from changing tasks and sequential experience without forgetting the obtained knowledge is a challenging problem for artificial neural networks.

Continual Learning

RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection

3 code implementations • 5 Sep 2022 • Hangjie Yuan, Jianwen Jiang, Samuel Albanie, Tao Feng, Ziyuan Huang, Dong Ni, Mingqian Tang

The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications.

Human-Object Interaction Detection, Relation, +1

Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation

1 code implementation • CVPR 2022 • Tao Feng, Mang Wang, Hangjie Yuan

In this paper, we propose a response-based incremental distillation method, dubbed Elastic Response Distillation (ERD), which focuses on elastically learning responses from the classification head and the regression head.

Class-Incremental Object Detection, Incremental Learning, +3

Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics

1 code implementation • 1 Feb 2022 • Hangjie Yuan, Mang Wang, Dong Ni, Liangpeng Xu

Specifically, we propose to utilize a Verb Semantic Model (VSM) and use semantic aggregation to profit from this object-guided hierarchy.

Human-Object Interaction Detection, Object, +2

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition

2 code implementations • ICCV 2021 • Hangjie Yuan, Dong Ni, Mang Wang

Within each interaction field, we apply Dynamic Relation (DR) to predict the relation matrix and Dynamic Walk (DW) to predict the dynamic walk offsets in a joint-processing manner, thus forming a person-specific interaction graph.

Group Activity Recognition, Relation
