Search Results for author: Fuchen Long

Found 12 papers, 8 papers with code

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

no code implementations 25 Mar 2024 Zhongwei Zhang, Fuchen Long, Yingwei Pan, Zhaofan Qiu, Ting Yao, Yang Cao, Tao Mei

Next, TRIP executes a residual-like dual-path scheme for noise prediction: 1) a shortcut path that directly takes image noise prior as the reference noise of each frame to amplify the alignment between the first frame and subsequent frames; 2) a residual path that employs 3D-UNet over noised video and static image latent codes to enable inter-frame relational reasoning, thereby easing the learning of the residual noise for each frame.

Image to Video Generation, Relational Reasoning +1

Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution

no code implementations 25 Mar 2024 Zhikai Chen, Fuchen Long, Zhaofan Qiu, Ting Yao, Wengang Zhou, Jiebo Luo, Tao Mei

Technically, SATeCo freezes all the parameters of the pre-trained UNet and VAE, and optimizes only two deliberately designed modules, spatial feature adaptation (SFA) and temporal feature alignment (TFA), in the decoders of the UNet and VAE.

Denoising, Image Super-Resolution +3

VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM

no code implementations 2 Jan 2024 Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei

The diffusion model incorporates the reference images as the condition and alignment to strengthen the content consistency of multi-scene videos.

Descriptive, Video Generation

PointClustering: Unsupervised Point Cloud Pre-Training Using Transformation Invariance in Clustering

1 code implementation CVPR 2023 Fuchen Long, Ting Yao, Zhaofan Qiu, Lusong Li, Tao Mei

Feature invariance under different data transformations, i.e., transformation invariance, can be regarded as a type of self-supervision for representation learning.

Clustering, Deep Clustering +4

Dynamic Temporal Filtering in Video Models

1 code implementation 15 Nov 2022 Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Chong-Wah Ngo, Tao Mei

The pre-determined kernel size severely limits the temporal receptive fields, and the fixed weights treat each spatial location across frames equally, resulting in a sub-optimal solution for long-range temporal modeling in natural scenes.

Bi-Calibration Networks for Weakly-Supervised Video Representation Learning

1 code implementation 21 Jun 2022 Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei

The video-to-text/video-to-query projections over text prototypes/query vocabulary then start the text-to-query or query-to-text calibration to estimate the amendment to query or text.

Representation Learning

Stand-Alone Inter-Frame Attention in Video Models

1 code implementation CVPR 2022 Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Jiebo Luo, Tao Mei

In this paper, we present a new recipe of inter-frame attention block, namely Stand-alone Inter-Frame Attention (SIFA), that novelly delves into the deformation across frames to estimate local self-attention on each spatial location.

Action Classification, Action Recognition +1

Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation

1 code implementation 13 Jun 2022 Yingwei Pan, Yehao Li, Yiheng Zhang, Qi Cai, Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei

This paper presents an overview and comparative analysis of our systems designed for the following two tracks in the SAPIEN ManiSkill Challenge 2021. No Interaction Track: this track targets learning policies from pre-collected demonstration trajectories.

Imitation Learning

Learning to Localize Actions from Moments

1 code implementation ECCV 2020 Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei

In this paper, we introduce a new design of transfer learning type to learn action localization for a large set of action categories, but only on action moments from the categories of interest and temporal annotations of untrimmed videos from a small set of action classes.

Action Localization, Transfer Learning

vireoJD-MM at Activity Detection in Extended Videos

no code implementations 20 Jun 2019 Fuchen Long, Qi Cai, Zhaofan Qiu, Zhijian Hou, Yingwei Pan, Ting Yao, Chong-Wah Ngo

This notebook paper presents an overview and comparative analysis of our system designed for activity detection in extended videos (ActEV-PC) in the ActivityNet Challenge 2019.

Action Detection, Action Localization +1
