Search Results for author: Fuchen Long

Found 12 papers, 8 papers with code

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

no code implementations • 25 Mar 2024 • Zhongwei Zhang, Fuchen Long, Yingwei Pan, Zhaofan Qiu, Ting Yao, Yang Cao, Tao Mei

Next, TRIP executes a residual-like dual-path scheme for noise prediction: 1) a shortcut path that directly takes image noise prior as the reference noise of each frame to amplify the alignment between the first frame and subsequent frames; 2) a residual path that employs 3D-UNet over noised video and static image latent codes to enable inter-frame relational reasoning, thereby easing the learning of the residual noise for each frame.

Image to Video Generation Relational Reasoning +1

Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution

no code implementations • 25 Mar 2024 • Zhikai Chen, Fuchen Long, Zhaofan Qiu, Ting Yao, Wengang Zhou, Jiebo Luo, Tao Mei

Technically, SATeCo freezes all the parameters of the pre-trained UNet and VAE, and only optimizes two deliberately-designed spatial feature adaptation (SFA) and temporal feature alignment (TFA) modules, in the decoder of UNet and VAE.

Denoising Image Super-Resolution +3

VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM

no code implementations • 2 Jan 2024 • Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei

The diffusion model incorporates the reference images as the condition and alignment to strengthen the content consistency of multi-scene videos.

Descriptive Video Generation

PointClustering: Unsupervised Point Cloud Pre-Training Using Transformation Invariance in Clustering

1 code implementation • CVPR 2023 • Fuchen Long, Ting Yao, Zhaofan Qiu, Lusong Li, Tao Mei

Feature invariance under different data transformations, i.e., transformation invariance, can be regarded as a type of self-supervision for representation learning.

Clustering Deep Clustering +4

AnchorFormer: Point Cloud Completion From Discriminative Nodes

1 code implementation • CVPR 2023 • Zhikai Chen, Fuchen Long, Zhaofan Qiu, Ting Yao, Wengang Zhou, Jiebo Luo, Tao Mei

Point cloud completion aims to recover the completed 3D shape of an object from its partial observation.

MORPH Object +1

Dynamic Temporal Filtering in Video Models

1 code implementation • 15 Nov 2022 • Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Chong-Wah Ngo, Tao Mei

The pre-determined kernel size severely limits the temporal receptive fields and the fixed weights treat each spatial location across frames equally, resulting in a sub-optimal solution for long-range temporal modeling in natural scenes.

Bi-Calibration Networks for Weakly-Supervised Video Representation Learning

1 code implementation • 21 Jun 2022 • Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei

The video-to-text/video-to-query projections over text prototypes/query vocabulary then start the text-to-query or query-to-text calibration to estimate the amendment to query or text.

Representation Learning

Stand-Alone Inter-Frame Attention in Video Models

1 code implementation • CVPR 2022 • Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Jiebo Luo, Tao Mei

In this paper, we present a new recipe of inter-frame attention block, namely Stand-alone Inter-Frame Attention (SIFA), that novelly delves into the deformation across frames to estimate local self-attention on each spatial location.

Ranked #13 on Action Recognition on Something-Something V1

Action Classification Action Recognition +1

Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation

1 code implementation • 13 Jun 2022 • Yingwei Pan, Yehao Li, Yiheng Zhang, Qi Cai, Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei

This paper presents an overview and comparative analysis of our systems designed for the following two tracks in SAPIEN ManiSkill Challenge 2021: No Interaction Track: The No Interaction track targets for learning policies from pre-collected demonstration trajectories.

Imitation Learning

Learning to Localize Actions from Moments

1 code implementation • ECCV 2020 • Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei

In this paper, we introduce a new design of transfer learning type to learn action localization for a large set of action categories, but only on action moments from the categories of interest and temporal annotations of untrimmed videos from a small set of action classes.

Action Localization Transfer Learning

Gaussian Temporal Awareness Networks for Action Localization

1 code implementation • CVPR 2019 • Fuchen Long, Ting Yao, Zhaofan Qiu, Xinmei Tian, Jiebo Luo, Tao Mei

Temporally localizing actions in a video is a fundamental challenge in video understanding.

Action Localization object-detection +2

vireoJD-MM at Activity Detection in Extended Videos

no code implementations • 20 Jun 2019 • Fuchen Long, Qi Cai, Zhaofan Qiu, Zhijian Hou, Yingwei Pan, Ting Yao, Chong-Wah Ngo

This notebook paper presents an overview and comparative analysis of our system designed for activity detection in extended videos (ActEV-PC) in ActivityNet Challenge 2019.

Action Detection Action Localization +1
