Search Results for author: Xiaoyu Shi

Found 20 papers, 10 papers with code

GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking

no code implementations • 5 Jan 2025 • Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li

Specifically, we propose a novel framework that constructs a pseudo 4D Gaussian field with dense 3D point tracking and renders the Gaussian field for all video frames.

Novel View Synthesis · Point Tracking · +1

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

no code implementations • 10 Dec 2024 • Xiao Fu, Xian Liu, Xintao Wang, Sida Peng, Menghan Xia, Xiaoyu Shi, Ziyang Yuan, Pengfei Wan, Di Zhang, Dahua Lin

Previous methods on controllable video generation primarily leverage 2D control signals to manipulate object motions and have achieved remarkable synthesis results.

Video Generation

BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events

no code implementations • 27 Oct 2024 • Yijin Li, Yichen Shen, Zhaoyang Huang, Shuo Chen, Weikang Bian, Xiaoyu Shi, Fu-Yun Wang, Keqiang Sun, Hujun Bao, Zhaopeng Cui, Guofeng Zhang, Hongsheng Li

BlinkVision enables extensive benchmarks on three types of correspondence tasks (optical flow, point tracking, and scene flow estimation) for both image-based and event-based methods, offering new observations, practices, and insights for future research.

Event-based Vision · Optical Flow Estimation · +2

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

1 code implementation • 20 Mar 2024 • Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li

We introduce MOTIA (Mastering Video Outpainting Through Input-Specific Adaptation), a diffusion-based pipeline that leverages both the intrinsic data-specific patterns of the source video and the image/video generative prior for effective outpainting.

AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data

1 code implementation • 1 Feb 2024 • Fu-Yun Wang, Zhaoyang Huang, Weikang Bian, Xiaoyu Shi, Keqiang Sun, Guanglu Song, Yu Liu, Hongsheng Li

This paper introduces an effective method for computation-efficient personalized style video generation without requiring access to any personalized video data.

Conditional Image Generation · Denoising · +2

Cross-modality Attention Adapter: A Glioma Segmentation Fine-tuning Method for SAM Using Multimodal Brain MR Images

no code implementations • 3 Jul 2023 • Xiaoyu Shi, Shurong Chai, Yinhao Li, Jingliang Cheng, Jie Bai, Guohua Zhao, Yen-Wei Chen

However, for medical images with small dataset sizes, deep learning methods struggle to achieve good results on real-world image datasets.

FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow

no code implementations • 8 Jun 2023 • Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Yijin Li, Hongwei Qin, Jifeng Dai, Xiaogang Wang, Hongsheng Li

This paper introduces a novel transformer-based network architecture, FlowFormer, along with the Masked Cost Volume AutoEncoding (MCVA) for pretraining it to tackle the problem of optical flow estimation.

Decoder · Optical Flow Estimation

Context-PIPs: Persistent Independent Particles Demands Spatial Context Features

no code implementations • 3 Jun 2023 • Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yitong Dong, Yijin Li, Hongsheng Li

We tackle the problem of Persistent Independent Particles (PIPs), also called Tracking Any Point (TAP), in videos, which specifically aims at estimating persistent long-term trajectories of query points in videos.

Point Tracking

BlinkFlow: A Dataset to Push the Limits of Event-based Optical Flow Estimation

no code implementations • 14 Mar 2023 • Yijin Li, Zhaoyang Huang, Shuo Chen, Xiaoyu Shi, Hongsheng Li, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

Experiments show that BlinkFlow improves the generalization performance of state-of-the-art methods by more than 40% on average and up to 90%.

Event-based Optical Flow · Optical Flow Estimation

KBNet: Kernel Basis Network for Image Restoration

1 code implementation • 6 Mar 2023 • Yi Zhang, Dasong Li, Xiaoyu Shi, Dailan He, Kangning Song, Xiaogang Wang, Hongwei Qin, Hongsheng Li

In this paper, we propose a kernel basis attention (KBA) module, which introduces learnable kernel bases to model representative image patterns for spatial information aggregation.

Color Image Denoising · Deblurring · +4

Unsupervised Domain Adaptive Fundus Image Segmentation with Category-level Regularization

1 code implementation • 8 Jul 2022 • Wei Feng, Lin Wang, Lie Ju, Xin Zhao, Xin Wang, Xiaoyu Shi, ZongYuan Ge

Existing unsupervised domain adaptation methods based on adversarial learning have achieved good performance in several medical imaging tasks.

Image Segmentation · Semantic Segmentation · +1

Learning the policy for mixed electric platoon control of automated and human-driven vehicles at signalized intersection: a random search approach

no code implementations • 24 Jun 2022 • Xia Jiang, Jian Zhang, Xiaoyu Shi, Jian Cheng

Meanwhile, the simulation results demonstrate the effectiveness of the delay reward, which is designed to outperform the distributed reward mechanism. Compared with normal car-following behavior, the sensitivity analysis reveals that energy can be saved to different extents (39.27%-82.51%) by adjusting the relative importance of the optimization goal.

reinforcement-learning · Reinforcement Learning (RL)

FlowFormer: A Transformer Architecture for Optical Flow

1 code implementation • 30 Mar 2022 • Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li

We introduce the Optical Flow TransFormer, dubbed FlowFormer, a transformer-based neural network architecture for learning optical flow.

Decoder · Optical Flow Estimation

Exploring the Quality of GAN Generated Images for Person Re-Identification

no code implementations • 23 Aug 2021 • Yiqi Jiang, Weihua Chen, Xiuyu Sun, Xiaoyu Shi, Fan Wang, Hao Li

Recently, GAN-based methods have demonstrated strong effectiveness in generating augmentation data for person re-identification (ReID), owing to their ability to bridge the gap between domains and enrich the data variety in feature space.

Diversity · Person Re-Identification · +1

Decoupled Spatial-Temporal Transformer for Video Inpainting

1 code implementation • 14 Apr 2021 • Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li

The seamless combination of these two novel designs forms a better spatial-temporal attention scheme, and our proposed model achieves better performance than state-of-the-art video inpainting approaches with significantly boosted efficiency.

Video Inpainting
