Search Results for author: Zuozhuo Dai

Found 16 papers, 8 papers with code

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

1 code implementation 21 Mar 2024 Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu

In this study, we introduce a methodology for human image animation by leveraging a 3D human parametric model within a latent diffusion framework to enhance shape alignment and motion guidance in current human generative techniques.

Animated GIF Generation Image Animation +1

EffiVED: Efficient Video Editing via Text-instruction Diffusion Models

no code implementations 18 Mar 2024 Zhenghao Zhang, Zuozhuo Dai, Long Qin, Weizhi Wang

Large-scale text-to-video models have shown remarkable abilities, but their direct application in video editing remains challenging due to limited available datasets.

Video Editing

Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle

no code implementations 6 Dec 2023 Youtian Lin, Zuozhuo Dai, Siyu Zhu, Yao Yao

Moreover, the explicit deformation modeling for discretized Gaussian points ensures ultra-fast training and rendering of a 4D scene, which is comparable to the original 3DGS designed for static 3D reconstruction.

3D Reconstruction 4D reconstruction +1

Fine-grained Text-Video Retrieval with Frozen Image Encoders

no code implementations 14 Jul 2023 Zuozhuo Dai, Fangtao Shao, Qingkun Su, Zilong Dong, Siyu Zhu

In the second stage, we propose a novel decoupled video text cross attention module to capture fine-grained multimodal information in spatial and temporal dimensions.

Retrieval Video Retrieval

UVOSAM: A Mask-free Paradigm for Unsupervised Video Object Segmentation via Segment Anything Model

no code implementations 22 May 2023 Zhenghao Zhang, Zhichao Wei, Shengfan Zhang, Zuozhuo Dai, Siyu Zhu

Unsupervised video object segmentation has made significant progress in recent years, but the manual annotation of video mask datasets is expensive and limits the diversity of available datasets.

Image Segmentation Object +5

Towards Robust Video Instance Segmentation with Temporal-Aware Transformer

no code implementations 20 Jan 2023 Zhenghao Zhang, Fangtao Shao, Zuozhuo Dai, Siyu Zhu

In this paper, we observe that temporal information is important as well, and we propose TAFormer to aggregate spatio-temporal features in both the transformer encoder and decoder.

Instance Segmentation Semantic Segmentation +1

RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds

no code implementations 23 May 2022 Xiaodong Gu, Chengzhou Tang, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Ping Tan

In the experiments, we evaluate the proposed method on both the 3D scene flow estimation and the point cloud registration task.

Motion Estimation Point Cloud Registration +1

NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation

1 code implementation CVPR 2022 Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan

While recent works design increasingly complicated and powerful networks to directly regress the depth map, we take the path of CRFs optimization.

Depth Prediction Monocular Depth Estimation

RCP: Recurrent Closest Point for Point Cloud

1 code implementation CVPR 2022 Xiaodong Gu, Chengzhou Tang, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Ping Tan

In the experiments, we evaluate the proposed method on both the 3D scene flow estimation and the point cloud registration task.

Motion Estimation Point Cloud Registration +1

Neural Window Fully-Connected CRFs for Monocular Depth Estimation

no code implementations CVPR 2022 Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan

Estimating the accurate depth from a single image is challenging since it is inherently ambiguous and ill-posed.

Monocular Depth Estimation

DRO: Deep Recurrent Optimizer for Video to Depth

1 code implementation 24 Mar 2021 Xiaodong Gu, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Chengzhou Tang, Zilong Dong, Ping Tan

There is increasing interest in studying the video-to-depth (V2D) problem with machine learning techniques.

MeshMVS: Multi-View Stereo Guided Mesh Reconstruction

no code implementations 17 Oct 2020 Rakesh Shrestha, Zhiwen Fan, Qingkun Su, Zuozhuo Dai, Siyu Zhu, Ping Tan

Deep learning-based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects and guide the shape generation process.

3D Shape Generation

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching

4 code implementations CVPR 2020 Xiaodong Gu, Zhiwen Fan, Zuozhuo Dai, Siyu Zhu, Feitong Tan, Ping Tan

The deep multi-view stereo (MVS) and stereo matching approaches generally construct 3D cost volumes to regularize and regress the output depth or disparity.

3D Reconstruction Point Clouds +1

Batch DropBlock Network for Person Re-identification and Beyond

5 code implementations ICCV 2019 Zuozhuo Dai, Mingqiang Chen, Xiaodong Gu, Siyu Zhu, Ping Tan

In this paper, we propose the Batch DropBlock (BDB) Network, which is a two-branch network composed of a conventional ResNet-50 as the global branch and a feature dropping branch.

Image Retrieval Metric Learning +1
