no code implementations • 24 Jul 2024 • Yiming Xie, Chun-Han Yao, Vikram Voleti, Huaizu Jiang, Varun Jampani
We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-view consistent dynamic 3D content generation.
no code implementations • 17 Jul 2024 • Lei Zhong, Yiming Xie, Varun Jampani, Deqing Sun, Huaizu Jiang
We introduce a novel Stylized Motion Diffusion model, dubbed SMooDi, to generate stylized motion driven by content texts and style motion sequences.
no code implementations • CVPR 2024 • Yiming Xie, Henglu Wei, Zhenyi Liu, Xiaoyu Wang, Xiangyang Ji
To advance research in learning-based defogging algorithms, various synthetic fog datasets have been developed.
no code implementations • 11 Dec 2023 • Xiaogang Peng, Yiming Xie, Zizhao Wu, Varun Jampani, Deqing Sun, Huaizu Jiang
We also develop an affordance prediction diffusion model (APDM) to predict the contacting area between the human and object during the interactions driven by the textual prompt.
1 code implementation • 12 Oct 2023 • Yiming Xie, Varun Jampani, Lei Zhong, Deqing Sun, Huaizu Jiang
We present a novel approach named OmniControl for incorporating flexible spatial control signals into a text-conditioned human motion generation model based on the diffusion process.
no code implementations • ICCV 2023 • Yiming Xie, Huaizu Jiang, Georgia Gkioxari, Julian Straub
We present PARQ - a multi-view 3D object detector with transformer and pixel-aligned recurrent queries.
1 code implementation • 16 Aug 2023 • Fangrui Zhu, Yiming Xie, Weidi Xie, Huaizu Jiang
To address this issue, in this paper, we introduce a diagnosis toolbox to provide detailed quantitative break-down analysis of HOI detection models, inspired by the success of object detection diagnosis toolboxes.
1 code implementation • 3 Jul 2023 • Aniket Gupta, Yiming Xie, Hanumant Singh, Huaizu Jiang
Previous dominant point cloud registration approaches match these feature representations as the first step, e.g., using the Sinkhorn algorithm.
1 code implementation • CVPR 2022 • Yiming Xie, Matheus Gadelha, Fengting Yang, Xiaowei Zhou, Huaizu Jiang
We present PlanarRecon -- a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video.
3 code implementations • CVPR 2021 • Jiaming Sun, Yiming Xie, Linghao Chen, Xiaowei Zhou, Hujun Bao
We present a novel framework named NeuralRecon for real-time 3D scene reconstruction from a monocular video.
no code implementations • ICCV 2021 • Jiaming Sun, Yiming Xie, Siyu Zhang, Linghao Chen, Guofeng Zhang, Hujun Bao, Xiaowei Zhou
In this work, we propose a novel system for integrated 3D object detection and tracking, which uses a dynamic object occupancy map and previous object states as spatial-temporal memory to assist object detection in future frames.
no code implementations • 17 Nov 2020 • Roman Kolcun, Diana Andreea Popescu, Vadim Safronov, Poonam Yadav, Anna Maria Mandalari, Yiming Xie, Richard Mortier, Hamed Haddadi
We therefore evaluate our approach using hardware resources and data sources representative of those that would be available at the edge of the network, such as in an IoT deployment.
1 code implementation • CVPR 2020 • Jiaming Sun, Linghao Chen, Yiming Xie, Siyu Zhang, Qinhong Jiang, Xiaowei Zhou, Hujun Bao
In this paper, we propose a novel system named Disp R-CNN for 3D object detection from stereo images.
3D Object Detection From Stereo Images • Disparity Estimation