Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

1 code implementation 8 May 2024 Lingdong Kong, Xiang Xu, Jiawei Ren, Wenwei Zhang, Liang Pan, Kai Chen, Wei Tsang Ooi, Ziwei Liu

Efficient data utilization is crucial for advancing 3D scene understanding in autonomous driving, where reliance on heavily human-annotated LiDAR point clouds challenges fully supervised methods.

Autonomous Driving LIDAR Semantic Segmentation

Move Anything with Layered Scene Diffusion

no code implementations 10 Apr 2024 Jiawei Ren, Mengmeng Xu, Jui-Chieh Wu, Ziwei Liu, Tao Xiang, Antoine Toisoul

Diffusion models generate images with an unprecedented level of quality, but how can we freely rearrange image layouts?

Denoising Disentanglement

InsActor: Instruction-driven Physics-based Characters

no code implementations NeurIPS 2023 Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Xiao Ma, Liang Pan, Ziwei Liu

Generating animation of physics-based characters with intuitive control has long been a desirable task with numerous applications.

Motion Planning

DreamGaussian4D: Generative 4D Gaussian Splatting

1 code implementation 28 Dec 2023 Jiawei Ren, Liang Pan, Jiaxiang Tang, Chi Zhang, Ang Cao, Gang Zeng, Ziwei Liu

Remarkable progress has been made in 4D content generation recently.

FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing

1 code implementation NeurIPS 2023 Mingyuan Zhang, Huirong Li, Zhongang Cai, Jiawei Ren, Lei Yang, Ziwei Liu

Notably, FineMoGen further enables zero-shot motion editing with the aid of modern large language models (LLMs), faithfully manipulating motion sequences according to fine-grained instructions.

Motion Synthesis

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing

no code implementations 9 Oct 2023 Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He

In this paper, for the first time, we introduce optical flow into the attention module in the diffusion model's U-Net to address the inconsistency issue for text-to-video editing.

Optical Flow Estimation Text-to-Video Editing

DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation

1 code implementation 28 Sep 2023 Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, Gang Zeng

In contrast to the occupancy pruning used in Neural Radiance Fields, we demonstrate that the progressive densification of 3D Gaussians converges significantly faster for 3D generative tasks.

3D Generation

Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform

no code implementations 31 May 2023 Yuan Xie, Jiawei Ren, Ji Xu

Background noise and variable channel transmission environments make accurate ship-radiated noise recognition difficult.

Transfer Learning

Underwater-Art: Expanding Information Perspectives With Text Templates For Underwater Acoustic Target Recognition

no code implementations 31 May 2023 Yuan Xie, Jiawei Ren, Ji Xu

In our work, we propose to implement Underwater Acoustic Recognition based on Templates made up of rich relevant information (hereinafter called "UART").

Contrastive Learning Descriptive

RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions

1 code implementation 13 Apr 2023 Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu

Our experiments further demonstrate that pre-training and depth-free BEV transformation have the potential to enhance out-of-distribution robustness.

Robust Camera Only 3D Object Detection

Sparse Mixture-of-Experts are Domain Generalizable Learners

1 code implementation 8 Jun 2022 Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu

It is motivated by an empirical finding that transformer-based models trained with empirical risk minimization (ERM) outperform CNN-based models employing state-of-the-art (SOTA) DG algorithms on multiple DG datasets.

Ranked #11 on Domain Generalization on DomainNet (using extra training data)

Domain Generalization Object Recognition

Balanced MSE for Imbalanced Visual Regression

1 code implementation CVPR 2022 Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu

Data imbalance exists ubiquitously in real-world visual regressions, e.g., age estimation and pose estimation, hurting the model's generalizability and fairness.

Age Estimation Fairness

Playing for 3D Human Recovery

no code implementations 14 Oct 2021 Zhongang Cai, Mingyuan Zhang, Jiawei Ren, Chen Wei, Daxuan Ren, Zhengyu Lin, Haiyu Zhao, Lei Yang, Chen Change Loy, Ziwei Liu

Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios.

Bayesian Imbalanced Regression Debiasing

no code implementations 29 Sep 2021 Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu

Compared to imbalanced and long-tailed classification, imbalanced regression has its unique challenges as the regression label space can be continuous, boundless, and high-dimensional.

Age Estimation imbalanced classification

REFINE: Prediction Fusion Network for Panoptic Segmentation

no code implementations 15 Dec 2020 Jiawei Ren, Cunjun Yu, Zhongang Cai, Mingyuan Zhang, Chongsong Chen, Haiyu Zhao, Shuai Yi, Hongsheng Li

Panoptic segmentation aims to generate class and instance predictions for each pixel in the input image, a challenging task that is far more complicated than naively fusing semantic and instance segmentation results.

Instance Segmentation Panoptic Segmentation

Leveraging Localization for Multi-camera Association

no code implementations 7 Aug 2020 Zhongang Cai, Cunjun Yu, Junzhe Zhang, Jiawei Ren, Haiyu Zhao

We present McAssoc, a deep learning approach to the association of detection bounding boxes in different views of a multi-camera system.

Balanced Meta-Softmax for Long-Tailed Visual Recognition

1 code implementation NeurIPS 2020 Jiawei Ren, Cunjun Yu, Shunan Sheng, Xiao Ma, Haiyu Zhao, Shuai Yi, Hongsheng Li

In our experiments, we demonstrate that Balanced Meta-Softmax outperforms state-of-the-art long-tailed classification solutions on both visual recognition and instance segmentation tasks.

General Classification Instance Segmentation

Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction

1 code implementation ECCV 2020 Cunjun Yu, Xiao Ma, Jiawei Ren, Haiyu Zhao, Shuai Yi

In this paper, we present STAR, a Spatio-Temporal grAph tRansformer framework, which tackles trajectory prediction using only attention mechanisms.

Autonomous Driving Pedestrian Trajectory Prediction

