Search Results for author: Dong Chen

Found 108 papers, 69 papers with code

Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification

no code implementations • CVPR 2013 • Dong Chen, Xudong Cao, Fang Wen, Jian Sun

Making a high-dimensional (e.g., 100K-dim) feature for face recognition seems not a good idea because it will bring difficulties on consequent training, computation, and storage.

Ranked #9 on Age-Invariant Face Recognition on CACDVS

Age-Invariant Face Recognition Face Verification +1

TwittDict: Extracting Social Oriented Keyphrase Semantics from Twitter

no code implementations • WS 2015 • Suppawong Tuarob, Wanghuan Chu, Dong Chen, Conrad Tucker

Neural Aggregation Network for Video Face Recognition

no code implementations • CVPR 2017 • Jiaolong Yang, Peiran Ren, Dong-Qing Zhang, Dong Chen, Fang Wen, Hongdong Li, Gang Hua

The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition.

Ranked #7 on Face Verification on IJB-A

Face Recognition Face Verification
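
The idea of mapping a variable number of face features to one fixed-dimension vector can be sketched as a softmax-weighted average. This is only an illustration under stated assumptions, not the paper's network: the function name `aggregate_features` and the `query` vector are hypothetical, and the real model learns its weighting rather than taking it as an argument.

```python
import math


def aggregate_features(features, query):
    """Pool any number of feature vectors into one vector of fixed size.

    `query` is a hypothetical scoring vector (assumed here, learned in practice).
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    # Score each image feature by its dot product with the query vector.
    scores = [sum(q * x for q, x in zip(query, f)) for f in features]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    # Weighted average: the output size depends on the feature dimension only.
    dim = len(query)
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


# Two frames and five frames both yield a 2-dimensional result.
two_frames = aggregate_features([[1.0, 0.0], [0.0, 1.0]], query=[0.0, 0.0])
five_frames = aggregate_features([[1.0, 2.0]] * 5, query=[0.3, -0.1])
```

Because the weights sum to one, the pooled vector stays in the same range as the inputs regardless of how many images the set contains.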

Supervised Transformer Network for Efficient Face Detection

no code implementations • 19 Jul 2016 • Dong Chen, Gang Hua, Fang Wen, Jian Sun

For real-time performance, we run the cascaded network only on regions of interests produced from a boosting cascade face detector.

Ranked #5 on Face Detection on PASCAL Face

Face Detection Region Proposal +1

CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training

3 code implementations • ICCV 2017 • Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua

Our approach models an image as a composition of label and latent attributes in a probabilistic model.

Attribute Data Augmentation +4

Towards Open-Set Identity Preserving Face Synthesis

no code implementations • CVPR 2018 • Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua

We then recombine the identity vector and the attribute vector to synthesize a new face of the subject with the extracted attribute.

Attribute Face Generation

A novel active learning framework for classification: using weighted rank aggregation to achieve multiple query criteria

no code implementations • 27 Sep 2018 • Yu Zhao, Zhenhui Shi, Jingyang Zhang, Dong Chen, Lixu Gu

The proposed method serves as a heuristic means to select high-value samples of high scalability and generality and is implemented through a three-step process: (1) the transformation of the sample selection to sample ranking and scoring, (2) the computation of the self-adaptive weights of each criterion, and (3) the weighted aggregation of each sample rank list.

Active Learning General Classification
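
The three steps listed in the summary above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the criterion names (`uncertainty`, `diversity`) and function names are hypothetical, and the weights are passed in as fixed numbers, whereas the paper computes them self-adaptively.

```python
def rank_positions(scores):
    """Step 1: turn one criterion's scores into ranks (0 = most valuable)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, sample_index in enumerate(order):
        ranks[sample_index] = position
    return ranks


def weighted_rank_aggregation(criteria_scores, weights, k):
    """Steps 2-3: combine per-criterion ranks with weights, return the best k.

    `weights` is assumed to be given; computing it adaptively is out of scope.
    """
    if len(criteria_scores) != len(weights):
        raise ValueError("one weight per criterion is required")
    n = len(criteria_scores[0])
    per_criterion_ranks = [rank_positions(s) for s in criteria_scores]
    # A lower weighted rank sum means the sample is preferred overall.
    aggregated = [
        sum(w * ranks[i] for w, ranks in zip(weights, per_criterion_ranks))
        for i in range(n)
    ]
    return sorted(range(n), key=lambda i: aggregated[i])[:k]


# Hypothetical scores for four unlabeled samples under two query criteria.
uncertainty = [0.9, 0.2, 0.6, 0.4]
diversity = [0.1, 0.8, 0.7, 0.3]
selected = weighted_rank_aggregation([uncertainty, diversity], [0.5, 0.5], k=2)
print(selected)  # [2, 0]
```

Sample 2 is selected first because it ranks second under both criteria, which beats samples that are first under one criterion and last under the other.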

Exploring Hypergraph Representation on Face Anti-spoofing Beyond 2D Attacks

no code implementations • 28 Nov 2018 • Wei Hu, Gusi Te, Ju He, Dong Chen, Zongming Guo

Face anti-spoofing plays a crucial role in protecting face recognition systems from various attacks.

Face Anti-Spoofing Face Recognition

WIDER Face and Pedestrian Challenge 2018: Methods and Results

no code implementations • 19 Feb 2019 • Chen Change Loy, Dahua Lin, Wanli Ouyang, Yuanjun Xiong, Shuo Yang, Qingqiu Huang, Dongzhan Zhou, Wei Xia, Quanquan Li, Ping Luo, Junjie Yan, Jian-Feng Wang, Zuoxin Li, Ye Yuan, Boxun Li, Shuai Shao, Gang Yu, Fangyun Wei, Xiang Ming, Dong Chen, Shifeng Zhang, Cheng Chi, Zhen Lei, Stan Z. Li, Hongkai Zhang, Bingpeng Ma, Hong Chang, Shiguang Shan, Xilin Chen, Wu Liu, Boyan Zhou, Huaxiong Li, Peng Cheng, Tao Mei, Artem Kukharenko, Artem Vasenin, Nikolay Sergievskiy, Hua Yang, Liangqi Li, Qiling Xu, Yuan Hong, Lin Chen, Mingjun Sun, Yirong Mao, Shiying Luo, Yongjun Li, Ruiping Wang, Qiaokang Xie, Ziyang Wu, Lei Lu, Yiheng Liu, Wengang Zhou

This paper presents a review of the 2018 WIDER Challenge on Face and Pedestrian.

Face Detection Pedestrian Detection +2

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set

4 code implementations • 20 Mar 2019 • Yu Deng, Jiaolong Yang, Sicheng Xu, Dong Chen, Yunde Jia, Xin Tong

Recently, deep learning based 3D face reconstruction methods have shown promising results in both quality and efficiency. However, training deep neural networks typically requires a large volume of data, whereas face images with ground-truth 3D face shapes are scarce.

Ranked #3 on 3D Face Reconstruction on Florence (RMSE Cooperative metric)

3D Face Reconstruction Weakly-supervised Learning

2,088

Mask-Guided Portrait Editing with Conditional GANs

no code implementations • CVPR 2019 • Shuyang Gu, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen, Lu Yuan

Portrait editing is a popular subject in photo manipulation.

Data Augmentation Face Generation +3

Face Parsing with RoI Tanh-Warping

2 code implementations • CVPR 2019 • Jinpeng Lin, Hao Yang, Dong Chen, Ming Zeng, Fang Wen, Lu Yuan

It uses hierarchical local based method for inner facial components and global methods for outer facial components.

Face Parsing

373

Deep Exemplar-based Video Colorization

1 code implementation • CVPR 2019 • Bo Zhang, Mingming He, Jing Liao, Pedro V. Sander, Lu Yuan, Amine Bermak, Dong Chen

This paper presents the first end-to-end network for exemplar-based video colorization.

Colorization Semantic correspondence

330

Autonomous Driving using Safe Reinforcement Learning by Incorporating a Regret-based Human Lane-Changing Decision Model

no code implementations • 10 Oct 2019 • Dong Chen, Longsheng Jiang, Yue Wang, Zhaojian Li

The predicted decisions are incorporated in the safety constraints for reinforcement learning in training and in implementation.

Autonomous Driving Decision Making +3

Face X-ray for More General Face Forgery Detection

4 code implementations • CVPR 2020 • Lingzhi Li, Jianmin Bao, Ting Zhang, Hao Yang, Dong Chen, Fang Wen, Baining Guo

For this reason, face X-ray provides an effective way for detecting forgery generated by most existing face manipulation algorithms.

DeepFake Detection Face Swapping

FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping

10 code implementations • 31 Dec 2019 • Lingzhi Li, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen

We propose a novel attributes encoder for extracting multi-level target face attributes, and a new generator with carefully designed Adaptive Attentional Denormalization (AAD) layers to adaptively integrate the identity and the attributes for face synthesis.

Face Generation Face Swapping +1

590

TCM-ICP: Transformation Compatibility Measure for Registering Multiple LIDAR Scans

no code implementations • 4 Jan 2020 • Aby Thomas, Adarsh Sunilkumar, Shankar Shylesh, Aby Abahai T., Subhasree Methirumangalath, Dong Chen, Jiju Peethambaran

In this work, we present an algorithm for registering multiple, overlapping LiDAR scans.

Table-Top Scene Analysis Using Knowledge-Supervised MCMC

no code implementations • 19 Feb 2020 • Ziyuan Liu, Dong Chen, Kai M. Wurm, Georg von Wichert

Our approach to generate scene graphs is probabilistic: Uncertainty in the object poses is addressed by a probabilistic sensor model that is embedded in a data driven MCMC process.

Descriptive Object

Online Semantic Exploration of Indoor Maps

no code implementations • 21 Feb 2020 • Ziyuan Liu, Dong Chen, Georg von Wichert

In this paper we propose a method to extract an abstracted floor plan from typical grid maps using Bayesian reasoning.

GIQA: Generated Image Quality Assessment

1 code implementation • ECCV 2020 • Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

Generative adversarial networks (GANs) have achieved impressive results today, but not all generated images are perfect.

Image Quality Assessment

210

Curved Buildings Reconstruction from Airborne LiDAR Data by Matching and Deforming Geometric Primitives

no code implementations • 22 Mar 2020 • Jingwei Song, Shaobo Xia, Jun Wang, Dong Chen

To this end, we propose a new framework for curved building reconstruction via assembling and deforming geometric primitives.

Cross-domain Correspondence Learning for Exemplar-based Image Translation

3 code implementations • CVPR 2020 • Pan Zhang, Bo Zhang, Dong Chen, Lu Yuan, Fang Wen

The output has the style (e.g., color, texture) in consistency with the semantically corresponding objects in the exemplar.

Ranked #1 on Image-to-Image Translation on ADE20K-Outdoor Labels-to-Photos (FID metric)

Image-to-Image Translation Translation

386

Bringing Old Photos Back to Life

7 code implementations • CVPR 2020 • Zi-Yu Wan, Bo Zhang, Dong-Dong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen

Unlike conventional restoration tasks that can be solved through supervised learning, the degradation in real photos is complex and the domain gap between synthetic images and real old photos makes the network fail to generalize.

Image Restoration Translation

14,410

Uncertainty Quantification for Hyperspectral Image Denoising Frameworks based on Low-rank Matrix Approximation

1 code implementation • 23 Apr 2020 • Jingwei Song, Shaobo Xia, Jun Wang, Mitesh Patel, Dong Chen

Sliding-window based low-rank matrix approximation (LRMA) is a technique widely used in hyperspectral images (HSIs) denoising or completion.

Hyperspectral Image Denoising Image Denoising +1

Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning

4 code implementations • CVPR 2020 • Yu Deng, Jiaolong Yang, Dong Chen, Fang Wen, Xin Tong

Our method can also be used to embed real images into the disentangled latent space.

Contrastive Learning Disentanglement +1

627

Deep 3D Portrait from a Single Image

1 code implementation • CVPR 2020 • Sicheng Xu, Jiaolong Yang, Dong Chen, Fang Wen, Yu Deng, Yunde Jia, Xin Tong

We evaluate the accuracy of our method both in 3D and with pose manipulation tasks on 2D images.

Face Model Stereo Matching

373

Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition

1 code implementation • ECCV 2020 • Di Hu, Xuhong LI, Lichao Mou, Pu Jin, Dong Chen, Liping Jing, Xiaoxiang Zhu, Dejing Dou

With the help of this dataset, we evaluate three proposed approaches for transferring the sound event knowledge to the aerial scene recognition task in a multimodal learning framework, and show the benefit of exploiting the audio information for the aerial scene recognition.

Scene Recognition

PriorGAN: Real Data Prior for Generative Adversarial Nets

1 code implementation • 30 Jun 2020 • Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

To address these two issues, we propose a novel prior that captures the whole real data distribution for GANs, which are called PriorGANs.

210

Unified Representation Learning for Cross Model Compatibility

no code implementations • 11 Aug 2020 • Chien-Yi Wang, Ya-Liang Chang, Shang-Ta Yang, Dong Chen, Shang-Hong Lai

We propose a unified representation learning framework to address the Cross Model Compatibility (CMC) problem in the context of visual search applications.

Face Identification Face Recognition +2

Old Photo Restoration via Deep Latent Space Translation

8 code implementations • 14 Sep 2020 • Zi-Yu Wan, Bo Zhang, Dong-Dong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen

Image Restoration Translation

14,410

SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving

3 code implementations • 19 Oct 2020 • Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, Haitham Bou Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang

We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving.

Autonomous Driving Multi-agent Reinforcement Learning +2

874

GreedyFool: Distortion-Aware Sparse Adversarial Attack

1 code implementation • NeurIPS 2020 • Xiaoyi Dong, Dongdong Chen, Jianmin Bao, Chuan Qin, Lu Yuan, Weiming Zhang, Nenghai Yu, Dong Chen

Sparse adversarial samples are a special branch of adversarial samples that can fool the target model by only perturbing a few pixels.

Adversarial Attack

Learnable Sampling 3D Convolution for Video Enhancement and Action Recognition

no code implementations • 22 Nov 2020 • Shuyang Gu, Jianmin Bao, Dong Chen

A key challenge in video enhancement and action recognition is to fuse useful information from neighboring frames.

Action Recognition Denoising +3

PowerNet: Multi-agent Deep Reinforcement Learning for Scalable Powergrid Control

no code implementations • 24 Nov 2020 • Dong Chen, Kaian Chen, Zhaojian Li, Tianshu Chu, Rui Yao, Feng Qiu, Kaixiang Lin

Specifically, we consider the decentralized inverter-based secondary voltage control problem in distributed generators (DGs), which is first formulated as a cooperative multi-agent reinforcement learning (MARL) problem.

Multi-agent Reinforcement Learning reinforcement-learning +1

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

1 code implementation • CVPR 2021 • Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen

We present the full-resolution correspondence learning for cross-domain images, which aids image translation.

Image-to-Image Translation Semantic correspondence +1

334

Identity-Driven DeepFake Detection

no code implementations • 7 Dec 2020 • Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Dong Chen, Fang Wen, Baining Guo

Our approach takes as input the suspect image/video as well as the target identity information (a reference image or video).

DeepFake Detection Face Swapping

Evidence of topological nodal lines and surface states in the centrosymmetric superconductor SnTaS2

no code implementations • 7 Dec 2020 • Wenqing Chen, Lulu Liu, Wentao Yang, Dong Chen, Zhengtai Liu, Yaobo Huang, Tong Zhang, Haijun Zhang, Zhonghao Liu, D. W. Shen

Utilizing angle-resolved photoemission spectroscopy and first-principles calculations, here, we demonstrate the existence of topological nodal-line states and drumheadlike surface states in centrosymmetric superconductor SnTaS2, which is a type-II superconductor with a critical transition temperature of about 3 K. The valence bands from Ta 5d orbitals and the conduction bands from Sn 5p orbitals cross each other, forming two nodal lines in the vicinity of the Fermi energy without the inclusion of spin-orbit coupling (SOC), protected by the spatial-inversion symmetry and time-reversal symmetry.

Superconductivity

Unsupervised Pre-training for Person Re-identification

1 code implementation • CVPR 2021 • Dengpan Fu, Dongdong Chen, Jianmin Bao, Hao Yang, Lu Yuan, Lei Zhang, Houqiang Li, Dong Chen

In this paper, we present a large scale unlabeled person re-identification (Re-ID) dataset "LUPerson" and make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation.

Ranked #1 on Person Re-Identification on Market-1501 (using extra training data)

Data Augmentation Person Re-Identification +1

214

Robust Meta-learning with Noise via Eigen-Reptile

no code implementations • 1 Jan 2021 • Dong Chen, Lingfei Wu, Siliang Tang, Fangli Xu, Juncheng Li, Chang Zong, Chilie Tan, Yueting Zhuang

In particular, we first cast the meta-overfitting problem (overfitting on sampling and label noise) as a gradient noise problem since few available samples cause meta-learner to overfit on existing examples (clean or corrupted) of an individual task at every gradient step.

Few-Shot Learning

Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation

2 code implementations • CVPR 2021 • Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Yong Wang, Fang Wen

In this paper, we rely on representative prototypes, the feature centroids of classes, to address the two issues for unsupervised domain adaptation.

Ranked #10 on Semantic Segmentation on GTAV-to-Cityscapes Labels

Pseudo Label Semantic Segmentation +2

276
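
The notion of a prototype as the feature centroid of a class can be illustrated with a short sketch. The nearest-prototype labeling shown here is a simplification chosen for illustration, not the paper's denoising procedure, and the function names, class names, and toy feature vectors are all hypothetical.

```python
def class_prototypes(features, labels):
    """Return {label: centroid of all feature vectors carrying that label}."""
    sums, counts = {}, {}
    for feature, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feature)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], feature)]
        counts[label] += 1
    return {label: [v / counts[label] for v in sums[label]] for label in sums}


def nearest_prototype(feature, prototypes):
    """Assign the label whose prototype is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: squared_distance(feature, prototypes[label]))


# Toy 2-D features with hypothetical class names.
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["road", "road", "car", "car"]
prototypes = class_prototypes(features, labels)
pseudo_label = nearest_prototype([9.0, 9.0], prototypes)
print(prototypes)    # {'road': [1.0, 0.0], 'car': [11.0, 10.0]}
print(pseudo_label)  # car
```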

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

1 code implementation • CVPR 2021 • Chulin Xie, Chuxin Wang, Bo Zhang, Hao Yang, Dong Chen, Fang Wen

In this paper, we proposed a novel Style-based Point Generator with Adversarial Rendering (SpareNet) for point cloud completion.

Ranked #1 on Point Cloud Completion on ShapeNet (Earth Mover's Distance metric)

Point Cloud Completion

134

Control Distance IoU and Control Distance IoU Loss Function for Better Bounding Box Regression

1 code implementation • 22 Mar 2021 • Dong Chen, Duoqian Miao

In this paper, we first present an evaluation-feedback module, which is proposed to consist of evaluation system and feedback mechanism.

object-detection Object Detection +1

High-Fidelity and Arbitrary Face Editing

no code implementations • CVPR 2021 • Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian

However, we observe that the generator tends to find a tricky way to hide information from the original image to satisfy the constraint of cycle consistency, making it impossible to maintain the rich details (e.g., wrinkles and moles) of non-editing areas.

Attribute Vocal Bursts Intensity Prediction

Deep Multi-agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic

3 code implementations • 12 May 2021 • Dong Chen, Mohammad Hajidavalloo, Zhaojian Li, Kaian Chen, Yongqiang Wang, Longsheng Jiang, Yue Wang

On-ramp merging is a challenging task for autonomous vehicles (AVs), especially in mixed traffic where AVs coexist with human-driven vehicles (HDVs).

Autonomous Vehicles reinforcement-learning +1

2,348

Robust Mutual Learning for Semi-supervised Semantic Segmentation

no code implementations • 1 Jun 2021 • Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Fang Wen

The proposed robust mutual learning demonstrates state-of-the-art performance on semantic segmentation in low-data regime.

Pseudo Label Semi-Supervised Semantic Segmentation

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

6 code implementations • CVPR 2022 • Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo

By further pretraining on the larger dataset ImageNet-21K, we achieve 87.5% Top-1 accuracy on ImageNet-1K and high segmentation performance on ADE20K with 55.7 mIoU.

Ranked #25 on Semantic Segmentation on ADE20K val

Image Classification Semantic Segmentation

5,245

Instance-wise Hard Negative Example Generation for Contrastive Learning in Unpaired Image-to-Image Translation

no code implementations • ICCV 2021 • Weilun Wang, Wengang Zhou, Jianmin Bao, Dong Chen, Houqiang Li

In this paper, we uncover that the negative examples play a critical role in the performance of contrastive learning for image translation.

Contrastive Learning Image-to-Image Translation +1

Dual Path Learning for Domain Adaptation of Semantic Segmentation

1 code implementation • ICCV 2021 • Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Fang Wen, Wenqiang Zhang

In this paper, based on the observation that domain adaptation frameworks performed in the source and target domain are almost complementary in terms of image translation and SSL, we propose a novel dual path learning (DPL) framework to alleviate visual inconsistency.

Ranked #32 on Synthetic-to-Real Translation on GTAV-to-Cityscapes Labels

Domain Adaptation Segmentation +4

Exploring Temporal Coherence for More General Video Face Forgery Detection

1 code implementation • ICCV 2021 • Yinglin Zheng, Jianmin Bao, Dong Chen, Ming Zeng, Fang Wen

The first stage is a fully temporal convolution network (FTCN).

Ranked #4 on DeepFake Detection on FakeAVCeleb

DeepFake Detection

Proteome-informed machine learning studies of cocaine addiction

1 code implementation • 17 Sep 2021 • Kaifu Gao, Dong Chen, Alfred J Robison, Guo-Wei Wei

Cocaine addiction accounts for a large portion of substance use disorders and threatens millions of lives worldwide.

BIG-bench Machine Learning

Performance Evaluation of Deep Transfer Learning on Multiclass Identification of Common Weed Species in Cotton Production Systems

1 code implementation • 11 Oct 2021 • Dong Chen, Yuzhen Lu, Zhaojiang Li, Sierra Young

Precision weed management offers a promising solution for sustainable cropping systems through the use of chemical-reduced/non-chemical robotic weeding techniques, which apply suitable control tactics to individual weeds.

Benchmarking Management +1

Multi-agent Reinforcement Learning for Cooperative Lane Changing of Connected and Autonomous Vehicles in Mixed Traffic

no code implementations • 11 Nov 2021 • Wei Zhou, Dong Chen, Jun Yan, Zhaojian Li, Huilin Yin, Wanchen Ge

In this paper, we formulate the lane-changing decision making of multiple AVs in a mixed-traffic highway environment as a multi-agent reinforcement learning (MARL) problem, where each AV makes lane-changing decisions based on the motions of both neighboring AVs and HDVs.

Autonomous Driving Decision Making +3

PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers

1 code implementation • 24 Nov 2021 • Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

This paper explores a better prediction target for BERT pre-training of vision transformers.

Ranked #4 on Self-Supervised Image Classification on ImageNet (finetuned)

object-detection Object Detection +2

153

Vector Quantized Diffusion Model for Text-to-Image Synthesis

2 code implementations • CVPR 2022 • Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo

Our experiments indicate that the VQ-Diffusion model with the reparameterization is fifteen times faster than traditional AR methods while achieving a better image quality.

Ranked #1 on Text-to-Image Generation on Oxford 102 Flowers (using extra training data)

Denoising Text-to-Image Generation

832

General Facial Representation Learning in a Visual-Linguistic Manner

2 code implementations • CVPR 2022 • Yinglin Zheng, Hao Yang, Ting Zhang, Jianmin Bao, Dongdong Chen, Yangyu Huang, Lu Yuan, Dong Chen, Ming Zeng, Fang Wen

In this paper, we study the transfer performance of pre-trained models on face analysis tasks and introduce a framework, called FaRL, for general Facial Representation Learning in a visual-linguistic manner.

Ranked #1 on Face Parsing on CelebAMask-HQ (using extra training data)

Face Alignment Face Parsing +1

329

StyleSwin: Transformer-based GAN for High-resolution Image Generation

1 code implementation • CVPR 2022 • BoWen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo

To this end, we believe that local attention is crucial to strike the balance between computational efficiency and modeling capacity.

Ranked #1 on Image Generation on CelebA 256x256 (FID metric)

Blocking Computational Efficiency +3

472

Machine learning analysis of cocaine addiction informed by DAT, SERT, and NET-based interactome networks

1 code implementation • 1 Jan 2022 • Hongsong Feng, Kaifu Gao, Dong Chen, Alfred J Robison, Edmund Ellsworth, Guo-Wei Wei

Cocaine dependence is neurological and involves many interacting proteins in the interactome.

Drug Discovery

Protecting Celebrities from DeepFake with Identity Consistency Transformer

1 code implementation • CVPR 2022 • Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Ting Zhang, Weiming Zhang, Nenghai Yu, Dong Chen, Fang Wen, Baining Guo

In this work we propose Identity Consistency Transformer, a novel face forgery detection method that focuses on high-level semantics, specifically identity information, and detecting a suspect face by finding identity inconsistency in inner and outer face regions.

Face Swapping

Semi-Supervised Image-to-Image Translation using Latent Space Mapping

no code implementations • 29 Mar 2022 • Pan Zhang, Jianmin Bao, Ting Zhang, Dong Chen, Fang Wen

Thanks to the low dimensional feature space, it is easier to find the desired mapping function, resulting in improved quality of translation results as well as the stability of the translation model.

Image-to-Image Translation Translation

Large-Scale Pre-training for Person Re-identification with Noisy Labels

2 code implementations • CVPR 2022 • Dengpan Fu, Dongdong Chen, Hao Yang, Jianmin Bao, Lu Yuan, Lei Zhang, Houqiang Li, Fang Wen, Dong Chen

Since these ID labels automatically derived from tracklets inevitably contain noise, we develop a large-scale Pre-training framework utilizing Noisy Labels (PNL), which consists of three learning modules: supervised Re-ID learning, prototype-based contrastive learning, and label-guided contrastive learning.

Ranked #7 on Person Re-Identification on CUHK03

Contrastive Learning Multi-Object Tracking +3

214

Generative Adversarial Networks for Image Augmentation in Agriculture: A Systematic Review

1 code implementation • 10 Apr 2022 • Ebenezer Olaniyi, Dong Chen, Yuzhen Lu, Yanbo Huang

In agricultural image analysis, optimal model performance is keenly pursued for better fulfilling visual recognition tasks (e.g., image classification, segmentation, object detection and localization), in the presence of challenges with biological variability and unstructured environments.

Generative Adversarial Network Image Augmentation +4

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

1 code implementation • 25 Apr 2022 • Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen

We propose pose-guided multiplane image (MPI) synthesis which can render an animatable character in real scenes with photorealistic quality.

Image-to-Image Translation Neural Rendering +1

Pretraining is All You Need for Image-to-Image Translation

2 code implementations • 25 May 2022 • Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen

We propose to use pretraining to boost general image-to-image translation.

Ranked #1 on Sketch-to-Image Translation on COCO-Stuff

Image-to-Image Translation Sketch-to-Image Translation +2

470

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation

1 code implementation • 27 May 2022 • Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

These properties, which we aggregately refer to as optimization friendliness, are identified and analyzed by a set of attention- and optimization-related diagnosis tools.

Ranked #2 on Instance Segmentation on COCO test-dev (using extra training data)

Contrastive Learning Image Classification +5

217

Improved Vector Quantized Diffusion Models

1 code implementation • 31 May 2022 • Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

When trained on ImageNet, we dramatically improve the FID score from 11.89 to 4.83, demonstrating the superiority of our proposed techniques.

Denoising Image Generation

832

Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile

1 code implementation • 4 Jun 2022 • Dong Chen, Lingfei Wu, Siliang Tang, Xiao Yun, Bo Long, Yueting Zhuang

Moreover, when handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise on a corrupted dataset.

Few-Shot Learning

I^2R-Net: Intra- and Inter-Human Relation Network for Multi-Person Pose Estimation

1 code implementation • 22 Jun 2022 • Yiwei Ding, Wenjin Deng, Yinglin Zheng, PengFei Liu, Meihong Wang, Xuan Cheng, Jianmin Bao, Dong Chen, Ming Zeng

In this paper, we present the Intra- and Inter-Human Relation Networks (I^2R-Net) for Multi-Person Pose Estimation.

Ranked #2 on Multi-Person Pose Estimation on OCHuman

Multi-Person Pose Estimation Relation +1

Semantic Image Synthesis via Diffusion Models

3 code implementations • 30 Jun 2022 • Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li

Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks compared with Generative Adversarial Nets (GANs).

Denoising Image Generation

191

Bootstrapped Masked Autoencoders for Vision BERT Pretraining

1 code implementation • 14 Jul 2022 • Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

The first design is motivated by the observation that using a pretrained MAE to extract the features as the BERT prediction target for masked tokens can achieve better pretraining performance.

Ranked #19 on Self-Supervised Image Classification on ImageNet (finetuned)

Object Detection Self-Supervised Image Classification +1

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining

no code implementations • CVPR 2023 • Xiaoyi Dong, Jianmin Bao, Yinglin Zheng, Ting Zhang, Dongdong Chen, Hao Yang, Ming Zeng, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

Second, masked self-distillation is also consistent with vision-language contrastive from the perspective of training objective as both utilize the visual encoder for feature aligning, and thus is able to learn local semantics getting indirect supervision from the language.

Representation Learning

3DFaceShop: Explicitly Controllable 3D-Aware Portrait Generation

1 code implementation • 12 Sep 2022 • Junshu Tang, Bo Zhang, Binxin Yang, Ting Zhang, Dong Chen, Lizhuang Ma, Fang Wen

In contrast to the traditional avatar creation pipeline which is a costly process, contemporary generative approaches directly learn the data distribution from photographs.

3D Face Animation Disentanglement +3

205

DigiFace-1M: 1 Million Digital Face Images for Face Recognition

1 code implementation • 5 Oct 2022 • Gwangbin Bae, Martin de La Gorce, Tadas Baltrusaitis, Charlie Hewitt, Dong Chen, Julien Valentin, Roberto Cipolla, Jingjing Shen

Such models are trained on large-scale datasets that contain millions of real human face images collected from the internet.

Ranked #2 on Synthetic Face Recognition on CPLFW (Accuracy metric)

Attribute Synthetic Data Generation +1

262

Deep Data Augmentation for Weed Recognition Enhancement: A Diffusion Probabilistic Model and Transfer Learning Based Approach

1 code implementation • 18 Oct 2022 • Dong Chen, Xinda Qi, Yu Zheng, Yuzhen Lu, Zhaojian Li

In this paper, we present the first work of applying diffusion probabilistic models (also known as diffusion models) to generate high-quality synthetic weed images based on transfer learning.

Data Augmentation Management +1

Paper
Code

A Structure-Guided Diffusion Model for Large-Hole Image Completion

1 code implementation • 18 Nov 2022 • Daichi Horita, Jiaolong Yang, Dong Chen, Yuki Koyama, Kiyoharu Aizawa, Nicu Sebe

The structure generator generates an edge image representing plausible structures within the holes, which is then used for guiding the texture generation process.

Denoising Texture Synthesis

Paper
Code

SinDiffusion: Learning a Diffusion Model from a Single Natural Image

1 code implementation • 22 Nov 2022 • Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li

We present SinDiffusion, which leverages denoising diffusion models to capture the internal distribution of patches from a single natural image.

Ranked #1 on Image Generation on Places50

Denoising Image Generation +1

Paper
Code

Paint by Example: Exemplar-based Image Editing with Diffusion Models

2 code implementations • CVPR 2023 • Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen

Language-guided image editing has achieved great success recently.

Image Generation Image Manipulation

Paper
Code

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

1 code implementation • 7 Dec 2022 • Hanqing Zhao, Dianmo Sheng, Jianmin Bao, Dongdong Chen, Dong Chen, Fang Wen, Lu Yuan, Ce Liu, Wenbo Zhou, Qi Chu, Weiming Zhang, Nenghai Yu

We demonstrate for the first time that using a text2image model to generate images, or a zero-shot recognition model to filter noisily crawled images, for different object categories is a feasible way to make Copy-Paste truly scalable.

Ranked #7 on Instance Segmentation on LVIS v1.0 val

Data Augmentation Instance Segmentation +5

Paper
Code

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet

1 code implementation • 12 Dec 2022 • Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Shuyang Gu, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

Recent studies have shown that CLIP has achieved remarkable success in performing zero-shot inference while its fine-tuning performance is not satisfactory.

Paper
Code

Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

no code implementations • CVPR 2023 • Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo

This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars represented as neural radiance fields.

Computational Efficiency

Paper
Add Code

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

1 code implementation • CVPR 2023 • BoWen Zhang, Chenyang Qi, Pan Zhang, Bo Zhang, HsiangTao Wu, Dong Chen, Qifeng Chen, Yong Wang, Fang Wen

In this work, we propose an ID-preserving talking head generation framework, which advances previous methods in two aspects.

Face Swapping Meta-Learning +1

Paper
Code

FreeEnricher: Enriching Face Landmarks without Additional Cost

no code implementations • 19 Dec 2022 • Yangyu Huang, Xi Chen, Jongyoo Kim, Hao Yang, Chong Li, Jiaolong Yang, Dong Chen

To evaluate our method, we manually label the dense landmarks on the 300W test set.

Ranked #1 on Face Alignment on 300W

Face Alignment

Paper
Add Code

Improving CLIP Fine-tuning Performance

1 code implementation • ICCV 2023 • Yixuan Wei, Han Hu, Zhenda Xie, Ze Liu, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

Experiments suggest that the feature map distillation approach significantly boosts the fine-tuning performance of CLIP models on several typical downstream vision tasks.

Object Detection +1

Paper
Code

Hyneter: Hybrid Network Transformer for Object Detection

no code implementations • 18 Feb 2023 • Dong Chen, Duoqian Miao, Xuerong Zhao

In this paper, we point out that the essential difference between CNN-based and Transformer-based detectors, which causes the worse performance of Transformer-based methods on small objects, is the gap between local information and global dependencies in feature extraction and propagation.

Object object-detection +1

Paper
Add Code

O2RNet: Occluder-Occludee Relational Network for Robust Apple Detection in Clustered Orchard Environments

no code implementations • 8 Mar 2023 • Pengyu Chu, Zhaojian Li, Kaixiang Zhang, Dong Chen, Kyle Lammers, Renfu Lu

One key technology to fully enable efficient automated harvesting is accurate and robust apple detection, which is challenging due to complex orchard environments that involve varying lighting conditions and foliage/branch occlusions.

Paper
Add Code

Efficient Diffusion Training via Min-SNR Weighting Strategy

2 code implementations • ICCV 2023 • Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

Denoising diffusion models have become a mainstream approach for image generation; however, training these models often suffers from slow convergence.

Denoising Image Generation +2

Paper
Code

IRGen: Generative Modeling for Image Retrieval

1 code implementation • 17 Mar 2023 • Yidan Zhang, Ting Zhang, Dong Chen, Yujing Wang, Qi Chen, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Mao Yang, Qingmin Liao, Baining Guo

While generative modeling has been ubiquitous in natural language processing and computer vision, its application to image retrieval remains unexplored.

Image Retrieval Retrieval

Paper
Code

CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning

1 code implementation • CVPR 2023 • Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Wenqiang Zhang

Our framework, termed domain-aware sign language retrieval via Cross-lingual Contrastive learning, or CiCo for short, outperforms the pioneering method by large margins on various datasets, e.g., +22.4 T2V and +28.0 V2T R@1 improvements on the How2Sign dataset, and +13.7 T2V and +17.1 V2T R@1 improvements on the PHOENIX-2014T dataset.

Ranked #1 on Sign Language Retrieval on CSL-Daily

Contrastive Learning Retrieval +5

Paper
Code

Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior

2 code implementations • ICCV 2023 • Junshu Tang, Tengfei Wang, Bo Zhang, Ting Zhang, Ran Yi, Lizhuang Ma, Dong Chen

In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.

Text to 3D

Paper
Code

Label-Efficient Learning in Agriculture: A Comprehensive Review

1 code implementation • 24 May 2023 • Jiajia Li, Dong Chen, Xinda Qi, Zhaojian Li, Yanbo Huang, Daniel Morris, Xiaobo Tan

In addition, a systematic review of various agricultural applications exploiting these label-efficient algorithms, such as precision agriculture, plant phenotyping, and postharvest quality assessment, is presented.

Active Learning Plant Phenotyping +2

Paper
Code

Communication-Efficient Decentralized Multi-Agent Reinforcement Learning for Cooperative Adaptive Cruise Control

1 code implementation • 4 Aug 2023 • Dong Chen, Kaixiang Zhang, Yongqiang Wang, Xunyuan Yin, Zhaojian Li, Dimitar Filev

Connected and autonomous vehicles (CAVs) promise next-gen transportation systems with enhanced safety, energy efficiency, and sustainability.

Autonomous Vehicles Multi-agent Reinforcement Learning +1

Paper
Code

Large Language Models and Foundation Models in Smart Agriculture: Basics, Opportunities, and Challenges

1 code implementation • 13 Aug 2023 • Jiajia Li, Mingle Xu, Lirong Xiang, Dong Chen, Weichao Zhuang, Xunyuan Yin, Zhaojian Li

These models are trained on a large amount of data from multiple domains and modalities.

Paper
Code

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

1 code implementation • 7 Sep 2023 • Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo

We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.

Keypoint Detection

Paper
Code

Improving Vision Anomaly Detection with the Guidance of Language Modality

1 code implementation • 4 Oct 2023 • Dong Chen, Kaihang Pan, Guoming Wang, Yueting Zhuang, Siliang Tang

To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality, and then the latent space of vision modality will be learned with the guidance of the matrix.

Anomaly Detection Defect Detection +1

Paper
Code

SoybeanNet: Transformer-Based Convolutional Neural Network for Soybean Pod Counting from Unmanned Aerial Vehicle (UAV) Images

1 code implementation • 16 Oct 2023 • Jiajia Li, Raju Thada Magar, Dong Chen, Feng Lin, Dechun Wang, Xiang Yin, Weichao Zhuang, Zhaojian Li

Soybeans are a critical source of food, protein and oil, and thus have received extensive research aimed at enhancing their yield, refining cultivation practices, and advancing soybean breeding techniques.

Paper
Code

PersonMAE: Person Re-Identification Pre-Training with Masked AutoEncoders

no code implementations • 8 Nov 2023 • Hezhen Hu, Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Lu Yuan, Dong Chen, Houqiang Li

Pre-training is playing an increasingly important role in learning generic feature representation for Person Re-identification (ReID).

Person Re-Identification

Paper
Add Code

COLE: A Hierarchical Generation Framework for Multi-Layered and Editable Graphic Design

no code implementations • 28 Nov 2023 • Peidong Jia, Chenxuan Li, Yuhui Yuan, Zeyu Liu, Yichao Shen, Bohan Chen, Xingru Chen, Yinglin Zheng, Dong Chen, Ji Li, Xiaodong Xie, Shanghang Zhang, Baining Guo

Our COLE system comprises multiple fine-tuned Large Language Models (LLMs), Large Multimodal Models (LMMs), and Diffusion Models (DMs), each specifically tailored for design-aware layer-wise captioning, layout planning, reasoning, and the task of generating images and text.

Image Generation

Paper
Add Code

Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior

no code implementations • 15 Dec 2023 • Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang

In this paper, we present a novel two-stage approach that fully utilizes the information provided by the reference image to establish a customized knowledge prior for image-to-3D generation.

3D Generation Image to 3D +1

Paper
Add Code

VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder

no code implementations • 18 Dec 2023 • Zhicong Tang, Shuyang Gu, Chunyu Wang, Ting Zhang, Jianmin Bao, Dong Chen, Baining Guo

A diffusion model with a 3D U-Net is then trained on the 3D volumes for text-to-3D generation.

3D Generation Object +1

Paper
Add Code

Back-stepping Experience Replay with Application to Model-free Reinforcement Learning for a Soft Snake Robot

no code implementations • 21 Jan 2024 • Xinda Qi, Dong Chen, Zhaojian Li, Xiaobo Tan

In this paper, we propose a novel technique, Back-stepping Experience Replay (BER), that is compatible with arbitrary off-policy reinforcement learning (RL) algorithms.

Friction Reinforcement Learning (RL)

Paper
Add Code

CCA: Collaborative Competitive Agents for Image Editing

1 code implementation • 23 Jan 2024 • Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo

This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Model (LLM)-based agents to execute complex tasks.

Paper
Code

EPSD: Early Pruning with Self-Distillation for Efficient Model Compression

no code implementations • 31 Jan 2024 • Dong Chen, Ning Liu, Yichen Zhu, Zhengping Che, Rui Ma, Fachao Zhang, Xiaofeng Mou, Yi Chang, Jian Tang

Instead of a simple combination of pruning and SD, EPSD enables the pruned network to favor SD by keeping more distillable weights before training to ensure better distillation of the pruned network.

Knowledge Distillation Network Pruning +1

Paper
Add Code

Drug resistance revealed by in silico deep mutational scanning and mutation tracker

no code implementations • 5 Mar 2024 • Dong Chen, Gengzhuo Liu, Hongyan Du, JunJie Wee, Rui Wang, Jiahui Chen, Jana Shen, Guo-Wei Wei

As COVID-19 enters its fifth year, it continues to pose a significant global health threat, with the constantly mutating SARS-CoV-2 virus challenging drug effectiveness.

Drug Discovery

Paper
Add Code

Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection

1 code implementation • 6 Mar 2024 • Jiajia Li, Dong Chen, Xunyuan Yin, Zhaojian Li

In this study, we assess the effectiveness of a semi-supervised learning framework for multi-class weed detection, employing two well-known object detection frameworks, namely FCOS and Faster-RCNN.

Object Detection +1

Paper
Code

SiGNN: A Spike-induced Graph Neural Network for Dynamic Graph Representation Learning

no code implementations • 11 Mar 2024 • Dong Chen, Shuai Zheng, Muhao Xu, Zhenfeng Zhu, Yao Zhao

In the domain of dynamic graph representation learning (DGRL), the efficient and comprehensive capture of temporal evolution within real-world networks is crucial.

Graph Representation Learning Node Classification

Paper
Add Code

Simplified Diffusion Schrödinger Bridge

1 code implementation • 21 Mar 2024 • Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo

This paper introduces a novel theoretical simplification of the Diffusion Schrödinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling faster convergence and enhanced performance.

Paper
Code

CodeS: Natural Language to Code Repository via Multi-Layer Sketch

1 code implementation • 25 Mar 2024 • Daoguang Zan, Ailun Yu, Wei Liu, Dong Chen, Bo Shen, Wei Li, Yafen Yao, Yongshun Gong, Xiaolin Chen, Bei guan, Zhiguang Yang, Yongji Wang, Qianxiang Wang, Lizhen Cui

For feedback-based evaluation, we develop a VSCode plugin for CodeS and engage 30 participants in conducting empirical studies.

Benchmarking

Paper
Code

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

no code implementations • 28 Mar 2024 • BoWen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo

To address the problem, we introduce GaussianCube, a structured GS representation that is both powerful and efficient for generative modeling.

Paper
Add Code

WcDT: World-centric Diffusion Transformer for Traffic Scene Generation

1 code implementation • 2 Apr 2024 • Chen Yang, Aaron Xuxiang Tian, Dong Chen, Tianyu Shi, Arsalan Heydarian

To enhance the scene diversity and stochasticity, the historical trajectory data is first preprocessed and encoded into latent space using Denoising Diffusion Probabilistic Models (DDPM) enhanced with Diffusion with Transformer (DiT) blocks.

Autonomous Driving Denoising +1

Paper
Code
