Search Results for author: Dong Chen

Found 108 papers, 69 papers with code

Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification

no code implementations CVPR 2013 Dong Chen, Xudong Cao, Fang Wen, Jian Sun

Making a high-dimensional (e.g., 100K-dim) feature for face recognition seems not a good idea because it will bring difficulties on consequent training, computation, and storage.

Age-Invariant Face Recognition Face Verification +1

Neural Aggregation Network for Video Face Recognition

no code implementations CVPR 2017 Jiaolong Yang, Peiran Ren, Dong-Qing Zhang, Dong Chen, Fang Wen, Hongdong Li, Gang Hua

The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition.

Face Recognition Face Verification

Supervised Transformer Network for Efficient Face Detection

no code implementations 19 Jul 2016 Dong Chen, Gang Hua, Fang Wen, Jian Sun

For real-time performance, we run the cascaded network only on regions of interests produced from a boosting cascade face detector.

Face Detection Region Proposal +1

Towards Open-Set Identity Preserving Face Synthesis

no code implementations CVPR 2018 Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua

We then recombine the identity vector and the attribute vector to synthesize a new face of the subject with the extracted attribute.

Attribute Face Generation

A novel active learning framework for classification: using weighted rank aggregation to achieve multiple query criteria

no code implementations 27 Sep 2018 Yu Zhao, Zhenhui Shi, Jingyang Zhang, Dong Chen, Lixu Gu

The proposed method serves as a heuristic means to select high-value samples of high scalability and generality and is implemented through a three-step process: (1) the transformation of the sample selection to sample ranking and scoring, (2) the computation of the self-adaptive weights of each criterion, and (3) the weighted aggregation of each sample rank list.

Active Learning General Classification

Exploring Hypergraph Representation on Face Anti-spoofing Beyond 2D Attacks

no code implementations 28 Nov 2018 Wei Hu, Gusi Te, Ju He, Dong Chen, Zongming Guo

Face anti-spoofing plays a crucial role in protecting face recognition systems from various attacks.

Face Anti-Spoofing Face Recognition

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set

4 code implementations 20 Mar 2019 Yu Deng, Jiaolong Yang, Sicheng Xu, Dong Chen, Yunde Jia, Xin Tong

Recently, deep learning based 3D face reconstruction methods have shown promising results in both quality and efficiency. However, training deep neural networks typically requires a large volume of data, whereas face images with ground-truth 3D face shapes are scarce.

Ranked #3 on 3D Face Reconstruction on Florence (RMSE Cooperative metric)

3D Face Reconstruction Weakly-supervised Learning

Face Parsing with RoI Tanh-Warping

2 code implementations CVPR 2019 Jinpeng Lin, Hao Yang, Dong Chen, Ming Zeng, Fang Wen, Lu Yuan

It uses a hierarchical local-based method for inner facial components and global methods for outer facial components.

Face Parsing

Face X-ray for More General Face Forgery Detection

4 code implementations CVPR 2020 Lingzhi Li, Jianmin Bao, Ting Zhang, Hao Yang, Dong Chen, Fang Wen, Baining Guo

For this reason, face X-ray provides an effective way for detecting forgery generated by most existing face manipulation algorithms.

DeepFake Detection Face Swapping

FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping

10 code implementations 31 Dec 2019 Lingzhi Li, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen

We propose a novel attributes encoder for extracting multi-level target face attributes, and a new generator with carefully designed Adaptive Attentional Denormalization (AAD) layers to adaptively integrate the identity and the attributes for face synthesis.

Face Generation Face Swapping +1

Table-Top Scene Analysis Using Knowledge-Supervised MCMC

no code implementations 19 Feb 2020 Ziyuan Liu, Dong Chen, Kai M. Wurm, Georg von Wichert

Our approach to generate scene graphs is probabilistic: Uncertainty in the object poses is addressed by a probabilistic sensor model that is embedded in a data driven MCMC process.

Descriptive Object

Online Semantic Exploration of Indoor Maps

no code implementations 21 Feb 2020 Ziyuan Liu, Dong Chen, Georg von Wichert

In this paper we propose a method to extract an abstracted floor plan from typical grid maps using Bayesian reasoning.

GIQA: Generated Image Quality Assessment

1 code implementation ECCV 2020 Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

Generative adversarial networks (GANs) have achieved impressive results today, but not all generated images are perfect.

Image Quality Assessment

Curved Buildings Reconstruction from Airborne LiDAR Data by Matching and Deforming Geometric Primitives

no code implementations 22 Mar 2020 Jingwei Song, Shaobo Xia, Jun Wang, Dong Chen

To this end, we propose a new framework for curved building reconstruction via assembling and deforming geometric primitives.

Bringing Old Photos Back to Life

7 code implementations CVPR 2020 Zi-Yu Wan, Bo Zhang, Dong-Dong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen

Unlike conventional restoration tasks that can be solved through supervised learning, the degradation in real photos is complex and the domain gap between synthetic images and real old photos makes the network fail to generalize.

Image Restoration Translation

Uncertainty Quantification for Hyperspectral Image Denoising Frameworks based on Low-rank Matrix Approximation

1 code implementation 23 Apr 2020 Jingwei Song, Shaobo Xia, Jun Wang, Mitesh Patel, Dong Chen

Sliding-window based low-rank matrix approximation (LRMA) is a technique widely used in hyperspectral images (HSIs) denoising or completion.

Hyperspectral Image Denoising Image Denoising +1

Deep 3D Portrait from a Single Image

1 code implementation CVPR 2020 Sicheng Xu, Jiaolong Yang, Dong Chen, Fang Wen, Yu Deng, Yunde Jia, Xin Tong

We evaluate the accuracy of our method both in 3D and with pose manipulation tasks on 2D images.

Face Model Stereo Matching

Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition

1 code implementation ECCV 2020 Di Hu, Xuhong LI, Lichao Mou, Pu Jin, Dong Chen, Liping Jing, Xiaoxiang Zhu, Dejing Dou

With the help of this dataset, we evaluate three proposed approaches for transferring the sound event knowledge to the aerial scene recognition task in a multimodal learning framework, and show the benefit of exploiting the audio information for the aerial scene recognition.

Scene Recognition

PriorGAN: Real Data Prior for Generative Adversarial Nets

1 code implementation 30 Jun 2020 Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

To address these two issues, we propose a novel prior that captures the whole real data distribution for GANs, which are called PriorGANs.

Unified Representation Learning for Cross Model Compatibility

no code implementations 11 Aug 2020 Chien-Yi Wang, Ya-Liang Chang, Shang-Ta Yang, Dong Chen, Shang-Hong Lai

We propose a unified representation learning framework to address the Cross Model Compatibility (CMC) problem in the context of visual search applications.

Face Identification Face Recognition +2

Old Photo Restoration via Deep Latent Space Translation

8 code implementations 14 Sep 2020 Zi-Yu Wan, Bo Zhang, Dong-Dong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen

Unlike conventional restoration tasks that can be solved through supervised learning, the degradation in real photos is complex and the domain gap between synthetic images and real old photos makes the network fail to generalize.

Image Restoration Translation

GreedyFool: Distortion-Aware Sparse Adversarial Attack

1 code implementation NeurIPS 2020 Xiaoyi Dong, Dongdong Chen, Jianmin Bao, Chuan Qin, Lu Yuan, Weiming Zhang, Nenghai Yu, Dong Chen

Sparse adversarial samples are a special branch of adversarial samples that can fool the target model by only perturbing a few pixels.

Adversarial Attack

Learnable Sampling 3D Convolution for Video Enhancement and Action Recognition

no code implementations 22 Nov 2020 Shuyang Gu, Jianmin Bao, Dong Chen

A key challenge in video enhancement and action recognition is to fuse useful information from neighboring frames.

Action Recognition Denoising +3

PowerNet: Multi-agent Deep Reinforcement Learning for Scalable Powergrid Control

no code implementations 24 Nov 2020 Dong Chen, Kaian Chen, Zhaojian Li, Tianshu Chu, Rui Yao, Feng Qiu, Kaixiang Lin

Specifically, we consider the decentralized inverter-based secondary voltage control problem in distributed generators (DGs), which is first formulated as a cooperative multi-agent reinforcement learning (MARL) problem.

Multi-agent Reinforcement Learning reinforcement-learning +1

Identity-Driven DeepFake Detection

no code implementations 7 Dec 2020 Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Dong Chen, Fang Wen, Baining Guo

Our approach takes as input the suspect image/video as well as the target identity information (a reference image or video).

DeepFake Detection Face Swapping

Evidence of topological nodal lines and surface states in the centrosymmetric superconductor SnTaS2

no code implementations 7 Dec 2020 Wenqing Chen, Lulu Liu, Wentao Yang, Dong Chen, Zhengtai Liu, Yaobo Huang, Tong Zhang, Haijun Zhang, Zhonghao Liu, D. W. Shen

Utilizing angle-resolved photoemission spectroscopy and first-principles calculations, here, we demonstrate the existence of topological nodal-line states and drumheadlike surface states in centrosymmetric superconductor SnTaS2, which is a type-II superconductor with a critical transition temperature of about 3 K. The valence bands from Ta 5d orbitals and the conduction bands from Sn 5p orbitals cross each other, forming two nodal lines in the vicinity of the Fermi energy without the inclusion of spin-orbit coupling (SOC), protected by the spatial-inversion symmetry and time-reversal symmetry.

Superconductivity

Unsupervised Pre-training for Person Re-identification

1 code implementation CVPR 2021 Dengpan Fu, Dongdong Chen, Jianmin Bao, Hao Yang, Lu Yuan, Lei Zhang, Houqiang Li, Dong Chen

In this paper, we present a large scale unlabeled person re-identification (Re-ID) dataset "LUPerson" and make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation.

Ranked #1 on Person Re-Identification on Market-1501 (using extra training data)

Data Augmentation Person Re-Identification +1

Robust Meta-learning with Noise via Eigen-Reptile

no code implementations 1 Jan 2021 Dong Chen, Lingfei Wu, Siliang Tang, Fangli Xu, Juncheng Li, Chang Zong, Chilie Tan, Yueting Zhuang

In particular, we first cast the meta-overfitting problem (overfitting on sampling and label noise) as a gradient noise problem since few available samples cause meta-learner to overfit on existing examples (clean or corrupted) of an individual task at every gradient step.

Few-Shot Learning

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

1 code implementation CVPR 2021 Chulin Xie, Chuxin Wang, Bo Zhang, Hao Yang, Dong Chen, Fang Wen

In this paper, we proposed a novel Style-based Point Generator with Adversarial Rendering (SpareNet) for point cloud completion.

Ranked #1 on Point Cloud Completion on ShapeNet (Earth Mover's Distance metric)

Point Cloud Completion

Control Distance IoU and Control Distance IoU Loss Function for Better Bounding Box Regression

1 code implementation 22 Mar 2021 Dong Chen, Duoqian Miao

In this paper, we first present an evaluation-feedback module, which is proposed to consist of evaluation system and feedback mechanism.

object-detection Object Detection +1

High-Fidelity and Arbitrary Face Editing

no code implementations CVPR 2021 Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian

However, we observe that the generator tends to find a tricky way to hide information from the original image to satisfy the constraint of cycle consistency, making it impossible to maintain the rich details (e.g., wrinkles and moles) of non-editing areas.

Attribute Vocal Bursts Intensity Prediction

Deep Multi-agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic

3 code implementations 12 May 2021 Dong Chen, Mohammad Hajidavalloo, Zhaojian Li, Kaian Chen, Yongqiang Wang, Longsheng Jiang, Yue Wang

On-ramp merging is a challenging task for autonomous vehicles (AVs), especially in mixed traffic where AVs coexist with human-driven vehicles (HDVs).

Autonomous Vehicles reinforcement-learning +1

Robust Mutual Learning for Semi-supervised Semantic Segmentation

no code implementations 1 Jun 2021 Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Fang Wen

The proposed robust mutual learning demonstrates state-of-the-art performance on semantic segmentation in low-data regime.

Pseudo Label Semi-Supervised Semantic Segmentation

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

6 code implementations CVPR 2022 Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo

By further pretraining on the larger dataset ImageNet-21K, we achieve 87.5% Top-1 accuracy on ImageNet-1K and high segmentation performance on ADE20K with 55.7 mIoU.

Image Classification Semantic Segmentation

Dual Path Learning for Domain Adaptation of Semantic Segmentation

1 code implementation ICCV 2021 Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Fang Wen, Wenqiang Zhang

In this paper, based on the observation that domain adaptation frameworks performed in the source and target domain are almost complementary in terms of image translation and SSL, we propose a novel dual path learning (DPL) framework to alleviate visual inconsistency.

Domain Adaptation Segmentation +4

Proteome-informed machine learning studies of cocaine addiction

1 code implementation 17 Sep 2021 Kaifu Gao, Dong Chen, Alfred J Robison, Guo-Wei Wei

Cocaine addiction accounts for a large portion of substance use disorders and threatens millions of lives worldwide.

BIG-bench Machine Learning

Performance Evaluation of Deep Transfer Learning on Multiclass Identification of Common Weed Species in Cotton Production Systems

1 code implementation 11 Oct 2021 Dong Chen, Yuzhen Lu, Zhaojiang Li, Sierra Young

Precision weed management offers a promising solution for sustainable cropping systems through the use of chemical-reduced/non-chemical robotic weeding techniques, which apply suitable control tactics to individual weeds.

Benchmarking Management +1

Multi-agent Reinforcement Learning for Cooperative Lane Changing of Connected and Autonomous Vehicles in Mixed Traffic

no code implementations 11 Nov 2021 Wei Zhou, Dong Chen, Jun Yan, Zhaojian Li, Huilin Yin, Wanchen Ge

In this paper, we formulate the lane-changing decision making of multiple AVs in a mixed-traffic highway environment as a multi-agent reinforcement learning (MARL) problem, where each AV makes lane-changing decisions based on the motions of both neighboring AVs and HDVs.

Autonomous Driving Decision Making +3

Vector Quantized Diffusion Model for Text-to-Image Synthesis

2 code implementations CVPR 2022 Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo

Our experiments indicate that the VQ-Diffusion model with the reparameterization is fifteen times faster than traditional AR methods while achieving a better image quality.

Ranked #1 on Text-to-Image Generation on Oxford 102 Flowers (using extra training data)

Denoising Text-to-Image Generation

General Facial Representation Learning in a Visual-Linguistic Manner

2 code implementations CVPR 2022 Yinglin Zheng, Hao Yang, Ting Zhang, Jianmin Bao, Dongdong Chen, Yangyu Huang, Lu Yuan, Dong Chen, Ming Zeng, Fang Wen

In this paper, we study the transfer performance of pre-trained models on face analysis tasks and introduce a framework, called FaRL, for general Facial Representation Learning in a visual-linguistic manner.

Ranked #1 on Face Parsing on CelebAMask-HQ (using extra training data)

Face Alignment Face Parsing +1

StyleSwin: Transformer-based GAN for High-resolution Image Generation

1 code implementation CVPR 2022 BoWen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo

To this end, we believe that local attention is crucial to strike the balance between computational efficiency and modeling capacity.

Ranked #1 on Image Generation on CelebA 256x256 (FID metric)

Blocking Computational Efficiency +3

Protecting Celebrities from DeepFake with Identity Consistency Transformer

1 code implementation CVPR 2022 Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Ting Zhang, Weiming Zhang, Nenghai Yu, Dong Chen, Fang Wen, Baining Guo

In this work we propose Identity Consistency Transformer, a novel face forgery detection method that focuses on high-level semantics, specifically identity information, and detecting a suspect face by finding identity inconsistency in inner and outer face regions.

Face Swapping

Semi-Supervised Image-to-Image Translation using Latent Space Mapping

no code implementations 29 Mar 2022 Pan Zhang, Jianmin Bao, Ting Zhang, Dong Chen, Fang Wen

Thanks to the low dimensional feature space, it is easier to find the desired mapping function, resulting in improved quality of translation results as well as the stability of the translation model.

Image-to-Image Translation Translation

Large-Scale Pre-training for Person Re-identification with Noisy Labels

2 code implementations CVPR 2022 Dengpan Fu, Dongdong Chen, Hao Yang, Jianmin Bao, Lu Yuan, Lei Zhang, Houqiang Li, Fang Wen, Dong Chen

Since these ID labels automatically derived from tracklets inevitably contain noises, we develop a large-scale Pre-training framework utilizing Noisy Labels (PNL), which consists of three learning modules: supervised Re-ID learning, prototype-based contrastive learning, and label-guided contrastive learning.

Contrastive Learning Multi-Object Tracking +3

Generative Adversarial Networks for Image Augmentation in Agriculture: A Systematic Review

1 code implementation 10 Apr 2022 Ebenezer Olaniyi, Dong Chen, Yuzhen Lu, Yanbo Huang

In agricultural image analysis, optimal model performance is keenly pursued for better fulfilling visual recognition tasks (e.g., image classification, segmentation, object detection and localization), in the presence of challenges with biological variability and unstructured environments.

Generative Adversarial Network Image Augmentation +4

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

1 code implementation 25 Apr 2022 Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen

We propose pose-guided multiplane image (MPI) synthesis which can render an animatable character in real scenes with photorealistic quality.

Image-to-Image Translation Neural Rendering +1

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation

1 code implementation 27 May 2022 Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

These properties, which we aggregately refer to as optimization friendliness, are identified and analyzed by a set of attention- and optimization-related diagnosis tools.

Ranked #2 on Instance Segmentation on COCO test-dev (using extra training data)

Contrastive Learning Image Classification +5

Improved Vector Quantized Diffusion Models

1 code implementation 31 May 2022 Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

When trained on ImageNet, we dramatically improve the FID score from 11.89 to 4.83, demonstrating the superiority of our proposed techniques.

Denoising Image Generation

Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile

1 code implementation 4 Jun 2022 Dong Chen, Lingfei Wu, Siliang Tang, Xiao Yun, Bo Long, Yueting Zhuang

Moreover, when handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise on a corrupted dataset.

Few-Shot Learning

Semantic Image Synthesis via Diffusion Models

3 code implementations 30 Jun 2022 Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li

Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks compared with Generative Adversarial Nets (GANs).

Denoising Image Generation

Bootstrapped Masked Autoencoders for Vision BERT Pretraining

1 code implementation 14 Jul 2022 Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

The first design is motivated by the observation that using a pretrained MAE to extract the features as the BERT prediction target for masked tokens can achieve better pretraining performance.

Object Detection Self-Supervised Image Classification +1

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining

no code implementations CVPR 2023 Xiaoyi Dong, Jianmin Bao, Yinglin Zheng, Ting Zhang, Dongdong Chen, Hao Yang, Ming Zeng, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

Second, masked self-distillation is also consistent with vision-language contrastive from the perspective of training objective as both utilize the visual encoder for feature aligning, and thus is able to learn local semantics getting indirect supervision from the language.

Representation Learning

3DFaceShop: Explicitly Controllable 3D-Aware Portrait Generation

1 code implementation 12 Sep 2022 Junshu Tang, Bo Zhang, Binxin Yang, Ting Zhang, Dong Chen, Lizhuang Ma, Fang Wen

In contrast to the traditional avatar creation pipeline which is a costly process, contemporary generative approaches directly learn the data distribution from photographs.

3D Face Animation Disentanglement +3

Deep Data Augmentation for Weed Recognition Enhancement: A Diffusion Probabilistic Model and Transfer Learning Based Approach

1 code implementation 18 Oct 2022 Dong Chen, Xinda Qi, Yu Zheng, Yuzhen Lu, Zhaojian Li

In this paper, we present the first work of applying diffusion probabilistic models (also known as diffusion models) to generate high-quality synthetic weed images based on transfer learning.

Data Augmentation Management +1

A Structure-Guided Diffusion Model for Large-Hole Image Completion

1 code implementation 18 Nov 2022 Daichi Horita, Jiaolong Yang, Dong Chen, Yuki Koyama, Kiyoharu Aizawa, Nicu Sebe

The structure generator generates an edge image representing plausible structures within the holes, which is then used for guiding the texture generation process.

Denoising Texture Synthesis

SinDiffusion: Learning a Diffusion Model from a Single Natural Image

1 code implementation 22 Nov 2022 Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li

We present SinDiffusion, leveraging denoising diffusion models to capture internal distribution of patches from a single natural image.

Denoising Image Generation +1

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

1 code implementation 7 Dec 2022 Hanqing Zhao, Dianmo Sheng, Jianmin Bao, Dongdong Chen, Dong Chen, Fang Wen, Lu Yuan, Ce Liu, Wenbo Zhou, Qi Chu, Weiming Zhang, Nenghai Yu

We demonstrate for the first time that using a text2image model to generate images or zero-shot recognition model to filter noisily crawled images for different object categories is a feasible way to make Copy-Paste truly scalable.

Data Augmentation Instance Segmentation +5

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet

1 code implementation 12 Dec 2022 Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Shuyang Gu, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

Recent studies have shown that CLIP has achieved remarkable success in performing zero-shot inference while its fine-tuning performance is not satisfactory.

Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

no code implementations CVPR 2023 Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo

This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars represented as neural radiance fields.

Computational Efficiency

Improving CLIP Fine-tuning Performance

1 code implementation ICCV 2023 Yixuan Wei, Han Hu, Zhenda Xie, Ze Liu, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

Experiments suggest that the feature map distillation approach significantly boosts the fine-tuning performance of CLIP models on several typical downstream vision tasks.

object-detection Object Detection +1

Hyneter: Hybrid Network Transformer for Object Detection

no code implementations 18 Feb 2023 Dong Chen, Duoqian Miao, Xuerong Zhao

In this paper, we point out that the essential differences between CNN-based and Transformer-based detectors, which cause the worse performance of small objects in Transformer-based methods, are the gap between local information and global dependencies in feature extraction and propagation.

Object object-detection +1

O2RNet: Occluder-Occludee Relational Network for Robust Apple Detection in Clustered Orchard Environments

no code implementations 8 Mar 2023 Pengyu Chu, Zhaojian Li, Kaixiang Zhang, Dong Chen, Kyle Lammers, Renfu Lu

One key technology to fully enable efficient automated harvesting is accurate and robust apple detection, which is challenging due to complex orchard environments that involve varying lighting conditions and foliage/branch occlusions.

Efficient Diffusion Training via Min-SNR Weighting Strategy

2 code implementations ICCV 2023 Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

Denoising diffusion models have been a mainstream approach for image generation; however, training these models often suffers from slow convergence.

Denoising Image Generation +2

IRGen: Generative Modeling for Image Retrieval

1 code implementation 17 Mar 2023 Yidan Zhang, Ting Zhang, Dong Chen, Yujing Wang, Qi Chen, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Mao Yang, Qingmin Liao, Baining Guo

While generative modeling has been ubiquitous in natural language processing and computer vision, its application to image retrieval remains unexplored.

Image Retrieval Retrieval

CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning

1 code implementation CVPR 2023 Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Wenqiang Zhang

Our framework, termed as domain-aware sign language retrieval via Cross-lingual Contrastive learning or CiCo for short, outperforms the pioneering method by large margins on various datasets, e.g., +22.4 T2V and +28.0 V2T R@1 improvements on How2Sign dataset, and +13.7 T2V and +17.1 V2T R@1 improvements on PHOENIX-2014T dataset.

Contrastive Learning Retrieval +5

Label-Efficient Learning in Agriculture: A Comprehensive Review

1 code implementation 24 May 2023 Jiajia Li, Dong Chen, Xinda Qi, Zhaojian Li, Yanbo Huang, Daniel Morris, Xiaobo Tan

In addition, a systematic review of various agricultural applications exploiting these label-efficient algorithms, such as precision agriculture, plant phenotyping, and postharvest quality assessment, is presented.

Active Learning Plant Phenotyping +2

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

1 code implementation 7 Sep 2023 Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo

We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.

Keypoint Detection

Improving Vision Anomaly Detection with the Guidance of Language Modality

1 code implementation 4 Oct 2023 Dong Chen, Kaihang Pan, Guoming Wang, Yueting Zhuang, Siliang Tang

To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality, and then the latent space of vision modality will be learned with the guidance of the matrix.

Anomaly Detection Defect Detection +1

SoybeanNet: Transformer-Based Convolutional Neural Network for Soybean Pod Counting from Unmanned Aerial Vehicle (UAV) Images

1 code implementation 16 Oct 2023 Jiajia Li, Raju Thada Magar, Dong Chen, Feng Lin, Dechun Wang, Xiang Yin, Weichao Zhuang, Zhaojian Li

Soybeans are a critical source of food, protein and oil, and thus have received extensive research aimed at enhancing their yield, refining cultivation practices, and advancing soybean breeding techniques.

PersonMAE: Person Re-Identification Pre-Training with Masked AutoEncoders

no code implementations 8 Nov 2023 Hezhen Hu, Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Lu Yuan, Dong Chen, Houqiang Li

Pre-training is playing an increasingly important role in learning generic feature representation for Person Re-identification (ReID).

Person Re-Identification

COLE: A Hierarchical Generation Framework for Multi-Layered and Editable Graphic Design

no code implementations 28 Nov 2023 Peidong Jia, Chenxuan Li, Yuhui Yuan, Zeyu Liu, Yichao Shen, Bohan Chen, Xingru Chen, Yinglin Zheng, Dong Chen, Ji Li, Xiaodong Xie, Shanghang Zhang, Baining Guo

Our COLE system comprises multiple fine-tuned Large Language Models (LLMs), Large Multimodal Models (LMMs), and Diffusion Models (DMs), each specifically tailored for design-aware layer-wise captioning, layout planning, reasoning, and the task of generating images and text.

Image Generation

Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior

no code implementations 15 Dec 2023 Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang

In this paper, we present a novel two-stage approach that fully utilizes the information provided by the reference image to establish a customized knowledge prior for image-to-3D generation.

3D Generation Image to 3D +1

Back-stepping Experience Replay with Application to Model-free Reinforcement Learning for a Soft Snake Robot

no code implementations 21 Jan 2024 Xinda Qi, Dong Chen, Zhaojian Li, Xiaobo Tan

In this paper, we propose a novel technique, Back-stepping Experience Replay (BER), that is compatible with arbitrary off-policy reinforcement learning (RL) algorithms.

Friction Reinforcement Learning (RL)

CCA: Collaborative Competitive Agents for Image Editing

1 code implementation 23 Jan 2024 Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo

This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Models (LLMs) based agents to execute complex tasks.

EPSD: Early Pruning with Self-Distillation for Efficient Model Compression

no code implementations 31 Jan 2024 Dong Chen, Ning Liu, Yichen Zhu, Zhengping Che, Rui Ma, Fachao Zhang, Xiaofeng Mou, Yi Chang, Jian Tang

Instead of a simple combination of pruning and SD, EPSD enables the pruned network to favor SD by keeping more distillable weights before training to ensure better distillation of the pruned network.

Knowledge Distillation Network Pruning +1

Drug resistance revealed by in silico deep mutational scanning and mutation tracker

no code implementations 5 Mar 2024 Dong Chen, Gengzhuo Liu, Hongyan Du, JunJie Wee, Rui Wang, Jiahui Chen, Jana Shen, Guo-Wei Wei

As COVID-19 enters its fifth year, it continues to pose a significant global health threat, with the constantly mutating SARS-CoV-2 virus challenging drug effectiveness.

Drug Discovery

Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection

1 code implementation 6 Mar 2024 Jiajia Li, Dong Chen, Xunyuan Yin, Zhaojian Li

In this study, we assess the effectiveness of a semi-supervised learning framework for multi-class weed detection, employing two well-known object detection frameworks, namely FCOS and Faster-RCNN.

object-detection Object Detection +1

SiGNN: A Spike-induced Graph Neural Network for Dynamic Graph Representation Learning

no code implementations 11 Mar 2024 Dong Chen, Shuai Zheng, Muhao Xu, Zhenfeng Zhu, Yao Zhao

In the domain of dynamic graph representation learning (DGRL), the efficient and comprehensive capture of temporal evolution within real-world networks is crucial.

Graph Representation Learning Node Classification

Simplified Diffusion Schrödinger Bridge

1 code implementation 21 Mar 2024 Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo

This paper introduces a novel theoretical simplification of the Diffusion Schrödinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling faster convergence and enhanced performance.

CodeS: Natural Language to Code Repository via Multi-Layer Sketch

1 code implementation 25 Mar 2024 Daoguang Zan, Ailun Yu, Wei Liu, Dong Chen, Bo Shen, Wei Li, Yafen Yao, Yongshun Gong, Xiaolin Chen, Bei guan, Zhiguang Yang, Yongji Wang, Qianxiang Wang, Lizhen Cui

For feedback-based evaluation, we develop a VSCode plugin for CodeS and engage 30 participants in conducting empirical studies.

Benchmarking

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

no code implementations 28 Mar 2024 BoWen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo

To address the problem, we introduce GaussianCube, a structured GS representation that is both powerful and efficient for generative modeling.

WcDT: World-centric Diffusion Transformer for Traffic Scene Generation

1 code implementation 2 Apr 2024 Chen Yang, Aaron Xuxiang Tian, Dong Chen, Tianyu Shi, Arsalan Heydarian

To enhance the scene diversity and stochasticity, the historical trajectory data is first preprocessed and encoded into latent space using Denoising Diffusion Probabilistic Models (DDPM) enhanced with Diffusion with Transformer (DiT) blocks.

Autonomous Driving Denoising +1
