Search Results for author: Cheng Lu

Found 40 papers, 22 papers with code

Energy Model-based Accurate Shapley Value Estimation for Interpretable Deep Learning Predictive Modelling

no code implementations • 1 Apr 2024 • Cheng Lu, Jiusun Zeng, Yu Xia, Jinhui Cai, Shihua Luo

As a favorable tool for explainable artificial intelligence (XAI), Shapley value has been widely used to interpret deep learning based predictive models.

Explainable Artificial Intelligence (XAI)
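
As background for the entry above, here is a minimal sketch of the quantity being estimated: exact Shapley values computed by brute-force coalition enumeration. This is only the textbook definition, not the energy-model estimator the paper proposes. The names `shapley_values` and `toy_value` and the three-feature toy game are hypothetical, chosen for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical payoff: additive effects of a and b, plus a b-c interaction."""
    v = 0.0
    if "a" in coalition:
        v += 1.0
    if "b" in coalition:
        v += 2.0
    if "b" in coalition and "c" in coalition:
        v += 4.0
    return v


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

In this toy game the interaction term is split equally between `b` and `c`, and the values sum to the payoff of the full coalition. Enumeration grows exponentially with the number of features, which is why practical tools rely on approximations.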

PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion

no code implementations • 3 Mar 2024 • Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian

In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion (EVC), aiming to achieve two major objectives of EVC: high content naturalness and high emotional naturalness, which are crucial for meeting the demands of human perception.

Voice Conversion

Towards Efficient and Exact Optimization of Language Model Alignment

1 code implementation • 1 Feb 2024 • Haozhe Ji, Cheng Lu, Yilin Niu, Pei Ke, Hongning Wang, Jun Zhu, Jie Tang, Minlie Huang

We prove that EXO is guaranteed to optimize in the same direction as the RL algorithms asymptotically for arbitrary parametrization of the policy, while enabling efficient optimization by circumventing the complexities associated with RL algorithms.

Language Modelling • Reinforcement Learning (RL)

Anything in Any Scene: Photorealistic Video Object Insertion

no code implementations • 30 Jan 2024 • Chen Bai, Zeman Shao, Guoxiang Zhang, Di Liang, Jie Yang, Zhuorui Zhang, Yujian Guo, Chengzhang Zhong, Yiqiao Qiu, Zhendong Wang, Yichen Guan, Xiaoyin Zheng, Tao Wang, Cheng Lu

Our proposed general framework encompasses three key processes: 1) integrating a realistic object into a given scene video with proper placement to ensure geometric realism; 2) estimating the sky and environmental lighting distribution and simulating realistic shadows to enhance the light realism; 3) employing a style transfer network that refines the final video output to maximize photorealism.

Data Augmentation • Object • +2

Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition

no code implementations • 19 Jan 2024 • Yong Wang, Cheng Lu, Hailun Lian, Yan Zhao, Björn Schuller, Yuan Zong, Wenming Zheng

These segment-level patches are then encoded using a stack of Swin blocks, in which a local window Transformer is utilized to explore local inter-frame emotional information across frame patches of each segment patch.

Speech Emotion Recognition

Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation

no code implementations • 18 Jan 2024 • Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn Schuller, Wenming Zheng

In speaker-independent speech emotion recognition, the training and testing samples are collected from diverse speakers, leading to a multi-domain shift challenge across the feature distributions of data from different speakers.

Domain Adaptation • Speech Emotion Recognition

The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing

no code implementations • 2 Nov 2023 • Shen Nie, Hanzhong Allan Guo, Cheng Lu, Yuhao Zhou, Chenyu Zheng, Chongxuan Li

We present a unified probabilistic formulation for diffusion-based image editing, where a latent variable is edited in a task-specific manner and generally deviates from the corresponding marginal distribution induced by the original stochastic or ordinary differential equation (SDE or ODE).

Image-to-Image Translation

DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics

1 code implementation • NeurIPS 2023 • Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu

In this work, we propose a novel formulation towards the optimal parameterization during sampling that minimizes the first-order discretization error of the ODE solution.

Image Generation

Score Regularized Policy Optimization through Diffusion Behavior

1 code implementation • 11 Oct 2023 • Huayu Chen, Cheng Lu, Zhengyi Wang, Hang Su, Jun Zhu

Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling, which excels at representing heterogeneous behavior policies.

D4RL

Learning to Rank Onset-Occurring-Offset Representations for Micro-Expression Recognition

no code implementations • 7 Oct 2023 • Jie Zhu, Yuan Zong, Jingang Shi, Cheng Lu, Hongli Chang, Wenming Zheng

This paper focuses on the research of micro-expression recognition (MER) and proposes a flexible and reliable deep learning method called learning to rank onset-occurring-offset representations (LTR3O).

Learning-To-Rank • Micro Expression Recognition • +1

ChatGPT Informed Graph Neural Network for Stock Movement Prediction

1 code implementation • 28 May 2023 • Zihan Chen, Lei Nico Zheng, Cheng Lu, Jialu Yuan, Di Zhu

However, its potential for inferring dynamic network structures from temporal textual data, specifically financial news, remains an unexplored frontier.

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

2 code implementations • NeurIPS 2023 • Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu

In comparison, VSD works well with various CFG weights as ancestral sampling from diffusion models and simultaneously improves the diversity and sample quality with a common CFG weight (i.e., $7.5$).

3D Generation • Text to 3D

Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs

1 code implementation • 6 May 2023 • Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu

The probability flow ordinary differential equation (ODE) of diffusion models (i.e., diffusion ODEs) is a particular case of continuous normalizing flows (CNFs), which enables deterministic inference and exact likelihood evaluation.

Ranked #1 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation
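
For orientation, the relation named in the snippet above can be written out in the standard notation of score-based diffusion models. The symbols $f$, $g$, $p_t$ and $h$ are assumptions taken from that general literature, not from this listing. For a forward SDE $\mathrm{d}x = f(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}w$ with marginals $p_t$, the probability flow ODE is

$$
\frac{\mathrm{d}x_t}{\mathrm{d}t} = h(x_t, t) := f(x_t, t) - \frac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x_t),
$$

and, as for any continuous normalizing flow, the log-density along a trajectory follows from the instantaneous change-of-variables formula:

$$
\frac{\mathrm{d}}{\mathrm{d}t} \log p_t(x_t) = -\nabla_x \cdot h(x_t, t)
\quad\Longrightarrow\quad
\log p_0(x_0) = \log p_T(x_T) + \int_0^T \nabla_x \cdot h(x_t, t)\,\mathrm{d}t .
$$

In practice the unknown score $\nabla_x \log p_t$ is replaced by a learned model, so the likelihood obtained this way is that of the model's ODE rather than of the data distribution.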

Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning

3 code implementations • 25 Apr 2023 • Cheng Lu, Huayu Chen, Jianfei Chen, Hang Su, Chongxuan Li, Jun Zhu

The main challenge for this setting is that the intermediate guidance during the diffusion sampling procedure, which is jointly defined by the sampling distribution and the energy function, is unknown and is hard to estimate.

D4RL • Image Generation • +1

Privileged Prior Information Distillation for Image Matting

no code implementations • 25 Nov 2022 • Cheng Lyu, Jiake Xie, Bo Xu, Cheng Lu, Han Huang, Xin Huang, Ming Wu, Chuang Zhang, Yong Tang

Performance of trimap-free image matting methods is limited when trying to decouple the deterministic and undetermined regions, especially in scenes where foregrounds are semantically ambiguous, chromaless, or of high transmittance.

Image Matting

DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

1 code implementation • 2 Nov 2022 • Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

The commonly-used fast sampler for guided sampling is DDIM, a first-order diffusion ODE solver that generally needs 100 to 250 steps for high-quality samples.

Text-to-Image Generation

Speech Emotion Recognition via an Attentive Time-Frequency Neural Network

no code implementations • 22 Oct 2022 • Cheng Lu, Wenming Zheng, Hailun Lian, Yuan Zong, Chuangao Tang, Sunan Li, Yan Zhao

The F-Encoder and T-Encoder model the correlations within frequency bands and time frames, respectively, and they are embedded into a time-frequency joint learning strategy to obtain the time-frequency patterns for speech emotions.

Speech Emotion Recognition

Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling

1 code implementation • 29 Sep 2022 • Huayu Chen, Cheng Lu, Chengyang Ying, Hang Su, Jun Zhu

To address this problem, we adopt a generative approach by decoupling the learned policy into two parts: an expressive generative behavior model and an action evaluation model.

Computational Efficiency • D4RL • +4

Domain Adaptation with Adversarial Training on Penultimate Activations

1 code implementation • 26 Aug 2022 • Tao Sun, Cheng Lu, Haibin Ling

We show that this strategy is more efficient and better correlated with the objective of boosting prediction confidence than adversarial training on input images or intermediate features, as used in previous works.

Unsupervised Domain Adaptation

Local Context-Aware Active Domain Adaptation

1 code implementation • ICCV 2023 • Tao Sun, Cheng Lu, Haibin Ling

In this paper, we propose a Local context-aware ADA framework, named LADA, to address this issue.

Domain Adaptation

Prior Knowledge Guided Unsupervised Domain Adaptation

1 code implementation • 18 Jul 2022 • Tao Sun, Cheng Lu, Haibin Ling

We propose a general rectification module that uses such prior knowledge to refine model generated pseudo labels.

Unsupervised Domain Adaptation

CKD-TransBTS: Clinical Knowledge-Driven Hybrid Transformer with Modality-Correlated Cross-Attention for Brain Tumor Segmentation

no code implementations • 15 Jul 2022 • Jianwei Lin, Jiatai Lin, Cheng Lu, Hao Chen, Huan Lin, Bingchao Zhao, Zhenwei Shi, Bingjiang Qiu, Xipeng Pan, Zeyan Xu, Biao Huang, Changhong Liang, Guoqiang Han, Zaiyi Liu, Chu Han

To bridge the gap between Transformer and CNN features, we propose a Trans&CNN Feature Calibration block (TCFC) in the decoder.

Brain Tumor Segmentation • Clinical Knowledge • +3

3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching

1 code implementation • 6 Jul 2022 • Runyu Mao, Chen Bai, Yatong An, Fengqing Zhu, Cheng Lu

To the best of our knowledge, 3DG-STFM is the first student-teacher learning method for the local feature matching task.

Homography Estimation • Model Compression

Maximum Likelihood Training for Score-Based Diffusion ODEs by High-Order Denoising Score Matching

1 code implementation • 16 Jun 2022 • Cheng Lu, Kaiwen Zheng, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

To fill this gap, we show that the negative likelihood of the ODE can be bounded by controlling the first-, second-, and third-order score matching errors; we further present a novel high-order denoising score matching method to enable maximum likelihood training of score-based diffusion ODEs.

Denoising

DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps

2 code implementations • 2 Jun 2022 • Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

In this work, we propose an exact formulation of the solution of diffusion ODEs.

Situational Perception Guided Image Matting

no code implementations • 20 Apr 2022 • Bo Xu, Jiake Xie, Han Huang, Ziwen Li, Cheng Lu, Yong Tang, Yandong Guo

In this paper, we propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates subjective bias of matting annotations and captures sufficient situational perception information for better global saliency distilled from the visual-to-textual task.

Image Matting • Object

Safe Self-Refinement for Transformer-based Domain Adaptation

1 code implementation • CVPR 2022 • Tao Sun, Cheng Lu, Tianshuo Zhang, Haibin Ling

Unsupervised Domain Adaptation (UDA) aims to leverage a label-rich source domain to solve tasks on a related unlabeled target domain.

Transfer Learning • Unsupervised Domain Adaptation

WSSS4LUAD: Grand Challenge on Weakly-supervised Tissue Semantic Segmentation for Lung Adenocarcinoma

no code implementations • 13 Apr 2022 • Chu Han, Xipeng Pan, Lixu Yan, Huan Lin, Bingbing Li, Su Yao, Shanshan Lv, Zhenwei Shi, Jinhai Mai, Jiatai Lin, Bingchao Zhao, Zeyan Xu, Zhizhen Wang, Yumeng Wang, Yuan Zhang, Huihui Wang, Chao Zhu, Chunhui Lin, Lijian Mao, Min Wu, Luwen Duan, Jingsong Zhu, Dong Hu, Zijie Fang, Yang Chen, Yongbing Zhang, Yi Li, Yiwen Zou, Yiduo Yu, Xiaomeng Li, Haiming Li, Yanfen Cui, Guoqiang Han, Yan Xu, Jun Xu, Huihua Yang, Chunming Li, Zhenbing Liu, Cheng Lu, Xin Chen, Changhong Liang, Qingling Zhang, Zaiyi Liu

According to the technical reports of the top-tier teams, CAM is still the most popular approach in WSSS.

Data Augmentation • Weakly supervised Semantic Segmentation • +1

Semantic Distillation Guided Salient Object Detection

no code implementations • 8 Mar 2022 • Bo Xu, Guanze Liu, Han Huang, Cheng Lu, Yandong Guo

Most existing CNN-based salient object detection methods can identify local segmentation details like hair and animal fur, but often misinterpret the real saliency due to the lack of global contextual information caused by the subjectiveness of the SOD task and the locality of convolution layers.

Image Captioning • Object • +3

Shuffle Augmentation of Features from Unlabeled Data for Unsupervised Domain Adaptation

no code implementations • 28 Jan 2022 • Changwei Xu, Jianfei Yang, Haoran Tang, Han Zou, Cheng Lu, Tianshuo Zhang

Unsupervised Domain Adaptation (UDA), a branch of transfer learning where labels for target samples are unavailable, has been widely researched and developed in recent years with the help of adversarially trained models.

Transfer Learning • Unsupervised Domain Adaptation

Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation

no code implementations • 22 Oct 2021 • Ziwen Li, Bo Xu, Han Huang, Cheng Lu, Yandong Guo

In this paper, we propose a new framework, Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation (DTS-VIBE), to generate 3D human pose and mesh from RGB videos.

Ranked #44 on 3D Human Pose Estimation on 3DPW

3D Human Pose Estimation • Optical Flow Estimation

Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction

1 code implementation • ICCV 2021 • Bo Xu, Han Huang, Cheng Lu, Ziwen Li, Yandong Guo

In this paper, we propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn the human-object interactive foreground (the human and the objects they interact with) from a raw RGB image.

Human-Object Interaction Detection • Image Matting

Implicit Normalizing Flows

1 code implementation • ICLR 2021 • Cheng Lu, Jianfei Chen, Chongxuan Li, Qiuhao Wang, Jun Zhu

Through theoretical analysis, we show that the function space of ImpFlow is strictly richer than that of ResFlows.

DFEW: A Large-Scale Database for Recognizing Dynamic Facial Expressions in the Wild

no code implementations • 13 Aug 2020 • Xingxun Jiang, Yuan Zong, Wenming Zheng, Chuangao Tang, Wanchuang Xia, Cheng Lu, Jiateng Liu

Experimental results show that DFEW is a well-designed and challenging database, and the proposed EC-STFL can promisingly improve the performance of existing spatiotemporal deep neural networks in coping with the problem of dynamic FER in the wild.

Ranked #17 on Dynamic Facial Expression Recognition on DFEW

Dynamic Facial Expression Recognition • Facial Expression Recognition • +1

Discriminative Multi-modality Speech Recognition

2 code implementations • CVPR 2020 • Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang

Vision is often used as a complementary modality for audio speech recognition (ASR), especially in noisy environments where the performance of the audio modality alone deteriorates significantly.

Ranked #6 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)

Audio-Visual Speech Recognition • Lipreading • +2

Learning to Detect Head Movement in Unconstrained Remote Gaze Estimation in the Wild

no code implementations • 7 Apr 2020 • Zhecan Wang, Jian Zhao, Cheng Lu, Han Huang, Fan Yang, Lianji Li, Yandong Guo

To better demonstrate the advantage of our methods, we further propose a new benchmark dataset with the richest distribution of head-gaze combinations, reflecting real-world scenarios.

Gaze Estimation

VFlow: More Expressive Generative Flows with Variational Data Augmentation

1 code implementation • ICML 2020 • Jianfei Chen, Cheng Lu, Biqi Chenli, Jun Zhu, Tian Tian

Generative flows are promising tractable models for density modeling that define probabilistic distributions with invertible transformations.

Ranked #30 on Image Generation on CIFAR-10 (bits/dimension metric)

Density Estimation • Image Generation • +2

Dually Supervised Feature Pyramid for Object Detection and Segmentation

1 code implementation • 8 Dec 2019 • Fan Yang, Cheng Lu, Yandong Guo, Longin Jan Latecki, Haibin Ling

Feature pyramid architecture has been broadly adopted in object detection and segmentation to deal with the multi-scale problem.

Object • Object Detection • +2

Staying up to Date with Online Content Changes Using Reinforcement Learning for Scheduling

1 code implementation • NeurIPS 2019 • Andrey Kolobov, Yuval Peres, Cheng Lu, Eric J. Horvitz

From traditional Web search engines to virtual assistants and Web accelerators, services that rely on online information need to continually keep track of remote content changes by explicitly requesting content updates from remote sources (e.g., web pages).

Reinforcement Learning (RL) • +1

Model-based Iterative Restoration for Binary Document Image Compression with Dictionary Learning

no code implementations • CVPR 2017 • Yandong Guo, Cheng Lu, Jan P. Allebach, Charles A. Bouman

Experimental results with a variety of document images demonstrate that our method improves the image quality compared with the observed image, and simultaneously improves the compression ratio.

Dictionary Learning • Image Compression
