Search Results for author: Minglei Li

Found 18 papers, 7 papers with code

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

1 code implementation 2 Apr 2024 Xu He, Qiaochu Huang, Zhensong Zhang, Zhiwei Lin, Zhiyong Wu, Sicheng Yang, Minglei Li, Zhiyi Chen, Songcen Xu, Xiaofei Wu

While previous works mostly generate structural human skeletons, resulting in the omission of appearance information, we focus on the direct generation of audio-driven co-speech gesture videos in this work.

Video Generation

Partial Fine-Tuning: A Successor to Full Fine-Tuning for Vision Transformers

no code implementations 25 Dec 2023 Peng Ye, Yongqi Huang, Chongjun Tu, Minglei Li, Tao Chen, Tong He, Wanli Ouyang

We first validate eight manually defined partial fine-tuning strategies across various datasets and vision transformer architectures, and find that some partial fine-tuning strategies (e.g., FFN only or attention only) can achieve better performance with fewer tuned parameters than full fine-tuning, and that selecting appropriate layers is critical to partial fine-tuning.

TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation

no code implementations 23 Dec 2023 Xize Cheng, Rongjie Huang, Linjun Li, Tao Jin, Zehan Wang, Aoxiong Yin, Minglei Li, Xinyu Duan, Changpeng Yang, Zhou Zhao

However, talking head translation, which converts audio-visual speech (i.e., talking head video) from one language into another, still faces several challenges compared to audio speech: (1) Existing methods invariably rely on cascading, synthesizing via both audio and text, resulting in delays and cascading errors.

Self-Supervised Learning, Speech-to-Speech Translation, +1

Practical Deep Dispersed Watermarking with Synchronization and Fusion

1 code implementation 23 Oct 2023 Hengchang Guo, Qilong Zhang, Junwei Luo, Feng Guo, Wenbin Zhang, Xiaodong Su, Minglei Li

Compared with state-of-the-art approaches, our blind watermarking achieves better performance: it improves bit accuracy by 5.28% and 5.93% on average against single and combined attacks, respectively, with a smaller file size increase and better visual quality.

The DiffuseStyleGesture+ entry to the GENEA Challenge 2023

1 code implementation 26 Aug 2023 Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai

In this paper, we introduce the DiffuseStyleGesture+, our solution for the Generation and Evaluation of Non-verbal Behavior for Embodied Agents (GENEA) Challenge 2023, which aims to foster the development of realistic, automated systems for generating conversational gestures.

QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation

1 code implementation CVPR 2023 Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang

Levenshtein distance based on audio quantization, used as a similarity metric for the speech corresponding to gestures, helps match more appropriate gestures with speech and solves the alignment problem of speech and gestures well.

Gesture Generation, Quantization
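
The matching idea in the snippet above can be illustrated with a minimal sketch. It assumes each audio clip has already been quantized into a sequence of discrete codebook indices; the clip names and index values below are hypothetical, and this is a plain Levenshtein distance rather than the paper's actual pipeline.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical quantized audio: each integer is a codebook index.
query = [3, 3, 7, 1, 4]
candidates = {
    "clip_a": [3, 7, 7, 1, 4],
    "clip_b": [9, 9, 2, 5, 5],
    "clip_c": [3, 3, 1],
}

# Pick the candidate whose quantized audio is closest to the query.
best = min(candidates, key=lambda name: levenshtein(query, candidates[name]))
```

Because the distance works on sequences of unequal length, clips do not need to be pre-aligned frame by frame before they are compared.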

The ReprGesture entry to the GENEA Challenge 2022

1 code implementation 25 Aug 2022 Sicheng Yang, Zhiyong Wu, Minglei Li, Mengchen Zhao, Jiuxin Lin, Liyang Chen, Weihong Bao

This paper describes the ReprGesture entry to the Generation and Evaluation of Non-verbal Behaviour for Embodied Agents (GENEA) challenge 2022.

Gesture Generation, Representation Learning

A Signal Detection Scheme Based on Deep Learning in OFDM Systems

no code implementations 24 Jul 2021 Guangliang Pan, Zitong Liu, Wei Wang, Minglei Li

Simulation results show that the DDLSD scheme outperforms the existing traditional methods in terms of improving channel estimation and signal detection performance.

Time Series, Time Series Analysis

Fake News Detection Through Multi-Perspective Speaker Profiles

no code implementations IJCNLP 2017 Yunfei Long, Qin Lu, Rong Xiang, Minglei Li, Chu-Ren Huang

This paper proposes a novel method to incorporate speaker profiles into an attention-based LSTM model for fake news detection.

Fake News Detection

A Cognition Based Attention Model for Sentiment Analysis

no code implementations EMNLP 2017 Yunfei Long, Qin Lu, Rong Xiang, Minglei Li, Chu-Ren Huang

Evaluations show that the CBA-based method significantly outperforms state-of-the-art attention methods based on local context.

Feature Engineering, Product Recommendation, +1

Emotion Corpus Construction Based on Selection from Hashtags

no code implementations LREC 2016 Minglei Li, Yunfei Long, Lu Qin, Wenjie Li

Secondly, an SVM-based classifier is used to select the data whose natural labels are consistent with the predicted labels.

Emotion Classification
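
The selection step described in the snippet above can be sketched as a simple consistency filter. This is an illustration under stated assumptions, not the authors' implementation: `predict` stands in for the trained SVM classifier, and `toy_predict` with its keyword rule, the sample texts, and the label names are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Sample = Tuple[str, str]  # (text, natural label taken from the hashtag)


def select_consistent(samples: Iterable[Sample],
                      predict: Callable[[str], str]) -> List[Sample]:
    """Keep only samples whose hashtag label agrees with the predicted label."""
    return [(text, label) for text, label in samples if predict(text) == label]


def toy_predict(text: str) -> str:
    # Hypothetical stand-in for a trained SVM: a single keyword rule.
    return "joy" if "great" in text.lower() else "sadness"


samples = [
    ("What a great day #joy", "joy"),
    ("Great, my flight is cancelled #sadness", "sadness"),
    ("I miss you #sadness", "sadness"),
]

# The second sample is dropped: the stand-in classifier disagrees with its hashtag.
kept = select_consistent(samples, toy_predict)
```

Passing the classifier in as a callable keeps the filter independent of the model, so a real SVM's prediction function could be substituted for `toy_predict`.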
