Search Results for author: Sicheng Yang

Found 11 papers, 6 papers with code

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

1 code implementation • 2 Apr 2024 • Xu He, Qiaochu Huang, Zhensong Zhang, Zhiwei Lin, Zhiyong Wu, Sicheng Yang, Minglei Li, Zhiyi Chen, Songcen Xu, Xiaofei Wu

While previous works mostly generate structural human skeletons, resulting in the omission of appearance information, we focus on the direct generation of audio-driven co-speech gesture videos in this work.

Video Generation

Paper
Code

MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

no code implementations • 14 Mar 2024 • Zunnan Xu, Yukang Lin, Haonan Han, Sicheng Yang, Ronghui Li, Yachao Zhang, Xiu Li

Gesture synthesis is a vital realm of human-computer interaction, with wide-ranging applications across various fields like film, robotics, and virtual reality.

Paper
Add Code

Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

no code implementations • 7 Jan 2024 • Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu

To tackle these issues, we introduce FreeTalker, which, to the best of our knowledge, is the first framework for the generation of both spontaneous (e. g., co-speech gesture) and non-spontaneous (e. g., moving around the podium) speaker motions.

Gesture Generation

Paper
Add Code

Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

no code implementations • 26 Dec 2023 • Zunnan Xu, Yachao Zhang, Sicheng Yang, Ronghui Li, Xiu Li

We introduce a novel method that separates priors from speech and employs multimodal priors as constraints for generating gestures.

Gesture Generation

Paper
Add Code

UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons

1 code implementation • 13 Sep 2023 • Sicheng Yang, Zilin Wang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Qiaochu Huang, Lei Hao, Songcen Xu, Xiaofei Wu, Changpeng Yang, Zonghong Dai

The automatic co-speech gesture generation draws much attention in computer animation.

Gesture Generation

Paper
Code

The DiffuseStyleGesture+ entry to the GENEA Challenge 2023

1 code implementation • 26 Aug 2023 • Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai

In this paper, we introduce the DiffuseStyleGesture+, our solution for the Generation and Evaluation of Non-verbal Behavior for Embodied Agents (GENEA) Challenge 2023, which aims to foster the development of realistic, automated systems for generating conversational gestures.

120

Paper
Code

QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation

1 code implementation • CVPR 2023 • Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang

Levenshtein distance based on audio quantization as a similarity metric of corresponding speech of gestures helps match more appropriate gestures with speech, and solves the alignment problem of speech and gestures well.

Gesture Generation Quantization

Paper
Code

The ReprGesture entry to the GENEA Challenge 2022

1 code implementation • 25 Aug 2022 • Sicheng Yang, Zhiyong Wu, Minglei Li, Mengchen Zhao, Jiuxin Lin, Liyang Chen, Weihong Bao

This paper describes the ReprGesture entry to the Generation and Evaluation of Non-verbal Behaviour for Embodied Agents (GENEA) challenge 2022.

Gesture Generation Representation Learning

Paper
Code

Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion

1 code implementation • 18 Aug 2022 • Sicheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng

One-shot voice conversion (VC) with only a single target speaker's speech for reference has become a hot research topic.

Disentanglement Voice Conversion

103

Paper
Code

Multilevel orthogonal Bochner function subspaces with applications to robust machine learning

no code implementations • 4 Oct 2021 • Julio Enrique Castrillon-Candas, Dingning Liu, Sicheng Yang, Mark Kon

To uncover the separation between these classes, we employ the Karhunen-Loeve expansion and construct the appropriate subspaces.

Anomaly Detection BIG-bench Machine Learning

Paper
Add Code

KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for Visual Commonsense Reasoning

no code implementations • 13 Dec 2020 • Dandan song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao

To develop machine with cognition-level visual understanding and reasoning abilities, the visual commonsense reasoning (VCR) task has been introduced.

Ranked #4 on Visual Question Answering (VQA) on VCR (Q-AR) test

Sentence Visual Commonsense Reasoning +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.