Search Results for author: Yi Yuan

Found 36 papers, 13 papers with code

SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

no code implementations 30 Apr 2024 Haohe Liu, Xuenan Xu, Yi Yuan, Mengyue Wu, Wenwu Wang, Mark D. Plumbley

Large language models (LLMs) have significantly advanced audio processing through audio codecs that convert audio into discrete tokens, enabling the application of language modelling techniques to audio data.

Decoder Language Modelling

T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining

no code implementations 27 Apr 2024 Yi Yuan, Zhuo Chen, Xubo Liu, Haohe Liu, Xuenan Xu, Dongya Jia, Yuanzhe Chen, Mark D. Plumbley, Wenwu Wang

Contrastive language-audio pretraining (CLAP) has been developed to align the representations of audio and language, achieving remarkable performance in retrieval and classification tasks.


HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback

no code implementations 13 Mar 2024 Ang Li, Qiugen Xiao, Peng Cao, Jian Tang, Yi Yuan, Zijie Zhao, Xiaoyuan Chen, Liang Zhang, Xiangyang Li, Kaitong Yang, Weidong Guo, Yukang Gan, Xu Yu, Daniell Wang, Ying Shan

Using ChatGPT as a labeler to provide feedback on open-domain prompts in RLAIF training, we observe an increase in human evaluators' preference win ratio for model responses, but a decrease in evaluators' satisfaction rate.

Language Modelling Large Language Model +2

Novel 3D Geometry-Based Stochastic Models for Non-Isotropic MIMO Vehicle-to-Vehicle Channels

no code implementations 1 Dec 2023 Yi Yuan, Cheng-Xiang Wang, Xiang Cheng, Bo Ai, David I. Laurenson

Moreover, a novel parameter computation method is proposed for jointly calculating the azimuth and elevation angles in the SoS channel simulator.


High-Quality 3D Face Reconstruction with Affine Convolutional Networks

no code implementations 22 Oct 2023 Zhiqian Lin, Jiangke Lin, Lincheng Li, Yi Yuan, Zhengxia Zou

In our method, an affine transformation matrix is learned from the affine convolution layer for each spatial location of the feature maps.

3D Face Reconstruction Decoder

Retrieval-Augmented Text-to-Audio Generation

no code implementations 14 Sep 2023 Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

Despite recent progress in text-to-audio (TTA) generation, we show that the state-of-the-art models, such as AudioLDM, trained on datasets with an imbalanced class distribution, such as AudioCaps, are biased in their generation performance.

AudioCaps Audio Generation +2

Separate Anything You Describe

1 code implementation 9 Aug 2023 Xubo Liu, Qiuqiang Kong, Yan Zhao, Haohe Liu, Yi Yuan, Yuzhuo Liu, Rui Xia, Yuxuan Wang, Mark D. Plumbley, Wenwu Wang

In this work, we introduce AudioSep, a foundation model for open-domain audio source separation with natural language queries.

Audio Source Separation Natural Language Queries +2

WavJourney: Compositional Audio Creation with Large Language Models

1 code implementation 26 Jul 2023 Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang

Subjective evaluations demonstrate the potential of WavJourney in crafting engaging storytelling audio content from text.

Audio Generation

Text-Driven Foley Sound Generation With Latent Diffusion Model

1 code implementation 17 Jun 2023 Yi Yuan, Haohe Liu, Xubo Liu, Xiyuan Kang, Peipei Wu, Mark D. Plumbley, Wenwu Wang

We have observed that the feature embedding extracted by the text encoder can significantly affect the performance of the generation model.

Transfer Learning

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

3 code implementations 29 Jan 2023 Haohe Liu, Zehua Chen, Yi Yuan, Xinhao Mei, Xubo Liu, Danilo Mandic, Wenwu Wang, Mark D. Plumbley

By learning the latent representations of audio signals and their compositions without modeling the cross-modal relationship, AudioLDM is advantageous in both generation quality and computational efficiency.

AudioCaps Audio Generation +2

SwiftAvatar: Efficient Auto-Creation of Parameterized Stylized Character on Arbitrary Avatar Engines

no code implementations 19 Jan 2023 Shizun Wang, Weihong Zeng, Xu Wang, Hao Yang, Li Chen, Yi Yuan, Yunzhao Zeng, Min Zheng, Chuang Zhang, Ming Wu

To this end, we propose SwiftAvatar, a novel avatar auto-creation framework that is evidently superior to previous works.

Learning Implicit Body Representations from Double Diffusion Based Neural Radiance Fields

no code implementations 23 Dec 2021 Guangming Yao, Hongzhi Wu, Yi Yuan, Lincheng Li, Kun Zhou, Xin Yu

In this paper, we present a novel double diffusion based neural radiance field, dubbed DD-NeRF, to reconstruct human body geometry and render the human body appearance in novel views from a sparse set of images.

Novel View Synthesis

ZiGAN: Fine-grained Chinese Calligraphy Font Generation via a Few-shot Style Transfer Approach

no code implementations 8 Aug 2021 Qi Wen, Shuang Li, Bingfeng Han, Yi Yuan

Chinese character style transfer is a very challenging problem because of the complexity of the glyph shapes or underlying structures and the large number of existing characters, compared with English letters.

Font Generation Style Transfer

TransVOS: Video Object Segmentation with Transformers

1 code implementation 1 Jun 2021 Jianbiao Mei, Mengmeng Wang, Yeneng Lin, Yi Yuan, Yong Liu

Recently, Space-Time Memory Network (STM) based methods have achieved state-of-the-art performance in semi-supervised video object segmentation (VOS).

Object One-shot visual object segmentation +3

Single-Shot Motion Completion with Transformer

1 code implementation 1 Mar 2021 Yinglin Duan, Tianyang Shi, Zhengxia Zou, Yenan Lin, Zhehui Qian, Bohan Zhang, Yi Yuan

Motion completion is a challenging and long-discussed problem, which is of great significance in film and game applications.

Motion Synthesis

One-shot Face Reenactment Using Appearance Adaptive Normalization

no code implementations 8 Feb 2021 Guangming Yao, Yi Yuan, Tianjia Shao, Shuang Li, Shanqi Liu, Yong Liu, Mengmeng Wang, Kun Zhou

The paper proposes a novel generative adversarial network for one-shot face reenactment, which can animate a single face image to a different pose-and-expression (provided by a driving image) while keeping its original appearance.

Face Reenactment Generative Adversarial Network

In-game Residential Home Planning via Visual Context-aware Global Relation Learning

no code implementations 8 Feb 2021 Lijuan Liu, Yin Yang, Yi Yuan, Tianjia Shao, He Wang, Kun Zhou

In this paper, we propose an effective global relation learning algorithm to recommend an appropriate location of a building unit for in-game customization of residential home complex.

Graph Generation Relation

Structure-aware Person Image Generation with Pose Decomposition and Semantic Correlation

no code implementations 5 Feb 2021 Jilin Tang, Yi Yuan, Tianjia Shao, Yong Liu, Mengmeng Wang, Kun Zhou

In this paper, we tackle the problem of pose-guided person image generation, which aims to transfer a person image from the source pose to a novel target pose while maintaining the source appearance.

Image Generation

MeInGame: Create a Game Character Face from a Single Portrait

1 code implementation 4 Feb 2021 Jiangke Lin, Yi Yuan, Zhengxia Zou

To tackle these problems, we propose 1) a low-cost facial texture acquisition method, 2) a shape transfer algorithm that can transform the shape of a 3DMM mesh to games, and 3) a new pipeline for training 3D game face reconstruction networks.

3D Face Reconstruction Face Model

RFNet: Recurrent Forward Network for Dense Point Cloud Completion

no code implementations ICCV 2021 Tianxin Huang, Hao Zou, Jinhao Cui, Xuemeng Yang, Mengmeng Wang, Xiangrui Zhao, Jiangning Zhang, Yi Yuan, Yifan Xu, Yong Liu

The RFE extracts multiple global features from the incomplete point clouds for different recurrent levels, and the FDC generates point clouds in a coarse-to-fine pipeline.

Point Cloud Completion

NeuralMagicEye: Learning to See and Understand the Scene Behind an Autostereogram

1 code implementation 31 Dec 2020 Zhengxia Zou, Tianyang Shi, Yi Yuan, Zhenwei Shi

This paper studies the interesting question of whether a deep CNN can be trained to recover the depth behind an autostereogram and understand its content.

HR-Depth: High Resolution Self-Supervised Monocular Depth Estimation

1 code implementation 14 Dec 2020 Xiaoyang Lyu, Liang Liu, Mengmeng Wang, Xin Kong, Lina Liu, Yong Liu, Xinxin Chen, Yi Yuan

To obtain more accurate depth estimation in large gradient regions, it is necessary to obtain high-resolution features with spatial and semantic information.

Monocular Depth Estimation Self-Supervised Learning +2

Stylized Neural Painting

4 code implementations CVPR 2021 Zhengxia Zou, Tianyang Shi, Shuang Qiu, Yi Yuan, Zhenwei Shi

Different from previous image-to-image translation methods that formulate the translation as pixel-wise prediction, we deal with such an artistic creation process in a vectorized environment and produce a sequence of physically meaningful stroke parameters that can be further used for rendering.

Disentanglement Image-to-Image Translation +2

Dynamic Future Net: Diversified Human Motion Generation

no code implementations 25 Aug 2020 Wenheng Chen, He Wang, Yi Yuan, Tianjia Shao, Kun Zhou

We evaluate our model on a wide range of motions and compare it with the state-of-the-art methods.

Mesh Guided One-shot Face Reenactment using Graph Convolutional Networks

no code implementations 18 Aug 2020 Guangming Yao, Yi Yuan, Tianjia Shao, Kun Zhou

In this paper, we introduce a method for one-shot face reenactment, which uses the reconstructed 3D meshes (i.e., the source mesh and driving mesh) as guidance to learn the optical flow needed for the reenacted face synthesis.

Decoder Face Generation +3

Neutral Face Game Character Auto-Creation via PokerFace-GAN

1 code implementation 17 Aug 2020 Tianyang Shi, Zhengxia Zou, Xinhui Song, Zheng Song, Changjian Gu, Changjie Fan, Yi Yuan

Besides, the neural network based renderer used in previous methods is also difficult to extend to multi-view rendering cases.

Self-Supervised Learning

Fast and Robust Face-to-Parameter Translation for Game Character Auto-Creation

no code implementations 17 Aug 2020 Tianyang Shi, Zhengxia Zou, Yi Yuan, Changjie Fan

With the rapid development of Role-Playing Games (RPGs), players are now allowed to edit the facial appearance of their in-game characters with their preferences rather than using default templates.

3D Face Reconstruction Face Verification +3

Unsupervised Facial Action Unit Intensity Estimation via Differentiable Optimization

no code implementations 13 Apr 2020 Xinhui Song, Tianyang Shi, Tianjia Shao, Yi Yuan, Zunlei Feng, Changjie Fan

The generator learns to "render" a face image from a set of facial parameters in a differentiable way, and the feature extractor extracts deep features for measuring the similarity of the rendered image and input real image.

Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks

3 code implementations CVPR 2020 Jiangke Lin, Yi Yuan, Tianjia Shao, Kun Zhou

In this paper, we introduce a method to reconstruct 3D facial shapes with high-fidelity textures from single-view images in-the-wild, without the need to capture a large-scale face texture database.

3D Face Reconstruction

Face-to-Parameter Translation for Game Character Auto-Creation

no code implementations ICCV 2019 Tianyang Shi, Yi Yuan, Changjie Fan, Zhengxia Zou, Zhenwei Shi, Yong Liu

The character customization system is an important component in Role-Playing Games (RPGs), where players are allowed to edit the facial appearance of their in-game characters with their own preferences rather than using default templates.

Style Transfer Translation

Audio2Face: Generating Speech/Face Animation from Single Audio with Attention-Based Bidirectional LSTM Networks

no code implementations 27 May 2019 Guanzhong Tian, Yi Yuan, Yong Liu

We propose an end-to-end deep learning approach for generating real-time facial animation from audio alone.
