Search Results for author: Ruizhe Li

Found 23 papers, 16 papers with code

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

1 code implementation 10 Feb 2024 Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng

Leveraging the rich linguistic knowledge and strong reasoning abilities of LLMs, our new paradigm can integrate the rich information in N-best candidates to generate a higher-quality translation result.

Machine Translation, Translation

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

no code implementations 8 Feb 2024 Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, EnSiong Chng, Chao-Han Huck Yang

Recent studies have shown that large language models (LLMs) can be successfully used for generative error correction (GER) on top of the automatic speech recognition (ASR) output.

Audio-Visual Speech Recognition, Automatic Speech Recognition, +3

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

1 code implementation 19 Jan 2024 Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

To this end, we propose to extract a language-space noise embedding from the N-best list to represent the noise conditions of source speech, which can promote the denoising process in GER.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +6

Leveraging A Medical Knowledge Graph into Large Language Models for Diagnosis Prediction

no code implementations 28 Aug 2023 Yanjun Gao, Ruizhe Li, John Caskey, Dmitriy Dligach, Timothy Miller, Matthew M. Churpek, Majid Afshar

In this paper, we outline an innovative approach for augmenting the proficiency of LLMs in the realm of automated diagnosis generation, achieved through the incorporation of a medical knowledge graph (KG) and a novel graph model: Dr. Knows, inspired by the clinical diagnostic reasoning process.

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

1 code implementation 16 Jul 2023 Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

Specifically, we design a noise classification (NC) model to produce acoustic embedding as a noise conditioner for guiding the reverse denoising process.

Denoising, Multi-Task Learning, +2

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

1 code implementation 18 Jun 2023 Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng

In this work, we investigate the noise-invariant visual modality to strengthen the robustness of AVSR, which can adapt to any testing noise without depending on noisy training data, a.k.a. unsupervised noise adaptation.

Audio-Visual Speech Recognition, speech-recognition, +1

Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

1 code implementation 16 May 2023 Yuchen Hu, Ruizhe Li, Chen Chen, Heqing Zou, Qiushi Zhu, Eng Siong Chng

However, most existing AVSR approaches simply fuse the audio and visual features by concatenation, without explicit interactions to capture the deep correlations between them, which results in sub-optimal multimodal representations for the downstream speech recognition task.

Audio-Visual Speech Recognition, Automatic Speech Recognition, +3

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

1 code implementation 22 Feb 2023 Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

In this paper, we propose a simple yet effective approach called gradient remedy (GR) to solve interference between task gradients in noise-robust speech recognition, from perspectives of both angle and magnitude.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4

Motion-related Artefact Classification Using Patch-based Ensemble and Transfer Learning in Cardiac MRI

1 code implementation 14 Oct 2022 Ruizhe Li, Xin Chen

The final trained model was also evaluated on an independent test set by the CMRxMotion organisers, where it achieved a classification accuracy of 72.5% and a Cohen's Kappa of 0.6309 (ranked top 1 in this grand challenge).

Transfer Learning

On the Latent Holes of VAEs for Text Generation

no code implementations 7 Oct 2021 Ruizhe Li, Xutan Peng, Chenghua Lin

In this paper, we provide the first focused study on the discontinuities (aka.

Text Generation

On the Latent Holes 🧀 of VAEs for Text Generation

no code implementations 29 Sep 2021 Ruizhe Li, Xutan Peng, Chenghua Lin

In this paper, we provide the first focused study on the discontinuities (aka.

Text Generation

Affective Decoding for Empathetic Response Generation

1 code implementation INLG (ACL) 2021 Chengkun Zeng, Guanyi Chen, Chenghua Lin, Ruizhe Li, Zhigang Chen

Understanding speaker's feelings and producing appropriate responses with emotion connection is a key communicative skill for empathetic dialogue systems.

Empathetic Response Generation, Response Generation

A generic ensemble based deep convolutional neural network for semi-supervised medical image segmentation

1 code implementation 16 Apr 2020 Ruizhe Li, Dorothee Auer, Christian Wagner, Xin Chen

To address this problem, we propose a generic semi-supervised learning framework for image segmentation based on a deep convolutional neural network (DCNN).

Image Segmentation, Lesion Segmentation, +5

A Stable Variational Autoencoder for Text Modelling

1 code implementation WS 2019 Ruizhe Li, Xiao Li, Chenghua Lin, Matthew Collinson, Rui Mao

Variational Autoencoder (VAE) is a powerful method for learning representations of high-dimensional data.

Latent Space Factorisation and Manipulation via Matrix Subspace Projection

2 code implementations ICML 2020 Xiao Li, Chenghua Lin, Ruizhe Li, Chaozheng Wang, Frank Guerin

We demonstrate the utility of our method for attribute manipulation in autoencoders trained across varied domains, using both human evaluation and automated methods.

Ranked #7 on Image Generation on CelebA 256x256 (FID metric)

Attribute, Face Generation, +1
