no code implementations • 24 Mar 2024 • Hui Lu, Hu Jian, Ronald Poppe, Albert Ali Salah
The FTP framework adds four feature processors that focus on specific aspects of human action in videos: action category, action components, action description, and context information.
no code implementations • 19 Mar 2024 • Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong
Research interest and advances in speech-to-speech translation (S2ST), the task of translating utterances from one language to another, have been emerging.
no code implementations • 19 Mar 2024 • Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong
Speech language models (LMs) are promising for high-quality speech synthesis through in-context learning.
1 code implementation • 18 Mar 2024 • Hui Lu, Albert Ali Salah, Ronald Poppe
A key challenge in continuous sign language recognition (CSLR) is to efficiently capture long-range spatial interactions over time from the video input.
Ranked #3 on Sign Language Recognition on CSL-Daily
no code implementations • 29 Dec 2023 • Shaojie Zhu, Zhaobin Wang, Chengxiang Zhuo, Hui Lu, Bo Hu, Zang Li
Chain-of-Thought (CoT) prompting is a technique for solving reasoning problems with LLMs.
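As a rough illustration of what CoT prompting looks like in practice (the exemplar question and helper name below are hypothetical, not from the paper): instead of asking for an answer directly, the prompt includes a worked example whose solution is spelled out step by step, nudging the model to reason the same way.

```python
def build_cot_prompt(question: str) -> str:
    """Prepend a step-by-step worked example (a CoT exemplar) to the
    target question, then cue the model to continue in the same style."""
    exemplar = (
        "Q: A shop has 3 boxes with 4 apples each. How many apples in total?\n"
        "A: Each box has 4 apples. There are 3 boxes, so 3 * 4 = 12. "
        "The answer is 12.\n\n"
    )
    return exemplar + f"Q: {question}\nA: Let's think step by step."

prompt = build_cot_prompt("If a train travels 60 km/h for 2 hours, how far does it go?")
print(prompt)
```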
1 code implementation • 11 Dec 2023 • Hui Lu, Albert Ali Salah, Ronald Poppe
We argue that the denoising process is crucially limited by an accumulation of the reconstruction error due to an initial inaccurate reconstruction of the target data.
Ranked #16 on Image Generation on CIFAR-10
1 code implementation • 5 Dec 2023 • Zhengmao Ye, Dengchun Li, Jingqi Tian, Tingfeng Lan, Jie Zuo, Lei Duan, Hui Lu, Yexi Jiang, Jian Sha, Ke Zhang, Mingjie Tang
Transformer-based large language models (LLMs) have demonstrated outstanding performance across diverse domains, particularly when fine-tuned for specific domains.
no code implementations • 11 Nov 2023 • Hanzhang Zhou, Junlang Qian, Zijian Feng, Hui Lu, Zixiao Zhu, Kezhi Mao
In this study, we investigate in-context learning (ICL) in document-level event argument extraction (EAE) to alleviate the dependency on large-scale labeled data for this task.
no code implementations • 7 Aug 2023 • Renjie Liang, Yiming Yang, Hui Lu, Li Li
To tackle this problem, we propose a novel efficient multi-teacher model (EMTM) based on knowledge distillation to transfer diverse knowledge from both heterogeneous and isomorphic networks.
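A minimal sketch of the knowledge-distillation idea behind a multi-teacher setup: the student is trained to match an averaged, temperature-softened distribution over several teachers' logits. This is the generic multi-teacher KD loss, not necessarily the exact EMTM formulation; all function names here are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def multi_teacher_kd_loss(student_logits, teacher_logits_list, temperature=2.0):
    """KL divergence from the student distribution to the average of the
    teachers' softened distributions: KL(mean_teacher || student)."""
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    k = len(teacher_probs)
    avg = [sum(p[i] for p in teacher_probs) / k
           for i in range(len(student_logits))]
    student = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(avg, student) if p > 0)
```

When the student's logits already match a single teacher's, the loss is zero; disagreement between heterogeneous teachers simply shows up as a smoother averaged target.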
no code implementations • 2 Dec 2022 • Hui Lu, Mia Chiquier, Carl Vondrick
We introduce a framework for navigating through cluttered environments by connecting multiple cameras together while simultaneously preserving privacy.
1 code implementation • 27 Oct 2022 • Haohan Guo, Fenglong Xie, Xixin Wu, Hui Lu, Helen Meng
Moreover, we optimize the training strategy by leveraging more audio to better learn MSMCRs for low-resource languages.
no code implementations • 25 Oct 2022 • Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng
We propose an unsupervised learning method to disentangle speech into content representation and speaker identity representation.
no code implementations • 18 Feb 2022 • Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng
The primary task of ASA fine-tunes the SE with the speech of the target dysarthric speaker to effectively capture identity-related information. The secondary task applies adversarial training to avoid incorporating abnormal speaking patterns into the reconstructed speech, by regularizing the distribution of the reconstructed speech to be close to that of high-quality reference speech.
no code implementations • 29 Sep 2021 • Xin Zhang, Yanhua Li, Ziming Zhang, Christopher Brinton, Zhenming Liu, Zhi-Li Zhang, Hui Lu, Zhihong Tian
State-of-the-art imitation learning (IL) approaches, e.g., GAIL, apply adversarial training to minimize the discrepancy between expert and learner behaviors, which is prone to unstable training and mode collapse.
2 code implementations • 19 Jul 2021 • Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng
This argument motivates the current work that presents a novel, channel-wise gated Res2Net (CG-Res2Net), which modifies Res2Net to enable a channel-wise gating mechanism in the connection between feature groups.
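To make the channel-wise gating idea concrete, here is a toy sketch: a sigmoid gate is computed per channel from the features themselves and used to scale a feature group before it is passed to the next one. The single-linear-layer gating network and list-based tensors below are simplifying assumptions for illustration; the CG-Res2Net paper's actual gating module operates on convolutional feature maps and may be structured differently.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def channel_gate(group_features, gate_weights):
    """Channel-wise gating on a feature-group vector.

    Each gate g_c = sigmoid(w_c . x) is computed from the input features
    via a (hypothetical) learned linear layer, then scales channel c:
    out_c = g_c * x_c. Gates in (0, 1) let the network suppress or pass
    each channel of the group fed into the next feature group.
    """
    gates = [sigmoid(sum(w * x for w, x in zip(w_row, group_features)))
             for w_row in gate_weights]
    return [g * x for g, x in zip(gates, group_features)]
```

With all-zero gating weights every gate is exactly 0.5, i.e., the group is passed on at half magnitude; training would move each gate toward 0 (block) or 1 (pass) per channel.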