no code implementations • 24 Mar 2024 • Hui Lu, Hu Jian, Ronald Poppe, Albert Ali Salah
The FTP framework adds four feature processors that focus on specific aspects of human action in videos: action category, action components, action description, and context information.
no code implementations • 19 Mar 2024 • Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong
Research interest and advances in speech-to-speech translation (S2ST), the task of translating utterances from one language to another, have been emerging.
no code implementations • 19 Mar 2024 • Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong
Speech language models (LMs) are promising for high-quality speech synthesis through in-context learning.
1 code implementation • 18 Mar 2024 • Hui Lu, Albert Ali Salah, Ronald Poppe
A key challenge in continuous sign language recognition (CSLR) is to efficiently capture long-range spatial interactions over time from the video input.
Ranked #3 on Sign Language Recognition on CSL-Daily
no code implementations • 29 Dec 2023 • Shaojie Zhu, Zhaobin Wang, Chengxiang Zhuo, Hui Lu, Bo Hu, Zang Li
Chain-of-Thought (CoT) prompting is a technique for solving reasoning problems with LLMs.
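As a rough illustration of what CoT prompting looks like in practice (the exemplar question and helper name below are hypothetical, not from the paper): instead of asking for an answer directly, the prompt includes a worked example whose solution is spelled out step by step, nudging the model to reason the same way.

```python
def build_cot_prompt(question: str) -> str:
    """Prepend a step-by-step worked example (a CoT exemplar) to the
    target question, then cue the model to continue in the same style."""
    exemplar = (
        "Q: A shop has 3 boxes with 4 apples each. How many apples in total?\n"
        "A: Each box has 4 apples. There are 3 boxes, so 3 * 4 = 12. "
        "The answer is 12.\n\n"
    )
    return exemplar + f"Q: {question}\nA: Let's think step by step."

prompt = build_cot_prompt("If a train travels 60 km/h for 2 hours, how far does it go?")
print(prompt)
```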
1 code implementation • 11 Dec 2023 • Hui Lu, Albert Ali Salah, Ronald Poppe
We argue that the denoising process is crucially limited by an accumulation of the reconstruction error due to an initial inaccurate reconstruction of the target data.
Ranked #16 on Image Generation on CIFAR-10
1 code implementation • 5 Dec 2023 • Zhengmao Ye, Dengchun Li, Jingqi Tian, Tingfeng Lan, Jie Zuo, Lei Duan, Hui Lu, Yexi Jiang, Jian Sha, Ke Zhang, Mingjie Tang
Transformer-based large language models (LLMs) have demonstrated outstanding performance across diverse domains, particularly when fine-tuned for specific domains.
no code implementations • 11 Nov 2023 • Hanzhang Zhou, Junlang Qian, Zijian Feng, Hui Lu, Zixiao Zhu, Kezhi Mao
In this study, we investigate in-context learning (ICL) in document-level event argument extraction (EAE) to alleviate the dependency on large-scale labeled data for this task.
no code implementations • 7 Aug 2023 • Renjie Liang, Yiming Yang, Hui Lu, Li Li
To tackle this problem, we propose a novel efficient multi-teacher model (EMTM) based on knowledge distillation to transfer diverse knowledge from both heterogeneous and isomorphic networks.
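A minimal sketch of the knowledge-distillation idea behind a multi-teacher setup: the student is trained to match an averaged, temperature-softened distribution over several teachers' logits. This is the generic multi-teacher KD loss, not necessarily the exact EMTM formulation; all function names here are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def multi_teacher_kd_loss(student_logits, teacher_logits_list, temperature=2.0):
    """KL divergence from the student distribution to the average of the
    teachers' softened distributions: KL(mean_teacher || student)."""
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    k = len(teacher_probs)
    avg = [sum(p[i] for p in teacher_probs) / k
           for i in range(len(student_logits))]
    student = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(avg, student) if p > 0)
```

When the student's logits already match a single teacher's, the loss is zero; disagreement between heterogeneous teachers simply shows up as a smoother averaged target.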
no code implementations • 2 Dec 2022 • Hui Lu, Mia Chiquier, Carl Vondrick
We introduce a framework for navigating through cluttered environments by connecting multiple cameras together while simultaneously preserving privacy.
1 code implementation • 27 Oct 2022 • Haohan Guo, Fenglong Xie, Xixin Wu, Hui Lu, Helen Meng
Moreover, we optimize the training strategy by leveraging more audio to better learn MSMCRs for low-resource languages.
no code implementations • 25 Oct 2022 • Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng
We propose an unsupervised learning method to disentangle speech into content representation and speaker identity representation.
no code implementations • 18 Feb 2022 • Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng
The primary task of ASA fine-tunes the SE with the speech of the target dysarthric speaker to effectively capture identity-related information. The secondary task applies adversarial training to avoid incorporating abnormal speaking patterns into the reconstructed speech, by regularizing the distribution of the reconstructed speech to be close to that of high-quality reference speech.
no code implementations • 29 Sep 2021 • Xin Zhang, Yanhua Li, Ziming Zhang, Christopher Brinton, Zhenming Liu, Zhi-Li Zhang, Hui Lu, Zhihong Tian
State-of-the-art imitation learning (IL) approaches, e.g., GAIL, apply adversarial training to minimize the discrepancy between expert and learner behaviors, which is prone to unstable training and mode collapse.
2 code implementations • 19 Jul 2021 • Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng
This argument motivates the current work that presents a novel, channel-wise gated Res2Net (CG-Res2Net), which modifies Res2Net to enable a channel-wise gating mechanism in the connection between feature groups.
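To make the channel-wise gating idea concrete, here is a toy sketch: a sigmoid gate is computed per channel from the features themselves and used to scale a feature group before it is passed to the next one. The single-linear-layer gating network and list-based tensors below are simplifying assumptions for illustration; the CG-Res2Net paper's actual gating module operates on convolutional feature maps and may be structured differently.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def channel_gate(group_features, gate_weights):
    """Channel-wise gating on a feature-group vector.

    Each gate g_c = sigmoid(w_c . x) is computed from the input features
    via a (hypothetical) learned linear layer, then scales channel c:
    out_c = g_c * x_c. Gates in (0, 1) let the network suppress or pass
    each channel of the group fed into the next feature group.
    """
    gates = [sigmoid(sum(w * x for w, x in zip(w_row, group_features)))
             for w_row in gate_weights]
    return [g * x for g, x in zip(gates, group_features)]
```

With all-zero gating weights every gate is exactly 0.5, i.e., the group is passed on at half magnitude; training would move each gate toward 0 (block) or 1 (pass) per channel.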