no code implementations • 15 Jan 2025 • Yuxuan Hu, Jing Zhang, Xiaodong Chen, Zhe Zhao, Cuiping Li, Hong Chen
Existing low-rank adaptation (LoRA) methods face challenges on sparse large language models (LLMs) due to the inability to maintain sparsity.
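To make the sparsity conflict concrete, here is a minimal, hypothetical sketch (not this paper's method): a standard dense LoRA merge fills in the zeros of a pruned weight matrix, whereas masking the low-rank update onto the existing sparsity pattern preserves it. The function names and the masking strategy are illustrative assumptions.

```python
import torch

def lora_merge_dense(W, A, B):
    # Standard LoRA merge: the low-rank update B @ A is dense, so the zeros
    # of a pruned (sparse) weight matrix W get filled in and sparsity is lost.
    return W + B @ A

def lora_merge_masked(W, A, B):
    # One illustrative remedy: project the update onto W's sparsity pattern
    # so pruned positions stay exactly zero after merging.
    mask = (W != 0).to(W.dtype)
    return W + (B @ A) * mask

# Toy 50%-sparse weight matrix with a rank-2 adapter.
W = torch.randn(8, 8) * (torch.rand(8, 8) > 0.5)
A, B = torch.randn(2, 8), torch.randn(8, 2)
print("dense merge, fraction of zeros: ", (lora_merge_dense(W, A, B) == 0).float().mean().item())
print("masked merge, fraction of zeros:", (lora_merge_masked(W, A, B) == 0).float().mean().item())
```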
1 code implementation • 20 Dec 2024 • Wenxi Chen, Ziyang Ma, Ruiqi Yan, Yuzhe Liang, Xiquan Li, Ruiyang Xu, Zhikang Niu, Yanqiao Zhu, Yifan Yang, Zhanxun Liu, Kai Yu, Yuxuan Hu, Jinyu Li, Yan Lu, Shujie Liu, Xie Chen
Recent advancements highlight the potential of end-to-end real-time spoken dialogue systems, showcasing their low latency and high quality.
no code implementations • 20 Dec 2024 • Yifan Yang, Ziyang Ma, Shujie Liu, Jinyu Li, Hui Wang, Lingwei Meng, Haiyang Sun, Yuzhe Liang, Ruiyang Xu, Yuxuan Hu, Yan Lu, Rui Zhao, Xie Chen
This paper introduces Interleaved Speech-Text Language Model (IST-LM) for streaming zero-shot Text-to-Speech (TTS).
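As a rough illustration of what an interleaved speech-text stream looks like for streaming TTS (the chunk sizes, token names, and the 1:3 ratio below are assumptions for illustration, not IST-LM's actual configuration):

```python
def interleave(text_tokens, speech_tokens, text_chunk=1, speech_chunk=3):
    # Interleave fixed-size chunks of text and speech tokens into one stream,
    # so the LM can start emitting speech before the full text has arrived.
    out, t, s = [], 0, 0
    while t < len(text_tokens) or s < len(speech_tokens):
        out.extend(text_tokens[t:t + text_chunk]); t += text_chunk
        out.extend(speech_tokens[s:s + speech_chunk]); s += speech_chunk
    return out

print(interleave(["T0", "T1", "T2"], ["S0", "S1", "S2", "S3", "S4", "S5"]))
# ['T0', 'S0', 'S1', 'S2', 'T1', 'S3', 'S4', 'S5', 'T2']
```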
no code implementations • 2 Dec 2024 • Ruchao Fan, Bo Ren, Yuxuan Hu, Rui Zhao, Shujie Liu, Jinyu Li
AlignFormer achieves near-100% IFR with audio-first training, and with instruction-first training it improves IFR from zero to non-zero on some evaluation data.

1 code implementation • 16 Nov 2024 • Yuxuan Hu, Ke Wang, Xiaokang Zhang, Fanjin Zhang, Cuiping Li, Hong Chen, Jing Zhang
Speculative decoding (SD) has been demonstrated as an effective technique for lossless LLM inference acceleration.
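For context, a minimal sketch of the generic greedy draft-then-verify loop that speculative decoding methods build on, assuming Hugging Face-style causal LMs that return `.logits`; the acceptance rule shown is the simple greedy variant, not this paper's specific contribution.

```python
import torch

@torch.no_grad()
def speculative_step(draft_model, target_model, prefix, k=4):
    """One draft-then-verify step: the small draft model proposes k tokens
    greedily, the large target model scores them in a single forward pass,
    and the longest agreeing prefix is kept (greedy variant for clarity)."""
    draft = prefix.clone()
    for _ in range(k):
        next_tok = draft_model(draft).logits[:, -1].argmax(-1, keepdim=True)
        draft = torch.cat([draft, next_tok], dim=-1)

    target_logits = target_model(draft).logits  # verify all k proposals at once
    accepted = prefix
    for i in range(prefix.shape[1], draft.shape[1]):
        target_tok = target_logits[:, i - 1].argmax(-1, keepdim=True)
        if not torch.equal(target_tok, draft[:, i:i + 1]):
            return torch.cat([accepted, target_tok], dim=-1)  # correct and stop
        accepted = torch.cat([accepted, draft[:, i:i + 1]], dim=-1)
    return accepted
```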
no code implementations • 15 Nov 2024 • Xiaodong Chen, Yuxuan Hu, Xiaokang Zhang, Yanling Wang, Cuiping Li, Hong Chen, Jing Zhang
Pruning has become a widely adopted technique for reducing the hardware requirements of large language models (LLMs).
no code implementations • 20 Sep 2024 • Yuxuan Hu, Chenwei Zhang, Min Yang, Xiaodan Liang, Chengming Li, Xiping Hu
In this paper, we study multi-source domain generalization for text classification and propose a framework that uses multiple seen domains to train a model that can achieve high accuracy on an unseen domain.
1 code implementation • 23 Jul 2024 • Yuxuan Hu, Minghuan Tan, Chenwei Zhang, Zixuan Li, Xiaodan Liang, Min Yang, Chengming Li, Xiping Hu
By incorporating emotional support strategies, we aim to enrich the model's capabilities in both cognitive and affective empathy, leading to a more nuanced and comprehensive empathetic response.
no code implementations • 7 Apr 2024 • Mengyan Wang, Yuxuan Hu, Shiqing Wu, Weihua Li, Quan Bai, Verica Rupar
While preference-based recommendation algorithms effectively enhance user engagement by recommending personalized content, they often result in the creation of "filter bubbles".
1 code implementation • 28 Mar 2024 • Xiaodong Chen, Yuxuan Hu, Jing Zhang, Yanling Wang, Cuiping Li, Hong Chen
This paper introduces LLM-Streamline, a pioneering work on layer pruning for large language models (LLMs).
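Mechanically, layer pruning amounts to removing a contiguous block of transformer layers. Which block to remove, and the lightweight module LLM-Streamline trains to replace it, are the paper's contribution and are not shown in this hypothetical sketch (which assumes a Hugging Face-style layer list):

```python
import torch.nn as nn

def drop_layer_block(layers: nn.ModuleList, start: int, count: int) -> nn.ModuleList:
    """Remove `count` consecutive transformer layers starting at `start`.
    Selecting the block (and training a small replacement module, as
    LLM-Streamline does) is outside this structural sketch."""
    kept = [layer for i, layer in enumerate(layers) if not (start <= i < start + count)]
    return nn.ModuleList(kept)

# Toy example with placeholder layers.
layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(12)])
pruned = drop_layer_block(layers, start=6, count=4)
print(len(layers), "->", len(pruned))  # 12 -> 8
```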
2 code implementations • CVPR 2024 • Hao Shao, Yuxuan Hu, Letian Wang, Steven L. Waslander, Yu Liu, Hongsheng Li
On the other hand, previous autonomous driving methods tend to rely on limited-format inputs (e.g., sensor data and navigation waypoints), restricting the vehicle's ability to understand language information and interact with humans.
1 code implementation • 31 Aug 2023 • Yuxuan Hu, Jing Zhang, Zhe Zhao, Chen Zhao, Xiaodong Chen, Cuiping Li, Hong Chen
Structured pruning is a widely used technique for reducing the size of pre-trained language models (PLMs). However, current methods often overlook the potential of compressing the hidden dimension (d) in PLMs, a dimension critical to model size and efficiency.
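As a structural illustration only: compressing a dimension means selecting a subset of its coordinates and slicing every matrix that touches it consistently. The sketch below slices a single pair of matrices that share one dimension, using an L2-norm importance score as a placeholder; the paper's actual criterion, and the consistent treatment of the model-wide hidden dimension d across embeddings, layers, and residual connections, are not shown.

```python
import torch

def slice_shared_dim(W_in, W_out, keep_ratio=0.75):
    """Prune coordinates of the dimension shared by W_in (rows) and W_out (cols).
    The L2-norm importance score is an illustrative placeholder."""
    importance = W_in.norm(dim=1) + W_out.norm(dim=0)
    n_keep = int(W_in.shape[0] * keep_ratio)
    keep = importance.topk(n_keep).indices.sort().values
    return W_in[keep, :], W_out[:, keep]

W_in, W_out = torch.randn(3072, 768), torch.randn(768, 3072)
W_in_p, W_out_p = slice_shared_dim(W_in, W_out)
print(W_in_p.shape, W_out_p.shape)  # torch.Size([2304, 768]) torch.Size([768, 2304])
```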
no code implementations • 10 Aug 2023 • Lianli Gao, Xinyu Lyu, Yuyu Guo, Yuxuan Hu, Yuan-Fang Li, Lu Xu, Heng Tao Shen, Jingkuan Song
It integrates two components, Semantic Debiasing (SD) and Balanced Predicate Learning (BPL), to address these imbalances.
1 code implementation • 17 Jul 2023 • Shaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong, Ed Lin, Michael Zeng
Most end-to-end (E2E) speech recognition models are composed of encoder and decoder blocks that perform acoustic and language modeling functions.
no code implementations • 6 Jul 2023 • Mengyan Wang, Yuxuan Hu, Zihan Yuan, Chenting Jiang, Weihua Li, Shiqing Wu, Quan Bai
This approach endeavors to transcend the constraints of the filter bubble, enrich recommendation diversity, and strike a belief balance among users while also catering to user preferences and system-specific business requirements.
no code implementations • 22 Mar 2023 • Yuxuan Hu, Albert Lui, Mark Goldstein, Mukund Sudarshan, Andrea Tinsay, Cindy Tsui, Samuel Maidman, John Medamana, Neil Jethani, Aahlad Puli, Vuthy Nguy, Yindalon Aphinyanaphongs, Nicholas Kiefer, Nathaniel Smilowitz, James Horowitz, Tania Ahuja, Glenn I Fishman, Judith Hochman, Stuart Katz, Samuel Bernard, Rajesh Ranganath
We developed a deep learning-based risk stratification tool, called CShock, for patients admitted to the cardiac ICU with acute decompensated heart failure and/or myocardial infarction, to predict the onset of cardiogenic shock.
no code implementations • 20 Mar 2023 • Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng
Code-switching speech refers to a means of expression by mixing two or more languages within a single utterance.
no code implementations • 1 Mar 2023 • Eric Sun, Jinyu Li, Yuxuan Hu, Yimeng Zhu, Long Zhou, Jian Xue, Peidong Wang, Linquan Liu, Shujie Liu, Edward Lin, Yifan Gong
We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring language identification (LID) input from users during inference.
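As a generic illustration of gating over per-language experts (the plain softmax gate and feed-forward experts below are assumptions for illustration; the paper's specific gating mechanism and curriculum design are not reproduced):

```python
import torch
import torch.nn as nn

class GatedLanguageExperts(nn.Module):
    """Toy mixture of per-language feed-forward experts with a learned gate,
    so no explicit language-ID input is needed at inference time."""
    def __init__(self, d_model=256, n_langs=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_langs))
        self.gate = nn.Linear(d_model, n_langs)

    def forward(self, x):  # x: (batch, time, d_model)
        weights = self.gate(x).softmax(dim=-1)                          # (batch, time, n_langs)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (batch, time, d_model, n_langs)
        return (expert_out * weights.unsqueeze(-2)).sum(dim=-1)

x = torch.randn(2, 50, 256)
print(GatedLanguageExperts()(x).shape)  # torch.Size([2, 50, 256])
```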
no code implementations • 22 Jul 2022 • Yuxuan Hu, Zhaoyang Xia, Yanbo Zhao, Feng Xu
Generalization across different scenarios and different users is an urgent problem for millimeter-wave gesture recognition in the indoor fiber-to-the-room (FTTR) scenario.
1 code implementation • ICCV 2021 • Yuyu Guo, Lianli Gao, Xuanhan Wang, Yuxuan Hu, Xing Xu, Xu Lu, Heng Tao Shen, Jingkuan Song
The scene graph generation (SGG) task aims to detect visual relationship triplets, i.e., <subject, predicate, object>, in an image, providing a structural vision layout for scene understanding.
no code implementations • 14 Apr 2021 • Weihua Li, Yuxuan Hu, Shiqing Wu, Quan Bai, Edmund Lai
A key step in influence maximization in online social networks is the identification of a small number of users, known as influencers, who are able to spread influence quickly and widely to other users.
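For background, here is the classic greedy seed-selection loop that influencer identification builds on; the one-hop reach used as a spread estimate below is a toy placeholder, not the influence model studied in the paper.

```python
def greedy_influencers(graph, k, spread):
    """Classic greedy seed selection: repeatedly add the node with the largest
    marginal gain in estimated spread under the given `spread` estimator."""
    seeds = set()
    for _ in range(k):
        best = max((n for n in graph if n not in seeds),
                   key=lambda n: spread(graph, seeds | {n}))
        seeds.add(best)
    return seeds

def one_hop_reach(graph, seeds):
    # Toy spread estimate: the seeds plus their direct neighbours.
    reached = set(seeds)
    for s in seeds:
        reached.update(graph[s])
    return len(reached)

g = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}
print(sorted(greedy_influencers(g, 2, one_hop_reach)))  # ['a', 'c']
```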
1 code implementation • 22 Oct 2019 • Dongwei Jiang, Xiaoning Lei, Wubo Li, Ne Luo, Yuxuan Hu, Wei Zou, Xiangang Li
Speech recognition technologies are gaining enormous popularity in various industrial applications.
no code implementations • 13 May 2019 • Xintian Han, Yuxuan Hu, Luca Foschini, Larry Chinitz, Lior Jankelson, Rajesh Ranganath
For this model, we utilized a new technique to generate smoothed examples to produce signals that are 1) indistinguishable to cardiologists from the original examples and 2) incorrectly classified by the neural network.
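One way to read "smoothed examples" is that the adversarial perturbation is kept low-frequency so the attacked ECG still looks physiological. The moving-average filter and signed-gradient step below are illustrative assumptions, not the paper's exact construction, and `loss_grad` is a hypothetical callback returning the gradient of the classifier's loss with respect to the input.

```python
import numpy as np

def smooth(x, width=25):
    # Moving-average low-pass filter keeps the perturbation slowly varying.
    return np.convolve(x, np.ones(width) / width, mode="same")

def smoothed_adversarial_example(signal, loss_grad, eps=0.1, steps=10):
    """Signed-gradient steps with the perturbation re-smoothed at every step,
    so no high-frequency attack artifacts appear in the final signal."""
    delta = np.zeros_like(signal)
    for _ in range(steps):
        delta = delta + (eps / steps) * np.sign(loss_grad(signal + delta))
        delta = np.clip(smooth(delta), -eps, eps)
    return signal + delta
```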