no code implementations • 24 Dec 2024 • Wen Wen, Qiang Zhou, Yu Xi, Haoyu Li, Ziqi Gong, Kai Yu
In multi-speaker scenarios, leveraging spatial features is essential for enhancing target speech.
no code implementations • 17 Dec 2024 • Yu Xi, Haoyu Li, Hao Li, Jiaqi Guo, Xu Li, Wen Ding, Kai Yu
In recent years, there has been a growing interest in designing small-footprint yet effective Connectionist Temporal Classification based keyword spotting (CTC-KWS) systems.
no code implementations • 17 Dec 2024 • Yu Xi, Haoyu Li, Xiaoyu Gu, Hao Li, Yidi Jiang, Kai Yu
Connectionist Temporal Classification (CTC), a non-autoregressive training criterion, is widely used in online keyword spotting (KWS).
Automatic Speech Recognition (ASR)
no code implementations • 11 Dec 2024 • Zitong Chen, Chao Sun, Shida Nie, Chen Min, Changjiu Ning, Haoyu Li, Bo Wang
Off-road environments present significant challenges for autonomous ground vehicles due to the absence of structured roads and the presence of complex obstacles, such as uneven terrain, vegetation, and occlusions.
no code implementations • 3 Dec 2024 • Derek Xu, Tong Xie, Botao Xia, Haoyu Li, Yunsheng Bai, Yizhou Sun, Wei Wang
This work focuses on the few-shot examples present in most code generation prompts, offering a systematic study of whether few-shot examples improve LLMs' coding capabilities, which few-shot examples have the largest impact, and how to select impactful examples.
2 code implementations • 3 Aug 2024 • Yuan Yao, Tianyu Yu, Ao Zhang, Chongyi Wang, Junbo Cui, Hongji Zhu, Tianchi Cai, Haoyu Li, Weilin Zhao, Zhihui He, Qianyu Chen, Huarong Zhou, Zhensheng Zou, Haoye Zhang, Shengding Hu, Zhi Zheng, Jie Zhou, Jie Cai, Xu Han, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun
The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally reshaped the landscape of AI research and industry, shedding light on a promising path toward the next AI milestone.
Ranked #7 on Multiple-choice on Neptune-Full
1 code implementation • 18 Jun 2024 • Haoyu Li, Baochen Yang, Yu Xi, Linfeng Yu, Tian Tan, Hao Li, Kai Yu
TPDT-SS shows remarkable success in addressing permutation problems in mixed keyword speech, thereby greatly boosting the performance of the backend.
no code implementations • 20 Mar 2024 • Yu Xi, Hao Li, Baochen Yang, Haoyu Li, Hainan Xu, Kai Yu
Designing an efficient keyword spotting (KWS) system that delivers exceptional performance on resource-constrained edge devices has long been a subject of significant attention.
no code implementations • 21 Feb 2024 • Haoyu Li, Hao Wu, Badong Chen
Reconstructing visual stimuli from functional Magnetic Resonance Imaging (fMRI) enables fine-grained retrieval of brain activity.
no code implementations • 21 Feb 2024 • Haoyu Li, Han-Wei Shen
In this paper, we present an improved technique for range analysis by revisiting the arithmetic rules and analyzing the probability distribution of the network output within a spatial region.
1 code implementation • 12 Feb 2024 • Haoyu Li, Yuchen Xu, Jiayi Chen, Rohit Dwivedula, Wenfei Wu, Keqiang He, Aditya Akella, Daehyeok Kim
As deep neural networks (DNNs) grow in complexity and size, the resultant increase in communication overhead during distributed training has become a significant bottleneck, challenging the scalability of distributed training systems.
no code implementations • 7 Feb 2024 • Mengqi Chen, Bin Guo, Hao Wang, Haoyu Li, Qian Zhao, Jingqi Liu, Yasan Ding, Yan Pan, Zhiwen Yu
To depict the research trends of CogAgent, in this paper, we first present several fundamental cognitive psychology theories and give the formalized definition of three typical cognitive strategies, including the persuasion strategy, the topic path planning strategy, and the argument structure prediction strategy.
1 code implementation • 17 Jan 2024 • Tong Xie, Haoyu Li, Andrew Bai, Cho-Jui Hsieh
Data attribution methods trace model behavior back to its training dataset, offering an effective approach to better understand "black-box" neural networks.
1 code implementation • 8 Dec 2023 • Haoyu Li, Shichang Zhang, Longwen Tang, Mathieu Bauchy, Yizhou Sun
Metallic Glasses (MGs) are widely used materials that are stronger than steel while being shapeable as plastic.
1 code implementation • 26 Oct 2022 • Haoyu Li, Xuan Wang, Tong Liu, Dingyi Fang, Baoying Liu
In this paper, we propose the use of a Hidden Markov Model (HMM) for the reconstruction of convolutional codes and decoding by the Viterbi algorithm.
no code implementations • 5 Aug 2022 • Jingyi Shen, Haoyu Li, Jiayi Xu, Ayan Biswas, Han-Wei Shen
We qualitatively and quantitatively evaluate the effectiveness and efficiency of latent representations generated by our method with data from multiple scientific visualization applications.
1 code implementation • 25 Jul 2022 • Neng Shi, Jiayi Xu, Haoyu Li, Hanqi Guo, Jonathan Woodring, Han-Wei Shen
In the model inference stage, we predict the latent representations at previously selected viewpoints and decode the latent representations to data space.
no code implementations • 22 Mar 2022 • Haoyu Li, Yun Liu, Junichi Yamagishi
Speech enhancement (SE) methods mainly focus on recovering clean speech from noisy input.
no code implementations • 16 Sep 2021 • Haoyu Li, Junichi Yamagishi
A large and growing amount of speech content in real-life scenarios is being recorded on consumer-grade devices in uncontrolled environments, resulting in degraded speech quality.
1 code implementation • 27 May 2021 • Haoyu Li, Han-Wei Shen
Feature-related particle data analysis plays an important role in many scientific applications such as fluid simulations, cosmology simulations, and molecular dynamics.
1 code implementation • 17 Apr 2021 • Haoyu Li, Junichi Yamagishi
The intelligibility of speech severely degrades in the presence of environmental noise and reverberation.
no code implementations • 8 Apr 2020 • Haoyu Li, Junichi Yamagishi
In recent years, speech enhancement (SE) has achieved impressive progress with the success of deep neural networks (DNNs).
Audio and Speech Processing
1 code implementation • Interspeech 2020 • Haoyu Li, Szu-Wei Fu, Yu Tsao, Junichi Yamagishi
The intelligibility of natural speech is seriously degraded when exposed to adverse noisy environments.
Audio and Speech Processing • Sound
no code implementations • 19 Apr 2019 • Subhashis Hazarika, Haoyu Li, Ko-Chih Wang, Han-Wei Shen, Ching-Shan Chou
We utilize the trained network to perform interactive parameter sensitivity analysis of the original simulation at multiple levels-of-detail as well as recommend optimal parameter configurations using the activation maximization framework of neural networks.
2 code implementations • 19 Sep 2018 • Weisong Zhao, Jian Liu, Chenqi Kong, Yixuan Zhao, Changliang Guo, Chen-Guang Liu, Xiangyan Ding, Xumin Ding, Jiubin Tan, Haoyu Li
Although super-resolution fluorescence blinking microscopes break the diffraction limit, the intense phototoxic illumination and long image sequences they require still pose major challenges in visualizing live organisms.
Optics
no code implementations • 26 Oct 2016 • Haoyu Li, Changliang Guo, Inbarasan Muniraj, Bryce C. Schroeder, John T. Sheridan, Shu Jia
We report a light-field based method that allows the optical encryption of three-dimensional (3D) volumetric information at the microscopic scale in a single 2D light-field image.