no code implementations • 18 Oct 2024 • Zhiyuan Peng, Jinming Nian, Alexandre Evfimievski, Yi Fang
Conversational AI agents use Retrieval Augmented Generation (RAG) to provide verifiable document-grounded responses to user inquiries.
1 code implementation • 25 Sep 2024 • Ethan Lin, Zhiyuan Peng, Yi Fang
Recent studies have evaluated the creativity/novelty of large language models (LLMs) primarily from a semantic perspective, using benchmarks from cognitive science.
no code implementations • 14 Sep 2024 • Hao Ma, Zhiyuan Peng, Xu Li, Yukai Li, Mingjie Shao, Qiuqiang Kong, Ju Liu
In a vanilla parallel-data-free training stage, target audio is encoded using the pre-trained CLAP audio encoder to form a condition embedding, while during testing, user language queries are encoded by CLAP text encoder as the condition embedding.
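The conditioning swap described above can be sketched as follows. `encode_audio` and `encode_text` are hypothetical placeholders standing in for the pre-trained CLAP audio and text encoders (they are not the actual CLAP API); the point is that training conditions on the target audio's embedding while inference conditions on the query's embedding, both in the same joint space.

```python
import numpy as np

EMBED_DIM = 4  # toy dimension; CLAP uses a much larger joint embedding space

def encode_audio(audio: np.ndarray) -> np.ndarray:
    """Placeholder for the pre-trained CLAP audio encoder (hypothetical)."""
    rng = np.random.default_rng(abs(hash(audio.tobytes())) % (2**32))
    return rng.standard_normal(EMBED_DIM)

def encode_text(query: str) -> np.ndarray:
    """Placeholder for the pre-trained CLAP text encoder (hypothetical)."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    return rng.standard_normal(EMBED_DIM)

def condition_embedding(training: bool, target_audio=None, query=None) -> np.ndarray:
    """Parallel-data-free trick: condition on the target audio's embedding
    at training time and on the user query's embedding at test time --
    both live in the same joint audio-text embedding space."""
    if training:
        return encode_audio(target_audio)
    return encode_text(query)

# Training needs only (mixture, target) audio pairs -- no text labels.
train_cond = condition_embedding(training=True, target_audio=np.zeros(16))
# At inference, a free-form language query drives the extraction.
test_cond = condition_embedding(training=False, query="a dog barking")
```

Because both encoders map into one shared space, no paired (audio, caption) data is needed during training.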
1 code implementation • 15 Aug 2024 • Jinming Nian, Zhiyuan Peng, Qifan Wang, Yi Fang
In knowledge-intensive tasks such as open-domain question answering (OpenQA), Large Language Models (LLMs) often struggle to generate factual answers relying solely on their internal (parametric) knowledge.
no code implementations • 6 Jul 2024 • Zhiyuan Peng, Yuanbo Tang, Yang Li
DNA sequences encode vital genetic and biological information, yet their variable lengths prevent them from serving directly as input to common data mining algorithms.
no code implementations • 29 Jun 2024 • Zifan Zhang, Yuchen Liu, Zhiyuan Peng, Mingzhe Chen, Dongkuan Xu, Shuguang Cui
To bridge this gap, we introduce a novel digital twin-assisted optimization framework, called D-REC, which integrates reinforcement learning (RL) with diverse intervention modules to ensure reliable caching in nextG wireless networks.
1 code implementation • 31 May 2024 • Xuyang Wu, Zhiyuan Peng, Krishna Sravanthi Rajanala Sai, Hsin-Tai Wu, Yi Fang
In this paper, we propose passage-specific prompt tuning for reranking in open-domain question answering (PSPT): a parameter-efficient method that fine-tunes learnable passage-specific soft prompts, incorporating passage-specific knowledge from a limited set of question-passage relevance pairs.
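A minimal sketch of the passage-specific soft-prompt idea, under assumptions: the shapes, the projection-based conditioning, and all names here (`base_prompt`, `proj`, `passage_specific_prompt`) are illustrative, not the paper's actual parameterization. The key property shown is parameter efficiency: only the small prompt parameters would be trained, while the LLM stays frozen.

```python
import numpy as np

PROMPT_LEN, DIM = 3, 8  # toy sizes; real soft prompts live in the LLM's embedding space

def passage_specific_prompt(passage_emb: np.ndarray,
                            base_prompt: np.ndarray,
                            proj: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: a shared learnable soft prompt is shifted by a
    projection of the passage embedding, so each passage gets its own prompt.
    Only base_prompt and proj are trainable; the reranking LLM is frozen."""
    shift = passage_emb @ proj          # (DIM,) -- passage-specific offset
    return base_prompt + shift          # broadcast to (PROMPT_LEN, DIM)

rng = np.random.default_rng(0)
base_prompt = rng.standard_normal((PROMPT_LEN, DIM))  # learnable
proj = rng.standard_normal((DIM, DIM))                # learnable
passage_emb = rng.standard_normal(DIM)                # from the frozen encoder

prompt = passage_specific_prompt(passage_emb, base_prompt, proj)
# The prompt is prepended to the passage's token embeddings before the
# frozen LLM scores question-passage relevance.
```

The trainable parameter count here is `PROMPT_LEN * DIM + DIM * DIM`, tiny relative to the LLM, which is what makes the method parameter-efficient.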
no code implementations • 6 Apr 2024 • Zhiyuan Peng, Xuyang Wu, Qifan Wang, Sravanthi Rajanala, Yi Fang
Parameter-Efficient Fine-Tuning (PEFT) methods have been widely applied to Large Language Models (LLMs) to improve performance on downstream tasks without the cost of fine-tuning the entire model.

no code implementations • 29 Feb 2024 • Xukun Liu, Zhiyuan Peng, Xiaoyuan Yi, Xing Xie, Lirong Xiang, Yuchen Liu, Dongkuan Xu
While achieving remarkable progress in a broad range of tasks, large language models (LLMs) remain significantly limited in properly using massive external tools.
1 code implementation • 27 Feb 2024 • Hao Ma, Zhiyuan Peng, Xu Li, Mingjie Shao, Xixin Wu, Ju Liu
Universal sound separation (USS) aims to extract arbitrary types of sounds from real-world recordings.
Ranked #1 on Target Sound Extraction on AudioSet
1 code implementation • 9 Feb 2024 • Zihan Dong, Xinyu Fan, Zhiyuan Peng
Financial market predictions utilize historical data to anticipate future stock prices and market trends.
no code implementations • 13 Dec 2023 • Yuanbo Tang, Zhiyuan Peng, Yang Li
A hierarchical dictionary learning scheme is also proposed to ensure the algorithm's scalability on large networks, leading to a multi-scale trajectory representation.
1 code implementation • 13 Dec 2023 • Hao Ma, Zhiyuan Peng, Mingjie Shao, Jing Li, Ju Liu
Target-speaker automatic speech recognition (ASR) aims to transcribe the desired speech of a target speaker from multi-talker overlapped utterances.
Automatic Speech Recognition (ASR) +2
no code implementations • 21 Sep 2023 • Wei Liu, Ying Qin, Zhiyuan Peng, Tan Lee
Child speech, as a representative type of low-resource speech, is leveraged for adaptation.
Automatic Speech Recognition (ASR) +3
1 code implementation • 21 Sep 2023 • Wei Liu, Zhiyuan Peng, Tan Lee
The search process is carried out in two steps: (1) coarse search: to determine top $K$ candidates by pruning the most redundant layers based on the correlation matrix; (2) fine search: to select the best pruning proposal among $K$ candidates using a task-specific evaluation metric.
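The two-step search described above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: here "redundancy" of a layer is approximated by how strongly its output correlates with the previous layer's output, and the task-specific evaluation metric is a stand-in function.

```python
import numpy as np

def coarse_search(layer_outputs, k):
    """Coarse step: rank layers by redundancy (here: absolute correlation of a
    layer's output with the previous layer's) and return the top-k candidate
    layers to prune."""
    redundancy = []
    for i in range(1, len(layer_outputs)):
        a = layer_outputs[i - 1].ravel()
        b = layer_outputs[i].ravel()
        redundancy.append((abs(np.corrcoef(a, b)[0, 1]), i))
    redundancy.sort(reverse=True)          # most redundant first
    return [idx for _, idx in redundancy[:k]]

def fine_search(candidates, evaluate):
    """Fine step: select the best pruning proposal among the k candidates
    using a task-specific evaluation metric."""
    return max(candidates, key=evaluate)

# Toy usage: layer 2's output nearly duplicates layer 1's, so it should
# surface as the most redundant candidate.
rng = np.random.default_rng(0)
layers = [rng.standard_normal(10) for _ in range(4)]
layers[2] = layers[1] + 0.01 * rng.standard_normal(10)
candidates = coarse_search(layers, k=2)
best = fine_search(candidates, evaluate=lambda i: {1: 0.8, 2: 0.9, 3: 0.7}.get(i, 0.0))
```

The coarse step cheaply narrows the search space; only the surviving k proposals pay for the (more expensive) task-specific evaluation.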
1 code implementation • 8 Aug 2023 • Binfeng Xu, Xukun Liu, Hua Shen, Zeyu Han, Yuhan Li, Murong Yue, Zhiyuan Peng, Yuchen Liu, Ziyu Yao, Dongkuan Xu
We present gentopia, an ALM framework enabling flexible customization of agents through simple configurations, seamlessly integrating various language models, task formats, prompting modules, and plugins into a unified paradigm.
1 code implementation • 17 Jul 2023 • Zhiyuan Peng, Xuyang Wu, Qifan Wang, Yi Fang
We design a filter to select high-quality example document-query pairs for the prompt, further improving the quality of the weakly tagged queries.
2 code implementations • 23 May 2023 • Binfeng Xu, Zhiyuan Peng, Bowen Lei, Subhabrata Mukherjee, Yuchen Liu, Dongkuan Xu
Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution.
no code implementations • 21 Mar 2022 • Xiaoxiao Shang, Zhiyuan Peng, Qiming Yuan, Sabiq Khan, Lauren Xie, Yi Fang, Subramaniam Vincent
Professional news media organizations have long touted the importance they place on presenting multiple perspectives.
no code implementations • 4 Oct 2021 • Ying Qin, Wei Liu, Zhiyuan Peng, Si-Ioi Ng, Jingyu Li, Haibo Hu, Tan Lee
Input to these classifiers are speech transcripts produced by automatic speech recognition (ASR) models.
Automatic Speech Recognition (ASR) +1
no code implementations • 30 Oct 2019 • Zhiyuan Peng, Siyuan Feng, Tan Lee
The USM experiments on ZeroSpeech 2017 dataset verify that the frame tokenizer is able to capture linguistic content and the utterance embedder can acquire speaker-related information.
no code implementations • 17 Jun 2019 • Siyuan Feng, Tan Lee, Zhiyuan Peng
Experimental results on ZeroSpeech 2017 show that both approaches are effective while the latter is more prominent, and that their combination brings further marginal improvement in across-speaker condition.