1 code implementation • 9 Jan 2025 • Zixuan Ke, Yifei Ming, Xuan-Phi Nguyen, Caiming Xiong, Shafiq Joty
Domain-adaptive post-training of large language models (LLMs) has emerged as a promising approach for specialized domains such as medicine and finance.
no code implementations • 20 Dec 2024 • Jiabao Qiu, Zixuan Ke, Bing Liu
We introduce CLOB, a novel continual learning (CL) paradigm wherein a large language model (LLM) is regarded as a black box.
1 code implementation • 20 Dec 2024 • Saleh Momeni, Sahisnu Mazumder, Zixuan Ke, Bing Liu
However, incrementally learning each new task in ICL necessitates adding training examples from each class of the task to the prompt, which hampers scalability as the prompt length increases.
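As a rough illustration (a hypothetical prompt builder, not the paper's code), the sketch below shows why the prompt grows with every task learned in context:

```python
# Hypothetical sketch of in-context continual learning: every new task adds
# a few labeled examples per class to the prompt, so the prompt keeps growing.
from typing import List, Tuple

def build_icl_prompt(task_examples: List[List[Tuple[str, str]]], query: str) -> str:
    """task_examples[i] holds (text, label) demonstrations for task i, a few per class."""
    lines = []
    for demos in task_examples:          # one block of demonstrations per learned task
        for text, label in demos:
            lines.append(f"Input: {text}\nLabel: {label}")
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

# After T tasks with C classes and k shots each, the prompt contains roughly
# T * C * k demonstrations -- this linear growth is what hampers scalability.
```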
1 code implementation • 30 Sep 2024 • Yifei Ming, Senthil Purushwalkam, Shrey Pandit, Zixuan Ke, Xuan-Phi Nguyen, Caiming Xiong, Shafiq Joty
Ensuring faithfulness to context in large language models (LLMs) and retrieval-augmented generation (RAG) systems is crucial for reliable deployment in real-world applications, as incorrect or unsupported information can erode user trust.
no code implementations • 16 Sep 2024 • Xuan-Phi Nguyen, Shrey Pandit, Senthil Purushwalkam, Austin Xu, Hailin Chen, Yifei Ming, Zixuan Ke, Silvio Savarese, Caiming Xiong, Shafiq Joty
Retrieval Augmented Generation (RAG), a paradigm that integrates external contextual information with large language models (LLMs) to enhance factual accuracy and relevance, has emerged as a pivotal area in generative AI.
no code implementations • 13 Jan 2024 • Zixuan Ke, Weize Kong, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky
Large Language Models (LLMs) have demonstrated superior results across a wide range of tasks, and Retrieval-Augmented Generation (RAG) is an effective way to enhance performance by locating relevant information and placing it in the context window of the LLM.
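A minimal sketch of this retrieve-then-generate pattern, assuming generic `retriever` and `llm` interfaces rather than any specific library:

```python
# Minimal RAG sketch (hypothetical retriever/generator interfaces, not the paper's code):
# retrieve relevant passages, place them in the LLM's context window, then generate.

def rag_answer(question: str, retriever, llm, top_k: int = 5) -> str:
    passages = retriever.search(question, k=top_k)           # locate relevant information
    context = "\n\n".join(p.text for p in passages)          # fill the context window
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm.generate(prompt)
```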
1 code implementation • 13 Oct 2023 • Zixuan Ke, Bing Liu, Wenhan Xiong, Asli Celikyilmaz, Haoran Li
To our knowledge, only one method has been proposed to learn a sequence of mixed tasks.
1 code implementation • 26 Jun 2023 • Tatsuya Konishi, Mori Kurokawa, Chihiro Ono, Zixuan Ke, Gyuhak Kim, Bing Liu
Although several techniques achieve learning with no CF, they do so by letting each task monopolize a sub-network within a shared network, which severely limits knowledge transfer (KT) and over-consumes network capacity, i.e., performance deteriorates as more tasks are learned.
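The parameter-isolation idea criticized here can be sketched as a hard binary mask per task over a shared layer (illustrative names, not any particular paper's implementation):

```python
import torch
import torch.nn as nn

# Illustrative sketch of parameter isolation: each task owns a fixed binary mask
# over a shared layer, so its units are effectively monopolized by that task.
class MaskedLinear(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, num_tasks: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        # one fixed binary mask over output units per task (assigned/learned elsewhere)
        self.register_buffer("masks", torch.zeros(num_tasks, out_dim))

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        return self.linear(x) * self.masks[task_id]   # only the task's sub-network fires

# As more tasks claim units, fewer free units remain, so capacity is over-consumed
# and performance on later tasks deteriorates.
```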
no code implementations • 20 Apr 2023 • Gyuhak Kim, Changnan Xiao, Tatsuya Konishi, Zixuan Ke, Bing Liu
The paper then proves that the theory can be generalized to open-world CIL, i.e., the proposed open-world continual learning, which performs CIL in the open world and detects future or open-world OOD data.
2 code implementations • 7 Feb 2023 • Zixuan Ke, Yijia Shao, Haowei Lin, Tatsuya Konishi, Gyuhak Kim, Bing Liu
A novel proxy is also proposed to preserve the general knowledge in the original LM.
Ranked #1 on Continual Pretraining on ACL-ARC
2 code implementations • 21 Jan 2023 • Zixuan Ke, Yijia Shao, Haowei Lin, Hu Xu, Lei Shu, Bing Liu
This paper shows that the existing methods are suboptimal and proposes a novel method that performs a more informed adaptation of the knowledge in the LM by (1) soft-masking the attention heads based on their importance, to best preserve the general knowledge in the LM, and (2) contrasting the representations of the general knowledge and of the full knowledge (both general and domain) to learn an integrated representation with both general and domain-specific knowledge.
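A rough sketch of the soft-masking idea, assuming per-head importance scores in [0, 1] are already computed; the shapes and function name are illustrative, not the paper's implementation:

```python
import torch

# Hypothetical sketch of importance-based soft-masking: scale the gradient of each
# attention head by (1 - importance), so heads important for general knowledge are
# updated less during domain-adaptive training.
def soft_mask_head_gradients(attn_weight_grad: torch.Tensor,
                             head_importance: torch.Tensor,
                             num_heads: int) -> torch.Tensor:
    """attn_weight_grad: (hidden, hidden) gradient of an attention projection weight.
    head_importance: (num_heads,) values in [0, 1], where 1 = most important."""
    hidden = attn_weight_grad.shape[0]
    head_dim = hidden // num_heads
    mask = (1.0 - head_importance).repeat_interleave(head_dim)   # (hidden,)
    return attn_weight_grad * mask.unsqueeze(-1)                 # down-weight rows per head
```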
1 code implementation • 23 Nov 2022 • Zixuan Ke, Bing Liu
Continual learning (CL) is a learning paradigm that emulates the human capability to learn and accumulate knowledge continually without forgetting previously learned knowledge, and to transfer that knowledge to help learn new tasks better.
1 code implementation • 4 Nov 2022 • Gyuhak Kim, Changnan Xiao, Tatsuya Konishi, Zixuan Ke, Bing Liu
Continual learning (CL) learns a sequence of tasks incrementally.
3 code implementations • 11 Oct 2022 • Zixuan Ke, Haowei Lin, Yijia Shao, Hu Xu, Lei Shu, Bing Liu
Recent work applying large language models (LMs) has achieved impressive performance in many NLP applications.
Ranked #1 on Continual Pretraining on AG News
3 code implementations • 20 Aug 2022 • Gyuhak Kim, Zixuan Ke, Bing Liu
Instead of using the saved samples in memory to update the network for previous tasks/classes, as existing approaches do, MORE leverages the saved samples to build a task-specific classifier (adding a new classification head) without updating the network learned for previous tasks/classes.
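A minimal sketch of the frozen-backbone, head-per-task idea (hypothetical class, not the MORE implementation):

```python
import torch
import torch.nn as nn

# Hypothetical sketch: the shared backbone stays frozen, and each new task gets its
# own classification head, trained only on the samples saved in memory.
class MultiHeadClassifier(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone
        self.feat_dim = feat_dim
        for p in self.backbone.parameters():       # the learned network is never updated
            p.requires_grad = False
        self.heads = nn.ModuleList()

    def add_task(self, num_classes: int) -> nn.Linear:
        head = nn.Linear(self.feat_dim, num_classes)   # new task-specific head
        self.heads.append(head)
        return head                                    # train this head on memory samples

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        with torch.no_grad():
            feats = self.backbone(x)                   # assumes backbone returns (B, feat_dim)
        return self.heads[task_id](feats)
```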
no code implementations • WASSA (ACL) 2022 • Zixuan Ke, Mohammad Kachuee, Sungjin Lee
In many real-world machine learning applications, samples belong to a set of domains, e.g., for product reviews, each review belongs to a product category.
2 code implementations • 18 Dec 2021 • Zixuan Ke, Bing Liu, Hao Wang, Lei Shu
In this setting, the CL system learns a sequence of SC tasks incrementally in a neural network, where each task builds a classifier to classify the sentiment of reviews of a particular product category or domain.
Ranked #4 on Continual Learning on DSC (10 tasks)
2 code implementations • NeurIPS 2020 • Zixuan Ke, Bing Liu, Xingchang Huang
To the best of our knowledge, no technique has been proposed that can learn a sequence of mixed similar and dissimilar tasks while dealing with forgetting and transferring knowledge both forward and backward.
Ranked #1 on Continual Learning on F-CelebA (10 tasks)
1 code implementation • NAACL 2021 • Zixuan Ke, Hu Xu, Bing Liu
This paper studies continual learning (CL) of a sequence of aspect sentiment classification (ASC) tasks.
Ranked #3 on Continual Learning on ASC (19 tasks)
1 code implementation • NeurIPS 2021 • Zixuan Ke, Bing Liu, Nianzu Ma, Hu Xu, Lei Shu
Although several papers have tried to deal with both CF and KT, our experiments show that they suffer from serious CF when the tasks do not have much shared knowledge.
Ranked #1 on Continual Learning on DSC (10 tasks)
1 code implementation • EMNLP 2021 • Zixuan Ke, Bing Liu, Hu Xu, Lei Shu
The key novelty is a contrastive continual learning method that enables both knowledge transfer across tasks and knowledge distillation from old tasks to the new task, which eliminates the need for task ids in testing.
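A hedged sketch of a combined contrastive-plus-distillation objective of this flavor (illustrative only, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

# Illustrative combined objective: a contrastive term aligning representations plus a
# distillation term that transfers knowledge from the previous (old-task) model.
def contrastive_distill_loss(new_feats, old_feats, new_logits, old_logits,
                             temperature: float = 0.1, alpha: float = 0.5):
    # Contrastive alignment: each sample's new representation should match its own
    # old representation more closely than those of other samples in the batch.
    new_n = F.normalize(new_feats, dim=-1)
    old_n = F.normalize(old_feats, dim=-1)
    sim = new_n @ old_n.t() / temperature                      # (B, B) similarity matrix
    targets = torch.arange(new_feats.size(0), device=new_feats.device)
    contrastive = F.cross_entropy(sim, targets)

    # Knowledge distillation from the old model's predictions.
    distill = F.kl_div(F.log_softmax(new_logits, dim=-1),
                       F.softmax(old_logits, dim=-1), reduction="batchmean")
    return alpha * contrastive + (1 - alpha) * distill
```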
no code implementations • 29 Sep 2021 • Tatsuya Konishi, Mori Kurokawa, Roberto Legaspi, Chihiro Ono, Zixuan Ke, Gyuhak Kim, Bing Liu
The goal of this work is to endow such systems with the additional ability to transfer knowledge among tasks when the tasks are similar and share knowledge, in order to achieve higher accuracy.
no code implementations • 29 Sep 2021 • Gyuhak Kim, Sepideh Esmaeilpour, Zixuan Ke, Tatsuya Konishi, Bing Liu
PLS is not only simple and efficient but also does not invade data privacy, because it works in the latent feature space.
no code implementations • ACL 2019 • Zixuan Ke, Hrishikesh Inamdar, Hui Lin, Vincent Ng
While the vast majority of existing work on automated essay scoring has focused on holistic scoring, researchers have recently begun work on scoring specific dimensions of essay quality.
no code implementations • ACL 2018 • Winston Carlile, Nishant Gurrapadi, Zixuan Ke, Vincent Ng
While argument persuasiveness is one of the most important dimensions of argumentative essay quality, it has received relatively little attention in automated essay scoring research.