1 code implementation • 29 Nov 2024 • Kaican Li, Weiyan Xie, Yongxiang Huang, Didan Deng, Lanqing Hong, Zhenguo Li, Ricardo Silva, Nevin L. Zhang
Fine-tuning foundation models often compromises their robustness to distribution shifts.
no code implementations • 18 Oct 2024 • Jialin Yu, Yuxiang Zhou, Yulan He, Nevin L. Zhang, Ricardo Silva
By using domain-specific supervised data, the general-purpose representation derived from PLMs can be transformed into a domain-specific representation.
2 code implementations • 13 Oct 2024 • Rui Min, Zeyu Qin, Nevin L. Zhang, Li Shen, Minhao Cheng
We find that current safety purification methods are vulnerable to the rapid re-learning of backdoor behavior: fine-tuning a purified model on even a very small number of poisoned samples is enough to restore the backdoor.
no code implementations • 3 Oct 2024 • Stefan Juang, Hugh Cao, Arielle Zhou, Ruochen Liu, Nevin L. Zhang, Elvis Liu
This paper introduces Comparative Advantage Maximization (CAM), a method designed to enhance individual agent specialization in multiagent systems.
no code implementations • 15 Jul 2024 • Xingzhi Zhou, Xin Dong, Chunhao Li, Yuning Bai, Yulong Xu, Ka Chun Cheung, Simon See, Xinpeng Song, Runshun Zhang, Xuezhong Zhou, Nevin L. Zhang
However, this task faces limitations due to the scarcity of high-quality clinical datasets and the complex relationship between symptoms and herbs.
no code implementations • 26 Jan 2024 • Xingzhi Zhou, Zhiliang Tian, Ka Chun Cheung, Simon See, Nevin L. Zhang
Test-time domain adaptation effectively adjusts the source domain model to accommodate unseen domain shifts in a target domain during inference.
no code implementations • 10 Oct 2023 • Kaican Li, Yifan Zhang, Lanqing Hong, Zhenguo Li, Nevin L. Zhang
This indicates that while pre-trained representations may help improve downstream in-distribution performance, they could have minimal or even adverse effects on generalization in certain OOD scenarios of the downstream task if not used properly.
1 code implementation • 10 Aug 2023 • Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li, Nevin L. Zhang
Training large language models (LLMs) with open-domain instruction data has yielded remarkable success in aligning them with end tasks and human preferences.
no code implementations • 13 Jul 2023 • Nevin L. Zhang, Kaican Li, Han Gao, Weiyan Xie, Zhi Lin, Zhenguo Li, Luning Wang, Yongxiang Huang
Domain generalization (DG) is about learning models that generalize well to new domains that are related to, but different from, the training domain(s).
1 code implementation • 10 Jun 2023 • Weiyan Xie, Xiao-Hui Li, Zhi Lin, Leonard K. M. Poon, Caleb Chen Cao, Nevin L. Zhang
The need to explain the output of a deep neural network classifier is now widely recognized.
1 code implementation • 18 May 2023 • Yingxiu Zhao, Bowen Yu, Haiyang Yu, Bowen Li, Jinyang Li, Chao Wang, Fei Huang, Yongbin Li, Nevin L. Zhang
To tackle this issue, we are the first to present a causally-complete dataset construction strategy for building million-scale DocGD pre-training corpora.
1 code implementation • 13 May 2023 • Han Gao, Kaican Li, Weiyan Xie, Zhi Lin, Yongxiang Huang, Luning Wang, Caleb Chen Cao, Nevin L. Zhang
In this paper, we consider a third, lesser-known setting where a training domain is endowed with a collection of pairs of examples that share the same semantic information.
1 code implementation • 23 Nov 2022 • Yingxiu Zhao, Yinhe Zheng, Bowen Yu, Zhiliang Tian, Dongkyu Lee, Jian Sun, Haiyang Yu, Yongbin Li, Nevin L. Zhang
In this paper, we explore a novel setting, semi-supervised lifelong language learning (SSLL), where a model learns sequentially arriving language tasks with both labeled and unlabeled data.
1 code implementation • 6 Nov 2022 • Weiyan Xie, Xiao-Hui Li, Caleb Chen Cao, Nevin L. Zhang
Despite the popularity of Vision Transformers (ViTs) and eXplainable AI (XAI), only a few explanation methods have been designed specifically for ViTs thus far.
no code implementations • 22 Oct 2022 • Dongkyu Lee, Zhiliang Tian, Yingxiu Zhao, Ka Chun Cheung, Nevin L. Zhang
Our work answers this question with the concept of model calibration: we view a teacher model not only as a source of knowledge but also as a gauge for detecting the miscalibration of a student.
no code implementations • 22 Oct 2022 • Dongkyu Lee, Ka Chun Cheung, Nevin L. Zhang
Furthermore, inspired by recent work bridging label smoothing and knowledge distillation, our work utilizes self-knowledge as a prior label distribution when softening target labels, and presents theoretical support for the regularization effects of knowledge distillation and of the dynamic smoothing parameter.
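A minimal sketch of this label-softening idea, in PyTorch; the confidence-based schedule for the smoothing weight is an illustrative assumption, not the paper's exact parameterization:

```python
import torch
import torch.nn.functional as F

def self_smoothed_loss(logits, targets, alpha_max=0.2):
    """Cross-entropy against labels softened with the model's own
    (detached) predictive distribution instead of a uniform prior."""
    num_classes = logits.size(-1)
    one_hot = F.one_hot(targets, num_classes).float()
    self_prior = F.softmax(logits, dim=-1).detach()  # self-knowledge, no gradient

    # Dynamic smoothing weight: smooth more on high-confidence examples
    # (an illustrative schedule, not the paper's exact choice).
    alpha = alpha_max * self_prior.max(dim=-1, keepdim=True).values

    soft_targets = (1.0 - alpha) * one_hot + alpha * self_prior
    return -(soft_targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()
```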
1 code implementation • 14 Oct 2022 • Yingxiu Zhao, Yinhe Zheng, Zhiliang Tian, Chang Gao, Bowen Yu, Haiyang Yu, Yongbin Li, Jian Sun, Nevin L. Zhang
Lifelong learning (LL) is vital for advanced task-oriented dialogue (ToD) systems.
no code implementations • 27 Jul 2022 • Xingzhi Zhou, Nevin L. Zhang
A deep clustering model conceptually consists of a feature extractor that maps data points to a latent space, and a clustering head that groups data points into clusters in the latent space.
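This two-part structure can be sketched as follows in PyTorch; the MLP encoder and the centroid-based soft-assignment head are illustrative choices, not the paper's specific model:

```python
import torch
import torch.nn as nn

class DeepClusteringModel(nn.Module):
    """Encoder maps inputs to a latent space; a centroid-based head
    turns latent distances into soft cluster assignments."""

    def __init__(self, input_dim, latent_dim, num_clusters):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Learnable cluster centroids in the latent space.
        self.centroids = nn.Parameter(torch.randn(num_clusters, latent_dim))

    def forward(self, x):
        z = self.encoder(x)                          # latent features
        dists = torch.cdist(z, self.centroids)       # point-to-centroid distances
        return z, torch.softmax(-dists, dim=-1)      # soft cluster memberships
```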
no code implementations • ACL 2022 • Yingxiu Zhao, Zhiliang Tian, Huaxiu Yao, Yinhe Zheng, Dongkyu Lee, Yiping Song, Jian Sun, Nevin L. Zhang
Building models of natural language processing (NLP) is challenging in low-resource scenarios where only limited data are available.
1 code implementation • 16 Mar 2022 • Nevin L. Zhang, Weiyan Xie, Zhi Lin, Guanfang Dong, Xiao-Hui Li, Caleb Chen Cao, Yunpeng Wang
Some examples are easier for humans to classify than others.
1 code implementation • ACL 2021 • Dongkyu Lee, Zhiliang Tian, Lanqing Xue, Nevin L. Zhang
A common approach is to map a given sentence to a content representation that is free of style; the content representation is then fed to a decoder together with a target style.
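A minimal sketch of that encode-then-restyle pipeline (PyTorch); the GRU encoder/decoder and the additive style injection are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class StyleTransferSketch(nn.Module):
    """Encode a sentence into a (ideally style-free) content vector,
    then decode it conditioned on a target-style embedding."""

    def __init__(self, vocab_size, hidden_dim, num_styles):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.encoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.style_embed = nn.Embedding(num_styles, hidden_dim)
        self.decoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, target_style):
        _, content = self.encoder(self.embed(tokens))        # content representation
        h0 = content + self.style_embed(target_style)[None]  # inject target style
        dec_out, _ = self.decoder(self.embed(tokens), h0)    # teacher-forced decode
        return self.out(dec_out)                             # next-token logits
```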
1 code implementation • ACL 2021 • Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu
In this paper, we develop DeepRapper, a Transformer-based rap generation system that can model both rhymes and rhythms.
no code implementations • 21 May 2021 • Zhiliang Tian, Wei Bi, Zihan Zhang, Dongkyu Lee, Yiping Song, Nevin L. Zhang
The task requires models to generate personalized responses for a speaker given a few conversations from the speaker and a social network.
no code implementations • 10 Jul 2020 • Leonard K. M. Poon, Nevin L. Zhang, Haoran Xie, Gary Cheng
Topic modeling has been one of the most active research areas in machine learning in recent years.
no code implementations • ACL 2020 • Zhiliang Tian, Wei Bi, Dongkyu Lee, Lanqing Xue, Yiping Song, Xiaojiang Liu, Nevin L. Zhang
In previous work, the external document is utilized by (1) creating a context-aware document memory that integrates information from the document and the conversational context, and then (2) generating responses referring to the memory.
1 code implementation • 1 Dec 2019 • Lanqing Xue, Xiaopeng Li, Nevin L. Zhang
Attention mechanisms compute input-dependent dynamic attention weights for aggregating a sequence of hidden states.
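A generic sketch of this aggregation step (PyTorch); the additive scoring network is an illustrative choice, not the specific model proposed in the paper:

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Score each hidden state, softmax the scores into weights,
    and return the weighted sum of the sequence."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, h):                               # h: (batch, seq, hidden)
        weights = torch.softmax(self.score(h), dim=1)   # input-dependent weights
        return (weights * h).sum(dim=1)                 # aggregated representation
```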
1 code implementation • ACL 2019 • Zhiliang Tian, Wei Bi, Xiaopeng Li, Nevin L. Zhang
In this work, we propose a memory-augmented generative model, which learns to abstract from the training corpus and saves useful information to a memory to assist response generation.
no code implementations • 17 May 2019 • Farhan Khawar, Nevin L. Zhang
In this paper, we analyze the spectral properties of the Pearson and cosine similarity estimators, and we use tools from random matrix theory to argue that they suffer from noise and eigenvalue spreading.
no code implementations • 28 Aug 2018 • Farhan Khawar, Nevin L. Zhang
We then use insights from random matrix theory (RMT) to show that picking the top eigenvectors corresponds to removing sampling noise from user/item co-occurrence matrices.
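The truncation step can be sketched as follows (NumPy); in RMT practice the cutoff k would be derived from a bound on noise eigenvalues such as Marchenko-Pastur, whereas here it is simply passed in:

```python
import numpy as np

def truncate_spectrum(C, k):
    """Reconstruct a symmetric co-occurrence matrix from its top-k
    eigenpairs; the discarded directions are treated as sampling noise."""
    eigvals, eigvecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    V, w = eigvecs[:, -k:], eigvals[-k:]     # k largest eigenpairs
    return (V * w) @ V.T                     # V @ diag(w) @ V.T
```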
no code implementations • 28 Aug 2018 • Farhan Khawar, Nevin L. Zhang
In this paper, we propose a novel method for addressing the lack of negative examples in implicit feedback.
1 code implementation • 6 Jun 2018 • Farhan Khawar, Nevin L. Zhang
Categories created in this fashion are based on users' co-consumption of items.
no code implementations • 16 Mar 2018 • Zhourong Chen, Xiaopeng Li, Nevin L. Zhang
An important characteristic of FNN structures learned this way is that they are sparse.
no code implementations • ICLR 2019 • Xiaopeng Li, Zhourong Chen, Leonard K. M. Poon, Nevin L. Zhang
We investigate a variant of variational autoencoders where there is a superstructure of discrete latent variables on top of the latent features.
no code implementations • 14 Mar 2018 • Xiaopeng Li, Zhourong Chen, Nevin L. Zhang
We use the Chow-Liu algorithm to learn a tree-structured probabilistic model for the units at the current level, use the tree to identify subsets of units that are strongly correlated, and introduce a new unit with a receptive field over each subset.
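The tree-learning step can be sketched as follows, assuming a precomputed pairwise mutual-information matrix mi; identifying correlated subsets and adding new units on top would follow this step:

```python
import numpy as np

def chow_liu_edges(mi):
    """Maximum-weight spanning tree over a pairwise mutual-information
    matrix (Prim's algorithm); the edges form the Chow-Liu tree."""
    n = mi.shape[0]
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        # Pick the highest-MI edge connecting the tree to a new node.
        best = max(
            ((u, v) for u in in_tree for v in range(n) if v not in in_tree),
            key=lambda e: mi[e],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges
```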
no code implementations • ICLR 2018 • Zhourong Chen, Xiaopeng Li, Nevin L. Zhang
Convolutional neural networks and recurrent neural networks are designed with network structures well suited to the nature of spatial and sequential data, respectively.
no code implementations • 12 Dec 2017 • Peixian Chen, Zhourong Chen, Nevin L. Zhang
It is the new state of the art for hierarchical topic detection.
1 code implementation • 6 Apr 2017 • Farhan Khawar, Nevin L. Zhang
Implicit feedback is the simplest form of user feedback that can be used for item recommendation.
no code implementations • 1 Oct 2016 • Nevin L. Zhang, Leonard K. M. Poon
Latent tree analysis seeks to model the correlations among a set of random variables using a tree of latent variables.
1 code implementation • 29 Sep 2016 • Leonard K. M. Poon, Nevin L. Zhang
The resulting topic model contains a hierarchy of topics so that users can browse the topics from the top level to the bottom level.
no code implementations • 17 Sep 2016 • Zhourong Chen, Nevin L. Zhang, Dit-yan Yeung, Peixian Chen
We are interested in exploring the possibility and benefits of structure learning for deep models.
1 code implementation • 21 May 2016 • Peixian Chen, Nevin L. Zhang, Tengfei Liu, Leonard K. M. Poon, Zhourong Chen, Farhan Khawar
The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below.
no code implementations • 26 Jan 2016 • Chen Fu, Nevin L. Zhang, Bao Xin Chen, Zhou Rong Chen, Xiang Lan Jin, Rong Juan Guo, Zhi Gang Chen, Yun Ling Zhang
Conclusions: A solution for the TCM syndrome classification problem associated with VMCI is established based on the latent tree analysis of unlabeled symptom survey data.
no code implementations • 5 Aug 2015 • Peixian Chen, Nevin L. Zhang, Leonard K. M. Poon, Zhourong Chen
It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies.
no code implementations • CVPR 2015 • Peixian Chen, Naiyan Wang, Nevin L. Zhang, Dit-yan Yeung
Low-rank matrix factorization has long been recognized as a fundamental problem in many computer vision applications.
no code implementations • 27 Oct 2014 • Nevin L. Zhang, Chen Fu, Teng Fei Liu, Bao Xin Chen, Kin Man Poon, Pei Xian Chen, Yun Ling Zhang
Conclusions: A data-driven method for TCM syndrome identification and classification is presented.
no code implementations • 4 Feb 2014 • Raphaël Mourad, Christine Sinoquet, Nevin L. Zhang, Tengfei Liu, Philippe Leray
In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from biological to human sciences.
no code implementations • 15 Jan 2014 • Yi Wang, Nevin L. Zhang, Tao Chen
We propose a novel method for approximate inference in Bayesian networks (BNs).