Search Results for author: Nevin L. Zhang

Found 42 papers, 15 papers with code

Resilient Practical Test-Time Adaptation: Soft Batch Normalization Alignment and Entropy-driven Memory Bank

no code implementations 26 Jan 2024 Xingzhi Zhou, Zhiliang Tian, Ka Chun Cheung, Simon See, Nevin L. Zhang

Test-time domain adaptation effectively adjusts the source domain model to accommodate unseen domain shifts in a target domain during inference.
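
Only the title hints at the method here, so the following PyTorch sketch shows just the generic idea behind soft batch-normalization alignment: normalize with a convex mix of the source model's stored BN statistics and the current target batch's statistics. The class name and the mixing weight `alpha` are hypothetical, not the paper's formulation.

```python
import torch
import torch.nn as nn

class SoftAlignedBN2d(nn.Module):
    """Test-time BN that softly aligns source statistics with the target batch.
    `alpha` (hypothetical) weights the stored source-domain statistics."""

    def __init__(self, source_bn: nn.BatchNorm2d, alpha: float = 0.9):
        super().__init__()
        self.bn = source_bn
        self.alpha = alpha

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-channel statistics of the incoming target batch.
        batch_mean = x.mean(dim=(0, 2, 3))
        batch_var = x.var(dim=(0, 2, 3), unbiased=False)
        # Soft alignment: mix source running stats with target batch stats.
        mean = self.alpha * self.bn.running_mean + (1 - self.alpha) * batch_mean
        var = self.alpha * self.bn.running_var + (1 - self.alpha) * batch_var
        x_hat = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + self.bn.eps)
        return self.bn.weight[None, :, None, None] * x_hat + self.bn.bias[None, :, None, None]

# Usage sketch: replace each BatchNorm2d in the source model with SoftAlignedBN2d(bn).
```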

Test-time Adaptation

Robustness May be More Brittle than We Think under Different Degrees of Distribution Shifts

no code implementations 10 Oct 2023 Kaican Li, Yifan Zhang, Lanqing Hong, Zhenguo Li, Nevin L. Zhang

This indicates that while pre-trained representations may help improve downstream in-distribution performance, they could have minimal or even adverse effects on generalization in certain OOD scenarios of the downstream task if not used properly.

A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment

1 code implementation 10 Aug 2023 Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li, Nevin L. Zhang

Training large language models (LLMs) with open-domain instruction data has yielded remarkable success in aligning to end tasks and human preferences.

A Causal Framework to Unify Common Domain Generalization Approaches

no code implementations 13 Jul 2023 Nevin L. Zhang, Kaican Li, Han Gao, Weiyan Xie, Zhi Lin, Zhenguo Li, Luning Wang, Yongxiang Huang

Domain generalization (DG) is about learning models that generalize well to new domains that are related to, but different from, the training domain(s).

Domain Generalization

Causal Document-Grounded Dialogue Pre-training

1 code implementation 18 May 2023 Yingxiu Zhao, Bowen Yu, Haiyang Yu, Bowen Li, Jinyang Li, Chao Wang, Fei Huang, Yongbin Li, Nevin L. Zhang

To tackle this issue, we are the first to present a causally-complete dataset construction strategy for building million-scale DocGD pre-training corpora.

Semi-Supervised Lifelong Language Learning

1 code implementation 23 Nov 2022 Yingxiu Zhao, Yinhe Zheng, Bowen Yu, Zhiliang Tian, Dongkyu Lee, Jian Sun, Haiyang Yu, Yongbin Li, Nevin L. Zhang

In this paper, we explore a novel setting, semi-supervised lifelong language learning (SSLL), where a model learns sequentially arriving language tasks with both labeled and unlabeled data.

Transfer Learning

ViT-CX: Causal Explanation of Vision Transformers

1 code implementation 6 Nov 2022 Weiyan Xie, Xiao-Hui Li, Caleb Chen Cao, Nevin L. Zhang

Despite the popularity of Vision Transformers (ViTs) and eXplainable AI (XAI), only a few explanation methods have been designed specifically for ViTs thus far.

Explainable Artificial Intelligence (XAI)

Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation

no code implementations 22 Oct 2022 Dongkyu Lee, Ka Chun Cheung, Nevin L. Zhang

Furthermore, inspired by recent work bridging label smoothing and knowledge distillation, our work utilizes self-knowledge as a prior label distribution when softening target labels, and presents theoretical support for the regularization effect of knowledge distillation and of the dynamic smoothing parameter.
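
As a rough illustration of the idea (not the paper's exact method; in particular, the dynamic smoothing parameter is reduced here to a fixed `eps`), a minimal PyTorch sketch of softening targets with the model's own detached predictions:

```python
import torch
import torch.nn.functional as F

def self_knowledge_smoothing_loss(logits: torch.Tensor,
                                  targets: torch.Tensor,
                                  eps: float = 0.1) -> torch.Tensor:
    """Cross-entropy against targets softened by the model's own predictions.
    `eps` stands in for the paper's dynamic smoothing parameter."""
    num_classes = logits.size(-1)
    one_hot = F.one_hot(targets, num_classes).float()
    # Self-knowledge: the model's current predictive distribution, detached
    # so the soft target acts as a fixed prior rather than a gradient path.
    self_dist = F.softmax(logits, dim=-1).detach()
    soft_target = (1.0 - eps) * one_hot + eps * self_dist
    log_probs = F.log_softmax(logits, dim=-1)
    return -(soft_target * log_probs).sum(dim=-1).mean()
```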

Knowledge Distillation Text Generation

Hard Gate Knowledge Distillation -- Leverage Calibration for Robust and Reliable Language Model

no code implementations 22 Oct 2022 Dongkyu Lee, Zhiliang Tian, Yingxiu Zhao, Ka Chun Cheung, Nevin L. Zhang

The question is answered in our work with the concept of model calibration; we view a teacher model not only as a source of knowledge but also as a gauge to detect miscalibration of a student.

Knowledge Distillation Language Modelling +2

Deep Clustering with Features from Self-Supervised Pretraining

no code implementations 27 Jul 2022 Xingzhi Zhou, Nevin L. Zhang

A deep clustering model conceptually consists of a feature extractor that maps data points to a latent space, and a clustering head that groups data points into clusters in the latent space.
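
A minimal PyTorch skeleton of this two-part design, with illustrative dimensions; it is a generic sketch, not the paper's model:

```python
import torch
import torch.nn as nn

class DeepClusteringModel(nn.Module):
    """Feature extractor maps inputs to a latent space; a clustering head
    assigns soft cluster memberships there. Dimensions are illustrative."""

    def __init__(self, input_dim: int = 784, latent_dim: int = 32, n_clusters: int = 10):
        super().__init__()
        self.encoder = nn.Sequential(          # feature extractor
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        self.cluster_head = nn.Linear(latent_dim, n_clusters)  # clustering head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)                                  # latent features
        return torch.softmax(self.cluster_head(z), dim=-1)   # soft assignments
```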

Clustering Deep Clustering +1

Example Perplexity

1 code implementation 16 Mar 2022 Nevin L. Zhang, Weiyan Xie, Zhi Lin, Guanfang Dong, Xiao-Hui Li, Caleb Chen Cao, Yunpeng Wang

Some examples are easier for humans to classify than others.

Enhancing Content Preservation in Text Style Transfer Using Reverse Attention and Conditional Layer Normalization

1 code implementation ACL 2021 Dongkyu Lee, Zhiliang Tian, Lanqing Xue, Nevin L. Zhang

A common approach is to map a given sentence to a content representation that is free of style; the content representation is then fed to a decoder together with a target style.
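
The title's "conditional layer normalization" admits a compact sketch: the gain and bias of a LayerNorm-style normalization are generated from a style embedding, so a style-free content representation can be re-styled at decode time. Module and parameter names below are hypothetical:

```python
import torch
import torch.nn as nn

class ConditionalLayerNorm(nn.Module):
    """Normalize content features, then re-scale/shift them per target style."""

    def __init__(self, hidden_dim: int, style_dim: int):
        super().__init__()
        self.to_gain = nn.Linear(style_dim, hidden_dim)
        self.to_bias = nn.Linear(style_dim, hidden_dim)

    def forward(self, h: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, hidden_dim); style: (batch, style_dim)
        mu = h.mean(dim=-1, keepdim=True)
        sigma = h.std(dim=-1, keepdim=True)
        h_norm = (h - mu) / (sigma + 1e-5)
        gain = self.to_gain(style).unsqueeze(1)   # (batch, 1, hidden_dim)
        bias = self.to_bias(style).unsqueeze(1)
        return gain * h_norm + bias
```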

Sentence Style Transfer +1

DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling

1 code implementation ACL 2021 Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu

In this paper, we develop DeepRapper, a Transformer-based rap generation system that can model both rhymes and rhythms.

Language Modelling

Learning from My Friends: Few-Shot Personalized Conversation Systems via Social Networks

no code implementations 21 May 2021 Zhiliang Tian, Wei Bi, Zihan Zhang, Dongkyu Lee, Yiping Song, Nevin L. Zhang

The task requires models to generate personalized responses for a speaker given a few conversations from the speaker and a social network.

Meta-Learning

Handling Collocations in Hierarchical Latent Tree Analysis for Topic Modeling

no code implementations 10 Jul 2020 Leonard K. M. Poon, Nevin L. Zhang, Haoran Xie, Gary Cheng

Topic modeling has been one of the most active research areas in machine learning in recent years.

Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation

no code implementations ACL 2020 Zhiliang Tian, Wei Bi, Dongkyu Lee, Lanqing Xue, Yiping Song, Xiaojiang Liu, Nevin L. Zhang

In previous work, the external document is utilized by (1) creating a context-aware document memory that integrates information from the document and the conversational context, and then (2) generating responses referring to the memory.

Informativeness Response Generation

Not All Attention Is Needed: Gated Attention Network for Sequence Data

1 code implementation 1 Dec 2019 Lanqing Xue, Xiaopeng Li, Nevin L. Zhang

Attention mechanisms compute input-dependent dynamic attention weights for aggregating a sequence of hidden states.
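
To make the mechanism concrete, here is a hedged PyTorch sketch of attention pooling with a per-position sigmoid gate that can suppress unneeded positions; it illustrates the general idea of gating attention, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class GatedAttentionPooling(nn.Module):
    """Attention pooling over hidden states with a learned per-position gate."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)   # attention scorer
        self.gate = nn.Linear(hidden_dim, 1)    # soft gate per position

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, hidden_dim)
        g = torch.sigmoid(self.gate(h))                 # gate values in [0, 1]
        weights = torch.softmax(self.score(h), dim=1) * g
        # Renormalize after gating so the weights still sum to one.
        weights = weights / weights.sum(dim=1, keepdim=True).clamp_min(1e-8)
        return (weights * h).sum(dim=1)                 # aggregated representation
```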

Sentence text-classification +1

Learning to Abstract for Memory-augmented Conversational Response Generation

1 code implementation ACL 2019 Zhiliang Tian, Wei Bi, Xiaopeng Li, Nevin L. Zhang

In this work, we propose a memory-augmented generative model that learns to abstract from the training corpus and saves useful information to the memory to assist response generation.

Conversational Response Generation Informativeness +2

Cleaned Similarity for Better Memory-Based Recommenders

no code implementations 17 May 2019 Farhan Khawar, Nevin L. Zhang

In this paper, we analyze the spectral properties of the Pearson and cosine similarity estimators, and we use tools from random matrix theory to argue that they suffer from noise and eigenvalue spreading.

Collaborative Filtering

Matrix Factorization Equals Efficient Co-occurrence Representation

no code implementations 28 Aug 2018 Farhan Khawar, Nevin L. Zhang

We then use insights from random matrix theory (RMT) to show that picking the top eigenvectors corresponds to removing sampling noise from user/item co-occurrence matrices.
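
That insight translates into a simple recipe: eigendecompose the symmetric co-occurrence matrix and keep only the leading eigenpairs. A minimal NumPy sketch, with `k` chosen arbitrarily for illustration:

```python
import numpy as np

def denoise_cooccurrence(C: np.ndarray, k: int) -> np.ndarray:
    """Keep only the top-k eigenpairs of a symmetric co-occurrence matrix;
    RMT attributes the discarded bulk of the spectrum to sampling noise.
    Choosing k from the spectrum itself is left out of this sketch."""
    vals, vecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    top = np.argsort(vals)[-k:]          # indices of the k largest
    return vecs[:, top] @ np.diag(vals[top]) @ vecs[:, top].T

# Illustrative use: item-item co-occurrence from binary interactions.
rng = np.random.default_rng(0)
X = (rng.random((500, 50)) < 0.05).astype(float)  # users x items
C = X.T @ X                                       # co-occurrence matrix
C_clean = denoise_cooccurrence(C, k=10)
```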

Using Taste Groups for Collaborative Filtering

no code implementations 28 Aug 2018 Farhan Khawar, Nevin L. Zhang

In this paper, we propose a novel method for addressing the lack of negative examples in implicit feedback.

Collaborative Filtering

Learning Sparse Deep Feedforward Networks via Tree Skeleton Expansion

no code implementations 16 Mar 2018 Zhourong Chen, Xiaopeng Li, Nevin L. Zhang

An important characteristic of FNN structures learned this way is that they are sparse.

Building Sparse Deep Feedforward Networks using Tree Receptive Fields

no code implementations 14 Mar 2018 Xiaopeng Li, Zhourong Chen, Nevin L. Zhang

We use the Chow-Liu algorithm to learn a tree-structured probabilistic model for the units at the current level, use the tree to identify subsets of units that are strongly correlated, and introduce a new unit with a receptive field over each subset.
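
Since the abstract spells out the Chow-Liu step, a short NumPy/SciPy sketch of it may help: estimate empirical pairwise mutual information over (here, binarized) unit activations, then take a maximum-weight spanning tree. The binarization is an assumption of this sketch:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def chow_liu_tree(A: np.ndarray) -> np.ndarray:
    """Chow-Liu tree over binary activations A of shape (samples, units).
    Returns an upper-triangular boolean adjacency matrix of the tree."""
    n = A.shape[1]
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            m = 0.0
            for a in (0, 1):        # empirical MI of two binary variables
                for b in (0, 1):
                    p_ab = np.mean((A[:, i] == a) & (A[:, j] == b))
                    p_a = np.mean(A[:, i] == a)
                    p_b = np.mean(A[:, j] == b)
                    if p_ab > 0:
                        m += p_ab * np.log(p_ab / (p_a * p_b))
            mi[i, j] = mi[j, i] = m
    w = -(mi + 1e-9)          # negate so a minimum spanning tree maximizes MI
    np.fill_diagonal(w, 0.0)  # zero means "no edge" for csgraph
    return minimum_spanning_tree(w).toarray() != 0
```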

Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering

no code implementations ICLR 2019 Xiaopeng Li, Zhourong Chen, Leonard K. M. Poon, Nevin L. Zhang

We investigate a variant of variational autoencoders where there is a superstructure of discrete latent variables on top of the latent features.

Clustering

Learning Parsimonious Deep Feed-forward Networks

no code implementations ICLR 2018 Zhourong Chen, Xiaopeng Li, Nevin L. Zhang

Convolutional neural networks and recurrent neural networks are designed with network structures well suited to the nature of spatial and sequential data respectively.

Conformative Filtering for Implicit Feedback Data

1 code implementation 6 Apr 2017 Farhan Khawar, Nevin L. Zhang

Implicit feedback is the simplest form of user feedback that can be used for item recommendation.

Clustering

Latent Tree Analysis

no code implementations 1 Oct 2016 Nevin L. Zhang, Leonard K. M. Poon

Latent tree analysis seeks to model the correlations among a set of random variables using a tree of latent variables.

BIG-bench Machine Learning

Topic Browsing for Research Papers with Hierarchical Latent Tree Analysis

1 code implementation 29 Sep 2016 Leonard K. M. Poon, Nevin L. Zhang

The resulting topic model contains a hierarchy of topics so that users can browse the topics from the top level to the bottom level.

Sparse Boltzmann Machines with Structure Learning as Applied to Text Analysis

no code implementations 17 Sep 2016 Zhourong Chen, Nevin L. Zhang, Dit-yan Yeung, Peixian Chen

We are interested in exploring the possibility and benefits of structure learning for deep models.

Latent Tree Models for Hierarchical Topic Detection

1 code implementation 21 May 2016 Peixian Chen, Nevin L. Zhang, Tengfei Liu, Leonard K. M. Poon, Zhourong Chen, Farhan Khawar

The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below.

Clustering Topic Models

Progressive EM for Latent Tree Models and Hierarchical Topic Detection

no code implementations 5 Aug 2015 Peixian Chen, Nevin L. Zhang, Leonard K. M. Poon, Zhourong Chen

It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies.

Bayesian Adaptive Matrix Factorization With Automatic Model Selection

no code implementations CVPR 2015 Peixian Chen, Naiyan Wang, Nevin L. Zhang, Dit-yan Yeung

Low-rank matrix factorization has long been recognized as a fundamental problem in many computer vision applications.

Model Selection

A Survey on Latent Tree Models and Applications

no code implementations 4 Feb 2014 Raphaël Mourad, Christine Sinoquet, Nevin L. Zhang, Tengfei Liu, Philippe Leray

In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from biological to human sciences.

Clustering

Latent Tree Models and Approximate Inference in Bayesian Networks

no code implementations 15 Jan 2014 Yi Wang, Nevin L. Zhang, Tao Chen

We propose a novel method for approximate inference in Bayesian networks (BNs).
