no code implementations • 2 Sep 2013 • Tianyi Zhou, DaCheng Tao
Learning big data by matrix decomposition always suffers from expensive computation and the mixing of complicated structures with noise.
no code implementations • NeurIPS 2014 • Tianyi Zhou, Jeff Bilmes, Carlos Guestrin
We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the $k$ extremal rays spanning the conical hull of a data point set.
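For reference, a generic form of the conical-hull / extremal-ray problem mentioned above (a sketch in standard notation; the paper's exact formulation and its extensions may differ):

```latex
% Conical hull of a point set X = {x_1, ..., x_n} \subset R^d:
%   cone(X) = { \sum_i \alpha_i x_i : \alpha_i \ge 0 }.
% Extremal-ray (anchor) selection: choose k points R \subseteq X whose cone
% (approximately) contains every data point:
\[
  \min_{R \subseteq X,\ |R| = k} \;\sum_{x_j \in X}\;
  \min_{\alpha \ge 0} \Bigl\| x_j - \sum_{r \in R} \alpha_r\, r \Bigr\|_2^2 .
\]
```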
no code implementations • 1 Jun 2016 • Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin
We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization.
no code implementations • 1 Jun 2016 • Tianyi Zhou, Jeff Bilmes
We propose a streaming submodular maximization algorithm "stream clipper" that performs as well as the offline greedy algorithm on document/video summarization in practice.
3 code implementations • 14 Sep 2017 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, Chengqi Zhang
Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used in NLP tasks to capture long-term and local dependencies, respectively.
Ranked #68 on Natural Language Inference on SNLI
no code implementations • ICLR 2018 • Tianyi Zhou, Jeff Bilmes
We introduce and study minimax curriculum learning (MCL), a new method for adaptively selecting a sequence of training subsets for a succession of stages in machine learning.
1 code implementation • 31 Jan 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang
In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for their mutual benefit.
Ranked #56 on Natural Language Inference on SNLI
1 code implementation • ICLR 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we propose a model, called "bi-directional block self-attention network (Bi-BloSAN)", for RNN/CNN-free sequence encoding.
2 code implementations • NAACL 2019 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
Neural networks equipped with self-attention have parallelizable computation, light-weight structure, and the ability to capture both long-range and local dependencies.
no code implementations • NeurIPS 2018 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
We study a new method ("Diverse Ensemble Evolution (DivE$^2$)") to train an ensemble of machine learning models that assigns data to models at each training epoch based on each model's current expertise and an intra- and inter-model diversity reward.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In particular, we study how to attribute a DNN's bias to its input features.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In this paper, we discuss three novel observations about dropout to better understand the generalization of DNNs with rectified linear unit (ReLU) activations: 1) dropout is a smoothing technique that encourages each local linear model of a DNN to be trained on data points from nearby regions; 2) a constant dropout rate can result in effective neural-deactivation rates that are significantly different for layers with different fractions of activated neurons; and 3) the rescaling factor of dropout causes an inconsistency to occur between the normalization during training and testing conditions when batch normalization is also used.
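A toy numpy sketch of observation 3 (hypothetical code, not from the paper): inverted dropout's $1/(1-p)$ rescaling preserves activation means but not variances, so the statistics seen by batch normalization differ between training and testing.

```python
import numpy as np

def inverted_dropout(x, p, training, rng=np.random.default_rng(0)):
    """Standard inverted dropout: scale kept units by 1/(1-p) at training time."""
    if not training or p == 0.0:
        return x                      # test time: identity
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1-p
    return x * mask / (1.0 - p)       # rescaling preserves the mean, not the variance

# Toy illustration of the train/test variance mismatch that batch-norm statistics see.
x = np.random.default_rng(1).standard_normal((10000, 64))
train_out = inverted_dropout(x, p=0.5, training=True)
test_out  = inverted_dropout(x, p=0.5, training=False)
print(train_out.var(), test_out.var())  # variances differ (~2x here) although means match
```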
no code implementations • ICLR 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
It addresses the "many-class" problem by exploring the class hierarchy, e.g., the coarse-class label that covers a subset of fine classes, which helps to narrow down the candidates for the fine class and is cheaper to obtain.
2 code implementations • 10 May 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang
The resulting graph of prototypes can be continually re-used and updated for new tasks and classes.
1 code implementation • NeurIPS 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
It can significantly improve tasks that suffer from insufficient training data, e.g., few-shot learning.
no code implementations • 25 Sep 2019 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
The advantages of DIHCL, compared to other curriculum learning approaches, are: (1) DIHCL does not require additional inference steps over the data not selected by DIHCL in each epoch, (2) the dynamic instance hardness, compared to static instance hardness (e.g., instantaneous loss), is more stable as it integrates information over the entire training history up to the present time.
no code implementations • 27 Nov 2019 • Yang Li, Guodong Long, Tao Shen, Tianyi Zhou, Lina Yao, Huan Huo, Jing Jiang
Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision.
1 code implementation • NeurIPS 2019 • Meng Fang, Tianyi Zhou, Yali Du, Lei Han, Zhengyou Zhang
This "Goal-and-Curiosity-driven Curriculum Learning" leads to "Curriculum-guided HER (CHER)", which adaptively and dynamically controls the exploration-exploitation trade-off during the learning process via hindsight experience selection.
no code implementations • 12 Feb 2020 • Chi Zhang, Yong Sheng Soh, Ling Feng, Tianyi Zhou, Qianxiao Li
While current machine learning models have impressive performance over a wide range of applications, their large size and complexity render them unsuitable for tasks such as remote monitoring on edge devices with limited storage and computational power.
no code implementations • 18 Feb 2020 • Yujia Xie, Tianyi Zhou, Yi Mao, Weizhu Chen
Thereby, the contextual dependencies modeled by CSA will be highly relevant to the query.
3 code implementations • ICLR 2022 • Wensi Tang, Guodong Long, Lu Liu, Tianyi Zhou, Michael Blumenstein, Jing Jiang
Particularly, it is a set of kernel sizes that can efficiently cover the best RF size across different datasets, since it consists of multiple prime numbers chosen according to the length of the time series.
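A minimal sketch of one way such a prime-based kernel set could be generated (the helper name and the exact rule, e.g. the fraction of the series length used as the upper limit, are assumptions rather than the paper's specification):

```python
def prime_kernel_sizes(series_length, max_fraction=0.5):
    """Hypothetical helper: kernel sizes 1, 2 and all primes up to a fraction
    of the time-series length, illustrating a prime-number kernel set."""
    limit = max(2, int(series_length * max_fraction))
    sizes = [1, 2]
    for n in range(3, limit + 1, 2):
        if all(n % p for p in range(3, int(n ** 0.5) + 1, 2)):
            sizes.append(n)
    return sizes

print(prime_kernel_sizes(64))  # [1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
```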
1 code implementation • 30 Apr 2020 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
In experiments, we achieve state-of-the-art performance on three benchmarks and a zero-shot dataset for link prediction, while reducing inference costs by 1-2 orders of magnitude compared to a textual encoding method.
Ranked #4 on Link Prediction on UMLS
3 code implementations • 3 May 2020 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users.
1 code implementation • 28 Jun 2020 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
We study the many-class few-shot (MCFS) problem in both supervised learning and meta-learning settings.
no code implementations • 24 Sep 2020 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
To address this challenging task, most ZSL methods relate unseen test classes to seen (training) classes via a pre-defined set of attributes that can describe all classes in the same semantic space, so the knowledge learned on the training classes can be adapted to unseen classes.
2 code implementations • COLING 2020 • Yang Li, Tao Shen, Guodong Long, Jing Jiang, Tianyi Zhou, Chengqi Zhang
Then, facilitated by the proposed base model, we introduce collaborating relation features shared among relations in the hierarchies to promote the relation-augmenting process and balance the training data for long-tail relations.
no code implementations • NeurIPS 2020 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
Compared to existing CL methods: (1) DIH is more stable over time than using only instantaneous hardness, which is noisy due to stochastic training and DNN's non-smoothness; (2) DIHCL is computationally inexpensive since it uses only a byproduct of back-propagation and thus does not require extra inference.
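A toy numpy sketch of the dynamic instance hardness (DIH) idea described above, i.e., an exponential moving average of per-sample loss (a training byproduct) reused for curriculum selection; the helper names and the specific sampling rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def update_dih(dih, sample_ids, losses, gamma=0.9):
    """Exponential moving average of per-sample hardness: smooths the noisy
    instantaneous loss over the training history."""
    dih[sample_ids] = gamma * dih[sample_ids] + (1.0 - gamma) * losses
    return dih

def select_subset(dih, k, rng=np.random.default_rng(0)):
    """Pick k samples with probability proportional to their DIH
    (one simple selection rule among several possibilities)."""
    probs = dih / dih.sum()
    return rng.choice(len(dih), size=k, replace=False, p=probs)

n = 1000
dih = np.full(n, 1.0)                          # optimistic initialization
batch = np.arange(64)
batch_losses = np.random.default_rng(1).random(64)  # losses from back-propagation
dih = update_dih(dih, batch, batch_losses)
print(select_subset(dih, k=128)[:10])
```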
no code implementations • 1 Jan 2021 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang
Few-shot learning aims to train a classifier given only a few samples per class that are highly insufficient to describe the whole data distribution.
no code implementations • ICLR 2021 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
Neural net training can easily overfit to noisy labels and end up with poor generalization performance.
no code implementations • 1 Jan 2021 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we introduce an efficient method to extract local inference chains by optimizing a differentiable sparse scoring of the filters and layers so as to preserve the outputs on given data from a local region.
no code implementations • ICLR 2021 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang
To resolve this problem, we propose Isometric Propagation Network (IPN), which learns to strengthen the relation between classes within each space and align the class dependency in the two spaces.
4 code implementations • 1 May 2021 • Yue Tan, Guodong Long, Lu Liu, Tianyi Zhou, Qinghua Lu, Jing Jiang, Chengqi Zhang
Heterogeneity across clients in federated learning (FL) usually hinders the optimization convergence and generalization performance when the aggregation of clients' knowledge occurs in the gradient space.
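A minimal numpy sketch of prototype-based knowledge aggregation, the alternative to gradient-space aggregation that this line alludes to (the function names and the plain averaging rule are illustrative assumptions):

```python
import numpy as np

def local_prototypes(features, labels, num_classes):
    """Per-client class prototypes: mean feature embedding of each class present locally."""
    return {c: features[labels == c].mean(axis=0)
            for c in range(num_classes) if (labels == c).any()}

def aggregate_prototypes(client_protos):
    """Server side: average each class prototype over the clients that own that class."""
    global_protos = {}
    for protos in client_protos:
        for c, p in protos.items():
            global_protos.setdefault(c, []).append(p)
    return {c: np.mean(ps, axis=0) for c, ps in global_protos.items()}

rng = np.random.default_rng(0)
clients = [local_prototypes(rng.standard_normal((50, 16)),
                            rng.integers(0, 10, 50), num_classes=10)
           for _ in range(3)]
print(sorted(aggregate_prototypes(clients).keys()))
```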
1 code implementation • ICLR 2021 • Yuchen Jin, Tianyi Zhou, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy
This mutual-training process between BO and the loss-prediction model allows us to limit the training steps invested in the BO search.
1 code implementation • 19 Aug 2021 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
By comparison, a mixture of multiple global models could capture the heterogeneity across various clients if each client is assigned to a different global model (i.e., center) in FL.
1 code implementation • Findings (EMNLP) 2021 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
Aspect-level sentiment classification (ALSC) aims at identifying the sentiment polarity of a specified aspect in a sentence.
Aspect-Based Sentiment Analysis (ABSA) +3
no code implementations • 29 Sep 2021 • Shuang Ao, Tianyi Zhou, Jing Jiang, Guodong Long, Xuan Song, Chengqi Zhang
They are complementary in acquiring more informative feedback for RL: the planning policy provides dense rewards for finishing easier sub-tasks, while the environment policy modifies these sub-tasks to be adequately challenging and diverse so the RL agent can quickly adapt to different tasks/environments.
no code implementations • ICLR 2022 • Yijun Yang, Jing Jiang, Tianyi Zhou, Jie Ma, Yuhui Shi
Model-based offline RL instead trains an environment model using a dataset of pre-collected experiences so online RL methods can learn in an offline manner by solely interacting with the model.
no code implementations • 29 Sep 2021 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Liming Zhu, Chengqi Zhang
Can we find a better initialization for a new task, e.g., a much smaller network closer to the final pruned model, by exploiting its similar tasks?
no code implementations • 29 Sep 2021 • Kaiwen Yang, Tianyi Zhou, Xinmei Tian, DaCheng Tao
We then adversarially perturb $G(x)$ in the VAE's bottleneck space and add it back to the original $R(x)$ as an augmentation, which is therefore sufficiently challenging for contrastive learning while keeping the sample identity intact.
no code implementations • ICLR 2022 • Ravikumar Balakrishnan, Tian Li, Tianyi Zhou, Nageen Himayat, Virginia Smith, Jeff Bilmes
In every communication round of federated learning, a random subset of clients communicate their model updates back to the server which then aggregates them all.
1 code implementation • 24 Oct 2021 • Zhihong Deng, Zuyue Fu, Lingxiao Wang, Zhuoran Yang, Chenjia Bai, Tianyi Zhou, Zhaoran Wang, Jing Jiang
Offline reinforcement learning (RL) harnesses the power of massive datasets for resolving sequential decision problems.
no code implementations • NeurIPS 2021 • Shengjie Wang, Tianyi Zhou, Chandrashekhar Lavania, Jeff A. Bilmes
Robust submodular partitioning promotes the diversity of every block in the partition.
1 code implementation • NeurIPS 2021 • Shuang Ao, Tianyi Zhou, Guodong Long, Qinghua Lu, Liming Zhu, Jing Jiang
Next, a bottom-up traversal of the tree trains the RL agent from easier sub-tasks with denser rewards on bottom layers to harder ones on top layers, and collects its cost on each sub-task to train the planner in the next episode.
no code implementations • NeurIPS 2021 • Kaiwen Yang, Tianyi Zhou, Yonggang Zhang, Xinmei Tian, DaCheng Tao
In this paper, we propose "class-disentanglement" that trains a variational autoencoder $G(\cdot)$ to extract this class-dependent information as $x - G(x)$ via a trade-off between reconstructing $x$ by $G(x)$ and classifying $x$ by $D(x-G(x))$, where the former competes with the latter in decomposing $x$ so the latter retains only necessary information for classification in $x-G(x)$.
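A short PyTorch-style sketch of the trade-off described above (the `lam` weight and the plain L2 reconstruction term are assumptions; the paper's VAE objective also includes terms omitted here):

```python
import torch
import torch.nn.functional as F

def class_disentanglement_loss(x, y, G, D, lam=1.0):
    """G reconstructs x while the classifier D must predict y from the residual
    x - G(x), so the residual ends up carrying the class-dependent information."""
    recon = G(x)                         # class-independent reconstruction
    residual = x - recon                 # class-dependent part
    loss_recon = F.mse_loss(recon, x)
    loss_cls = F.cross_entropy(D(residual), y)
    return loss_recon + lam * loss_cls

# Usage with any image autoencoder/VAE G and classifier D:
# loss = class_disentanglement_loss(images, labels, G, D)
# loss.backward()
```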
no code implementations • CVPR 2022 • Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao
Decentralized learning (DL) can exploit the images distributed over devices on a network topology to train a global model but is not designed to train personalized models for different tasks or optimize the topology.
1 code implementation • 13 Feb 2022 • Jie Ma, Guodong Long, Tianyi Zhou, Jing Jiang, Chengqi Zhang
Knowledge sharing and model personalization are essential components to tackle the non-IID challenge in federated learning (FL).
1 code implementation • 2 Mar 2022 • Fengwen Chen, Guodong Long, Zonghan Wu, Tianyi Zhou, Jing Jiang
We propose a novel structured federated learning (SFL) framework to learn both the global and personalized models simultaneously using client-wise relation graphs and clients' private data.
no code implementations • ACL 2022 • Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou
Transformer-based models generally allocate the same amount of computation for each token in a given sequence.
no code implementations • 20 May 2022 • Zhuowei Wang, Tianyi Zhou, Guodong Long, Bo Han, Jing Jiang
Federated learning (FL) aims at training a global model on the server side while the training data are collected and located at the local devices.
1 code implementation • Findings (NAACL) 2022 • Yibin Lei, Yu Cao, Dianqi Li, Tianyi Zhou, Meng Fang, Mykola Pechenizkiy
Generating high-quality textual adversarial examples is critical for investigating the pitfalls of natural language processing (NLP) models and further promoting their robustness.
2 code implementations • 21 Sep 2022 • Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, Jing Jiang
To prevent these issues from hindering the deployment of FL systems, we propose a lightweight framework where clients jointly learn to fuse the representations generated by multiple fixed pre-trained models rather than training a large-scale model from scratch.
1 code implementation • 27 Oct 2022 • Yu Cao, Dianqi Li, Meng Fang, Tianyi Zhou, Jun Gao, Yibing Zhan, DaCheng Tao
We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers.
1 code implementation • 2 Nov 2022 • Kaiwen Yang, Yanchao Sun, Jiahao Su, Fengxiang He, Xinmei Tian, Furong Huang, Tianyi Zhou, DaCheng Tao
In experiments, we show that our method consistently brings non-trivial improvements to the three aforementioned learning tasks in terms of both efficiency and final performance, whether or not combined with strong pre-defined augmentations, e.g., on medical images where domain knowledge is unavailable and existing augmentation techniques perform poorly.
1 code implementation • 16 Jan 2023 • Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Chengqi Zhang, Bo Yang
Moreover, we provide visualizations and in-depth analysis of the personalization techniques in item embedding, which offer novel insights into the design of recommender systems in federated settings.
1 code implementation • 22 Jan 2023 • Zhiwei Li, Guodong Long, Tianyi Zhou
To address these challenges, we propose Federated Recommendation with Additive Personalization (FedRAP), which learns a global view of items via FL and a personalized view locally on each user.
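A toy sketch of the additive-personalization idea (a shared global item embedding plus a user-local additive term); the class name and the omission of the paper's sparsity/regularization terms are simplifications:

```python
import numpy as np

class AdditiveItemEmbedding:
    """Each user's effective item embedding is a shared global table (learned
    via FL) plus a user-specific local table kept on-device."""
    def __init__(self, num_items, dim, rng=np.random.default_rng(0)):
        self.global_emb = rng.standard_normal((num_items, dim)) * 0.01  # shared via FL
        self.local_emb = np.zeros((num_items, dim))                     # personalized, local

    def item_vectors(self):
        return self.global_emb + self.local_emb  # global view + personalized view

user = AdditiveItemEmbedding(num_items=100, dim=16)
print(user.item_vectors().shape)  # (100, 16)
```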
no code implementations • 27 Jan 2023 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
To address these challenges, we create a small model for a new task from the pruned models of similar tasks.
no code implementations • 14 Mar 2023 • Jiuhai Chen, Lichang Chen, Chen Zhu, Tianyi Zhou
Moreover, ICL (with and w/o CoT) using only one correct demo significantly outperforms all-demo ICL adopted by most previous works, indicating the weakness of LLMs in finding correct demo(s) for input queries, which is difficult to evaluate on the biased datasets.
3 code implementations • 15 Mar 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
Aerial Diffusion leverages a pretrained text-image diffusion model for prior knowledge.
no code implementations • 28 Mar 2023 • Zhihang Li, Zhao Song, Tianyi Zhou
In this paper, we make use of the input sparsity and propose an algorithm that uses $\log ( \|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega} )$ time per iteration to solve the problem.
no code implementations • 6 Apr 2023 • Jiuhai Chen, Lichang Chen, Heng Huang, Tianyi Zhou
However, it is not clear whether CoT is still effective on more recent instruction finetuned (IFT) LLMs such as ChatGPT.
no code implementations • 9 Apr 2023 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we study which modules in neural networks are more prone to forgetting by investigating their training dynamics during CL.
no code implementations • 26 Apr 2023 • Shuai Li, Zhao Song, Yu Xia, Tong Yu, Tianyi Zhou
Large language models (LLMs) are known for their exceptional performance in natural language processing, making them highly effective in many human life-related or even job-related tasks.
no code implementations • 27 Apr 2023 • Tao Shen, Guodong Long, Xiubo Geng, Chongyang Tao, Tianyi Zhou, Daxin Jiang
In this work, we propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios.
no code implementations • 13 May 2023 • Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Bo Yang
Federated Recommendation is a new service architecture providing recommendations without sharing user data with the server.
no code implementations • 22 May 2023 • Chunxu Zhang, Guodong Long, Tianyi Zhou, Zijian Zhang, Peng Yan, Bo Yang
However, this separation of the recommendation model and users' private data poses a challenge in providing quality service, particularly when it comes to new items, namely cold-start recommendations in federated settings.
no code implementations • 24 May 2023 • Davit Soselia, Khalid Saifullah, Tianyi Zhou
We evaluate the UI-to-Code performance using a combination of automated metrics such as MSE, BLEU, IoU, and a novel htmlBLEU score.
no code implementations • 25 May 2023 • Jiangtao Kong, Zhenyu Zong, Tianyi Zhou, Huajie Shao
In this paper, we propose YONO, in which You Only Need to replay One condensed prototype per class, and which for the first time can even outperform memory-costly exemplar-replay methods.
1 code implementation • 29 May 2023 • Yijun Yang, Tianyi Zhou, Jing Jiang, Guodong Long, Yuhui Shi
We address it by "Continual Task Allocation via Sparse Prompting (CoTASP)", which learns over-complete dictionaries to produce sparse masks as prompts extracting a sub-network for each task from a meta-policy network.
no code implementations • 4 Jun 2023 • Ritwik Sinha, Zhao Song, Tianyi Zhou
A model trained on these losses balances the trade-off between the creativity and reality of the model.
1 code implementation • 5 Jun 2023 • Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou
Large language models~(LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden.
no code implementations • 9 Jun 2023 • Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Amrit Singh Bedi, Dinesh Manocha
Trajectory length stands as a crucial hyperparameter within reinforcement learning (RL) algorithms, significantly contributing to the sample inefficiency in robotics applications.
1 code implementation • 16 Jun 2023 • Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao
We propose "Structured Cooperative Learning (SCooL)", in which a cooperation graph across devices is generated by a graphical model prior to automatically coordinate mutual learning between devices.
no code implementations • 19 Jun 2023 • Linxin Song, Jieyu Zhang, Xiaotian Lu, Tianyi Zhou
Instead of tuning the coefficient for each query round, which is sensitive and time-consuming, we propose the curriculum Firth bias reduction (CHAIN) that can automatically adjust the coefficient to be adaptive to the training process.
1 code implementation • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
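A minimal sketch of an H$_2$O-style eviction rule, keeping the most recent tokens plus the tokens with the largest accumulated attention ("heavy hitters"); the function and budget names are illustrative assumptions, not the released implementation:

```python
import numpy as np

def h2o_keep_indices(acc_attention, num_recent, num_heavy):
    """Return indices of KV entries to retain: a recent window plus the
    heavy-hitter tokens with the largest accumulated attention mass."""
    n = len(acc_attention)
    recent = set(range(max(0, n - num_recent), n))
    older = [i for i in np.argsort(acc_attention)[::-1] if i not in recent]
    heavy = set(older[:num_heavy])
    return sorted(recent | heavy)

scores = np.random.default_rng(0).random(32)   # accumulated attention per cached token
print(h2o_keep_indices(scores, num_recent=8, num_heavy=8))  # 16 KV entries kept
```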
1 code implementation • ICCV 2023 • Chengkai Hou, Jieyu Zhang, Haonan Wang, Tianyi Zhou
We overcome these drawbacks by a novel "subclass-balancing contrastive learning (SBCL)" approach that clusters each head class into multiple subclasses of sizes similar to the tail classes and enforces representations to capture the two-layer class hierarchy between the original classes and their subclasses.
no code implementations • 29 Jun 2023 • Qiang He, Tianyi Zhou, Meng Fang, Setareh Maghsudi
In ERC, we propose a regularizer that guides the approximation error towards the 1-eigensubspace, resulting in a more efficient and stable path of value approximation.
1 code implementation • 17 Jul 2023 • Soumik Mukhopadhyay, Matthew Gwilliam, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Srinidhi Hegde, Tianyi Zhou, Abhinav Shrivastava
We explore optimal methods for extracting and using these embeddings for classification tasks, demonstrating promising results on the ImageNet classification task.
3 code implementations • 17 Jul 2023 • Lichang Chen, Shiyang Li, Jun Yan, Hai Wang, Kalpa Gunaratna, Vikas Yadav, Zheng Tang, Vijay Srinivasan, Tianyi Zhou, Heng Huang, Hongxia Jin
Large language models (LLMs) strengthen instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data.
2 code implementations • 23 Aug 2023 • Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng, Jianzong Wang, Tianyi Zhou, Jing Xiao
In the realm of Large Language Models (LLMs), the balance between instruction data quality and quantity is a focal point.
no code implementations • 30 Aug 2023 • Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao
Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks.
1 code implementation • ICCV 2023 • Chengkai Hou, Jieyu Zhang, Tianyi Zhou
Unlike previous work, MADAug selects augmentation operators for each input image by a model-adaptive policy varying between training stages, producing a data augmentation curriculum optimized for better generalization.
no code implementations • 21 Sep 2023 • Shuang Ao, Tianyi Zhou, Guodong Long, Xuan Song, Jing Jiang
Throughout their long history, natural species have learned to survive by evolving physical structures adapted to environmental changes.
1 code implementation • 27 Sep 2023 • Linxin Song, Jieyu Zhang, Lechao Cheng, Pengyuan Zhou, Tianyi Zhou, Irene Li
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities of natural language processing (NLP).
1 code implementation • 15 Oct 2023 • Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao
Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly as the number of activated experts increases, limiting its practical utility.
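For background, a toy sketch of standard top-$k$ MoE routing (not this paper's method), illustrating why compute grows with the number of activated experts:

```python
import numpy as np

def topk_moe_forward(x, gate_w, experts, k=1):
    """Route the input through only the k highest-gated experts; each extra
    activated expert adds one more expert forward pass."""
    logits = x @ gate_w                              # one gate logit per expert
    topk = np.argsort(logits)[-k:]                   # indices of the k largest gates
    weights = np.exp(logits[topk]) / np.exp(logits[topk]).sum()  # renormalized softmax
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
experts = [(lambda W: (lambda x: x @ W))(rng.standard_normal((8, 8))) for _ in range(4)]
gate_w = rng.standard_normal((8, 4))
print(topk_moe_forward(rng.standard_normal(8), gate_w, experts, k=2).shape)  # (8,)
```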
no code implementations • 18 Oct 2023 • Yichuan Deng, Zhao Song, Tianyi Zhou
Large transformer models have achieved state-of-the-art results in numerous natural language processing tasks.
2 code implementations • 18 Oct 2023 • Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Heng Huang, Jiuxiang Gu, Tianyi Zhou
Recent advancements in Large Language Models (LLMs) have expanded the horizons of natural language understanding and generation.
5 code implementations • 23 Oct 2023 • Tianrui Guan, Fuxiao Liu, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou
Our comprehensive case studies within HallusionBench shed light on the challenges of hallucination and illusion in LVLMs.
Ranked #1 on Visual Question Answering (VQA) on HallusionBench
1 code implementation • 26 Oct 2023 • Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen
We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising LLM's quality or in-context learning ability.
no code implementations • 19 Nov 2023 • Lianke Qin, Saayan Mitra, Zhao Song, Yuanyuan Yang, Tianyi Zhou
In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem [prr89]: given two sets $A \subset \{-1,+1\}^d$ and $B \subset \{-1,+1\}^d$ with $|A|=|B| = n$, if there are exactly $k$ pairs whose inner product passes a certain threshold, i.e., $\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B$ such that $\forall i \in [k], \langle a_i, b_i \rangle \geq \rho \cdot d$ for a threshold $\rho \in (0, 1)$, the goal is to identify those $k$ heavy inner products.
2 code implementations • 27 Nov 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
It seamlessly blends the visual features from the input image within a pretrained text-to-2D-image stable diffusion model with a test-time optimization process for a careful bias-variance trade-off, which uses an Inverse Perspective Mapping (IPM) homography transformation to provide subtle cues for aerial-view synthesis.
1 code implementation • 28 Nov 2023 • Yijun Yang, Tianyi Zhou, Kanxue Li, Dapeng Tao, Lusong Li, Li Shen, Xiaodong He, Jing Jiang, Yuhui Shi
While large language models (LLMs) excel in a simulated world of texts, they struggle to interact with the more realistic world without perceptions of other modalities such as visual or audio signals.
1 code implementation • 29 Nov 2023 • Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Tianyi Zhou, Abhinav Shrivastava
We find that the intermediate feature maps of the U-Net are diverse, discriminative feature representations.
1 code implementation • 4 Dec 2023 • Kaiwen Yang, Tao Shen, Xinmei Tian, Xiubo Geng, Chongyang Tao, DaCheng Tao, Tianyi Zhou
QVix enables a wider exploration of visual scenes, improving the LVLMs' reasoning accuracy and depth in tasks such as visual question answering and visual entailment.
1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions.
no code implementations • 17 Jan 2024 • Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz, Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang, Qizhen Zhang, Tianyi Zhou, Danyang Zhuo
In this paper, we investigate the intersection of large generative AI models and cloud-native computing architectures.
1 code implementation • 1 Feb 2024 • Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou
Data filtering for instruction tuning has proved important in improving both the efficiency and performance of the tuning process.
no code implementations • 11 Feb 2024 • Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro
In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs.
no code implementations • 12 Feb 2024 • Jiuxiang Gu, Chenyang Li, YIngyu Liang, Zhenmei Shi, Zhao Song, Tianyi Zhou
Our research presents a thorough analytical characterization of the features learned by stylized one-hidden layer neural networks and one-layer Transformers in addressing this task.
2 code implementations • 15 Feb 2024 • Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Jiuxiang Gu, Tianyi Zhou
Instruction tuning is critical to large language models (LLMs) for achieving better instruction following and task adaptation capabilities but its success heavily relies on the training data quality.
1 code implementation • 16 Feb 2024 • Ming Li, Jiuhai Chen, Lichang Chen, Tianyi Zhou
To examine DEBATunE, we curate the largest dataset of debate topics so far, which covers 710 controversial topics and corresponding arguments for each topic.
1 code implementation • 20 Feb 2024 • Sen Li, Ruochen Wang, Cho-Jui Hsieh, Minhao Cheng, Tianyi Zhou
Moreover, MuLan adopts a vision-language model (VLM) to provide feedback on the image generated in each sub-task and to control the diffusion model to re-generate the image if it violates the original prompt.
1 code implementation • 20 Feb 2024 • Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, DaCheng Tao, Tianyi Zhou
In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral.
1 code implementation • 25 Feb 2024 • Xirui Li, Ruochen Wang, Minhao Cheng, Tianyi Zhou, Cho-Jui Hsieh
DrAttack includes three key components: (a) `Decomposition' of the original prompt into sub-prompts, (b) `Reconstruction' of these sub-prompts implicitly by in-context learning with semantically similar but harmless reassembling demo, and (c) a `Synonym Search' of sub-prompts, aiming to find sub-prompts' synonyms that maintain the original intent while jailbreaking LLMs.
no code implementations • 28 Feb 2024 • Yibin Lei, Di Wu, Tianyi Zhou, Tao Shen, Yu Cao, Chongyang Tao, Andrew Yates
In this work, we introduce a new unsupervised embedding method, Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), for generating high-quality sentence embeddings from Large Language Models (LLMs) without the need for model fine-tuning or task-specific engineering.
1 code implementation • 28 Feb 2024 • Yibin Lei, Yu Cao, Tianyi Zhou, Tao Shen, Andrew Yates
Recent studies demonstrate that query expansions generated by large language models (LLMs) can considerably enhance information retrieval systems by generating hypothetical documents that answer the queries as expansions.
no code implementations • 6 Mar 2024 • Ziyue Li, Tian Li, Virginia Smith, Jeff Bilmes, Tianyi Zhou
Optimizing the performance of many objectives (instantiated by tasks or clients) jointly with a few Pareto stationary solutions (models) is critical in machine learning.
no code implementations • 11 Mar 2024 • Bhavya Vasudeva, Deqing Fu, Tianyi Zhou, Elliott Kau, Youqi Huang, Vatsal Sharan
Transformers achieve state-of-the-art accuracy and robustness across many tasks, but an understanding of the inductive biases that they have and how those biases are different from other neural network architectures remains elusive.
1 code implementation • 19 Apr 2024 • Qiang He, Tianyi Zhou, Meng Fang, Setareh Maghsudi
We then leverage this upper bound to propose a novel regularizer, namely BEllman Equation-based automatic rank Regularizer (BEER).
no code implementations • ICML 2020 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
In this paper, we study the dynamics of neural net outputs in SSL and show that selecting and using first the unlabeled samples with more consistent outputs over the course of training (i.e., "time-consistency") can improve the final test accuracy and save computation.