Search Results for author: Tianyi Zhou

Found 109 papers, 53 papers with code

Omni-Scale CNNs: a simple and effective kernel size configuration for time series classification

3 code implementations ICLR 2022 Wensi Tang, Guodong Long, Lu Liu, Tianyi Zhou, Michael Blumenstein, Jing Jiang

In particular, it is a set of kernel sizes that can efficiently cover the best receptive field (RF) size across different datasets, since it consists of multiple prime numbers chosen according to the length of the time series.

General Classification Time Series +2
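The prime-number idea is easy to prototype. Below is a minimal sketch, assuming (for illustration only, not the paper's exact rule) that the kernel-size set is {1, 2} plus all primes up to roughly half the series length, so stacked layers can compose a wide range of receptive-field sizes:

```python
def prime_kernel_sizes(series_length: int) -> list[int]:
    """Illustrative Omni-Scale-style kernel size set: 1, 2, and all primes up
    to a bound tied to the series length (the bound here is a hypothetical
    choice for demonstration, not the paper's exact configuration)."""
    bound = max(2, series_length // 2)
    sizes = [1, 2]
    for n in range(3, bound + 1, 2):
        # Trial division over odd candidates is enough for these small bounds.
        if all(n % p for p in range(3, int(n ** 0.5) + 1, 2)):
            sizes.append(n)
    return sizes

print(prime_kernel_sizes(40))  # [1, 2, 3, 5, 7, 11, 13, 17, 19]
```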

FedProto: Federated Prototype Learning across Heterogeneous Clients

5 code implementations 1 May 2021 Yue Tan, Guodong Long, Lu Liu, Tianyi Zhou, Qinghua Lu, Jing Jiang, Chengqi Zhang

Heterogeneity across clients in federated learning (FL) usually hinders the optimization convergence and generalization performance when the aggregation of clients' knowledge occurs in the gradient space.

Federated Learning
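As a rough illustration of prototype-space aggregation (a sketch of the general idea, not FedProto's released code), each client can send per-class mean embeddings instead of gradients, and the server averages prototypes of the same class across clients:

```python
import numpy as np

def local_prototypes(features, labels):
    # Per-client prototypes: the mean embedding of each locally observed class.
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def aggregate_prototypes(client_protos):
    # Server step: average prototypes of the same class across clients.
    pooled = {}
    for protos in client_protos:
        for c, p in protos.items():
            pooled.setdefault(c, []).append(p)
    return {c: np.mean(ps, axis=0) for c, ps in pooled.items()}

# Two hypothetical clients with 8-dim embeddings over classes {0, 1}.
rng = np.random.default_rng(0)
clients = [local_prototypes(rng.normal(size=(20, 8)), rng.integers(0, 2, 20))
           for _ in range(2)]
print(aggregate_prototypes(clients)[0].shape)  # (8,)
```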

Federated Learning from Pre-Trained Models: A Contrastive Learning Approach

2 code implementations 21 Sep 2022 Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, Jing Jiang

To prevent these issues from hindering the deployment of FL systems, we propose a lightweight framework where clients jointly learn to fuse the representations generated by multiple fixed pre-trained models rather than training a large-scale model from scratch.

Contrastive Learning Federated Learning

Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together

2 code implementations NAACL 2019 Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

Neural networks equipped with self-attention have parallelizable computation, light-weight structure, and the ability to capture both long-range and local dependencies.

Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling

1 code implementation 31 Jan 2018 Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang

In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other.

Hard Attention Natural Language Inference +1

DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding

3 code implementations 14 Sep 2017 Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, Chengqi Zhang

Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively.

Natural Language Inference Sentence +2

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

1 code implementation 24 Jun 2023 Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
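The eviction rule is simple to sketch. The following is a toy illustration of the recent-plus-heavy-hitter idea (not the released H$_2$O implementation): keep the last few tokens plus the older tokens with the largest accumulated attention scores, and drop the rest of the KV cache:

```python
import numpy as np

def keep_indices(accum_attention, recent_window, heavy_budget):
    """Toy recent + heavy-hitter cache policy. `accum_attention[i]` is the
    attention mass token i has received so far (a hypothetical input)."""
    n = len(accum_attention)
    recent = set(range(max(0, n - recent_window), n))
    # Rank the remaining (older) tokens by accumulated attention, keep the top-k.
    older = sorted((i for i in range(n) if i not in recent),
                   key=lambda i: accum_attention[i], reverse=True)
    return sorted(recent | set(older[:heavy_budget]))

scores = np.array([0.9, 0.1, 0.05, 0.7, 0.2, 0.3, 0.1, 0.05, 0.2, 0.1])
print(keep_indices(scores, recent_window=4, heavy_budget=2))  # [0, 3, 6, 7, 8, 9]
```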

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

1 code implementation 26 Oct 2023 Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen

We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising LLM's quality or in-context learning ability.

In-Context Learning

InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models

1 code implementation 5 Jun 2023 Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou

Large language models (LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden.

Bayesian Optimization

AlpaGasus: Training A Better Alpaca with Fewer Data

3 code implementations 17 Jul 2023 Lichang Chen, Shiyang Li, Jun Yan, Hai Wang, Kalpa Gunaratna, Vikas Yadav, Zheng Tang, Vijay Srinivasan, Tianyi Zhou, Heng Huang, Hongxia Jin

Large language models (LLMs) strengthen instruction-following capability through instruction-finetuning (IFT) on supervised instruction/response data.

Instruction Following

A Survey on Knowledge Distillation of Large Language Models

1 code implementation 20 Feb 2024 Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, DaCheng Tao, Tianyi Zhou

In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral.

Data Augmentation Knowledge Distillation +1

Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling

1 code implementation ICLR 2018 Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

In this paper, we propose a model, called "bi-directional block self-attention network (Bi-BloSAN)", for RNN/CNN-free sequence encoding.

Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion

1 code implementation 30 Apr 2020 Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang

In experiments, we achieve state-of-the-art performance on three benchmarks and a zero-shot dataset for link prediction, with highlights of inference costs reduced by 1-2 orders of magnitude compared to a textual encoding method.

Graph Embedding Link Prediction +1

Curriculum-guided Hindsight Experience Replay

1 code implementation NeurIPS 2019 Meng Fang, Tianyi Zhou, Yali Du, Lei Han, Zhengyou Zhang

This "Goal-and-Curiosity-driven Curriculum Learning" leads to "Curriculum-guided HER (CHER)", which adaptively and dynamically controls the exploration-exploitation trade-off during the learning process via hindsight experience selection.

Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning

2 code implementations 18 Oct 2023 Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Heng Huang, Jiuxiang Gu, Tianyi Zhou

Recent advancements in Large Language Models (LLMs) have expanded the horizons of natural language understanding and generation.

Natural Language Understanding

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning

2 code implementations 15 Feb 2024 Ming Li, Lichang Chen, Jiuhai Chen, Shwai He, Jiuxiang Gu, Tianyi Zhou

Instruction tuning is critical to large language models (LLMs) for achieving better instruction following and task adaptation capabilities but its success heavily relies on the training data quality.

Data Augmentation Instruction Following

Personalized Federated Learning With Graph

1 code implementation 2 Mar 2022 Fengwen Chen, Guodong Long, Zonghan Wu, Tianyi Zhou, Jing Jiang

We propose a novel structured federated learning (SFL) framework to learn both the global and personalized models simultaneously using client-wise relation graphs and clients' private data.

Personalized Federated Learning Relation

Multi-Center Federated Learning: Clients Clustering for Better Personalization

3 code implementations 3 May 2020 Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang

However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users.

Clustering Federated Learning

MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion

1 code implementation 20 Feb 2024 Sen Li, Ruochen Wang, Cho-Jui Hsieh, Minhao Cheng, Tianyi Zhou

Moreover, MuLan adopts a vision-language model (VLM) to provide feedback to the image generated in each sub-task and control the diffusion model to re-generate the image if it violates the original prompt.

Attribute Language Modelling +2

Multi-Center Federated Learning: Clients Clustering for Better Personalization

1 code implementation 19 Aug 2021 Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang

By comparison, a mixture of multiple global models could capture the heterogeneity across various clients if assigning the client to different global models (i.e., centers) in FL.

Clustering Decision Making +1

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning

1 code implementation 1 Feb 2024 Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou

Data filtering for instruction tuning has proved important in improving both the efficiency and performance of the tuning process.

Language Modelling

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts

1 code implementation 15 Oct 2023 Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly if increasing the number of activated experts, limiting its practical utility.

Computational Efficiency

When to Learn What: Model-Adaptive Data Augmentation Curriculum

1 code implementation ICCV 2023 Chengkai Hou, Jieyu Zhang, Tianyi Zhou

Unlike previous work, MADAug selects augmentation operators for each input image by a model-adaptive policy varying between training stages, producing a data augmentation curriculum optimized for better generalization.

Data Augmentation Fairness +1

Dual Personalization on Federated Recommendation

1 code implementation 16 Jan 2023 Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Chengqi Zhang, Bo Yang

Moreover, we provide visualizations and in-depth analysis of the personalization techniques in item embedding, which shed novel insights on the design of recommender systems in federated settings.

Privacy Preserving Recommendation Systems

Many-Class Few-Shot Learning on Multi-Granularity Class Hierarchy

1 code implementation 28 Jun 2020 Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

We study the many-class few-shot (MCFS) problem in both supervised learning and meta-learning settings.

Few-Shot Learning

On the Convergence of Clustered Federated Learning

1 code implementation 13 Feb 2022 Jie Ma, Guodong Long, Tianyi Zhou, Jing Jiang, Chengqi Zhang

Knowledge sharing and model personalization are essential components to tackle the non-IID challenge in federated learning (FL).

Federated Learning

False Correlation Reduction for Offline Reinforcement Learning

1 code implementation 24 Oct 2021 Zhihong Deng, Zuyue Fu, Lingxiao Wang, Zhuoran Yang, Chenjia Bai, Tianyi Zhou, Zhaoran Wang, Jing Jiang

Offline reinforcement learning (RL) harnesses the power of massive datasets for resolving sequential decision problems.

D4RL Decision Making +3

Subclass-balancing Contrastive Learning for Long-tailed Recognition

1 code implementation ICCV 2023 Chengkai Hou, Jieyu Zhang, Haonan Wang, Tianyi Zhou

We overcome these drawbacks by a novel "subclass-balancing contrastive learning (SBCL)" approach that clusters each head class into multiple subclasses of sizes similar to the tail classes and enforces representations to capture the two-layer class hierarchy between the original classes and their subclasses.

Contrastive Learning Representation Learning
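To make the subclass-splitting step concrete, here is a small sketch (one plausible realization under stated assumptions, not the authors' code): cluster each head class's features with k-means, choosing k so that every subclass is roughly the size of a tail class:

```python
import numpy as np
from sklearn.cluster import KMeans

def split_head_class(features, tail_class_size):
    # Number of subclasses chosen so each is roughly as large as a tail class.
    k = max(1, round(len(features) / tail_class_size))
    return KMeans(n_clusters=k, n_init=10).fit_predict(features)

head_feats = np.random.default_rng(1).normal(size=(300, 16))
subclass_ids = split_head_class(head_feats, tail_class_size=50)
print(len(np.unique(subclass_ids)))  # ~6 subclasses of ~50 samples each
```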

Diffusion Models Beat GANs on Image Classification

1 code implementation 17 Jul 2023 Soumik Mukhopadhyay, Matthew Gwilliam, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Srinidhi Hegde, Tianyi Zhou, Abhinav Shrivastava

We explore optimal methods for extracting and using these embeddings for classification tasks, demonstrating promising results on the ImageNet classification task.

Classification Denoising +5

Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld

1 code implementation 28 Nov 2023 Yijun Yang, Tianyi Zhou, Kanxue Li, Dapeng Tao, Lusong Li, Li Shen, Xiaodong He, Jing Jiang, Yuhui Shi

While large language models (LLMs) excel in a simulated world of texts, they struggle to interact with the more realistic world without perceptions of other modalities such as visual or audio signals.

Imitation Learning

Good Questions Help Zero-Shot Image Reasoning

1 code implementation 4 Dec 2023 Kaiwen Yang, Tao Shen, Xinmei Tian, Xiubo Geng, Chongyang Tao, DaCheng Tao, Tianyi Zhou

QVix enables a wider exploration of visual scenes, improving the LVLMs' reasoning accuracy and depth in tasks such as visual question answering and visual entailment.

Fine-Grained Image Classification Question Answering +2

Phrase-level Textual Adversarial Attack with Label Preservation

1 code implementation Findings (NAACL) 2022 Yibin Lei, Yu Cao, Dianqi Li, Tianyi Zhou, Meng Fang, Mykola Pechenizkiy

Generating high-quality textual adversarial examples is critical for investigating the pitfalls of natural language processing (NLP) models and further promoting their robustness.

Adversarial Attack Sentence

NLPBench: Evaluating Large Language Models on Solving NLP Problems

1 code implementation 27 Sep 2023 Linxin Song, Jieyu Zhang, Lechao Cheng, Pengyuan Zhou, Tianyi Zhou, Irene Li

Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities of natural language processing (NLP).

Benchmarking Math

Federated Recommendation with Additive Personalization

1 code implementation 22 Jan 2023 Zhiwei Li, Guodong Long, Tianyi Zhou

To address these challenges, we propose Federated Recommendation with Additive Personalization (FedRAP), which learns a global view of items via FL and a personalized view locally on each user.

Federated Learning Recommendation Systems

Continual Task Allocation in Meta-Policy Network via Sparse Prompting

1 code implementation 29 May 2023 Yijun Yang, Tianyi Zhou, Jing Jiang, Guodong Long, Yuhui Shi

We address it by "Continual Task Allocation via Sparse Prompting (CoTASP)", which learns over-complete dictionaries to produce sparse masks as prompts extracting a sub-network for each task from a meta-policy network.

Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements

1 code implementation 16 Feb 2024 Ming Li, Jiuhai Chen, Lichang Chen, Tianyi Zhou

To examine DEBATunE, we curate the largest dataset of debate topics so far, which covers 710 controversial topics and corresponding arguments for each topic.

DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers

1 code implementation 25 Feb 2024 Xirui Li, Ruochen Wang, Minhao Cheng, Tianyi Zhou, Cho-Jui Hsieh

DrAttack includes three key components: (a) 'Decomposition' of the original prompt into sub-prompts, (b) 'Reconstruction' of these sub-prompts implicitly by in-context learning with a semantically similar but harmless reassembling demo, and (c) a 'Synonym Search' of sub-prompts, aiming to find sub-prompts' synonyms that maintain the original intent while jailbreaking LLMs.

In-Context Learning

Improving Long-Tail Relation Extraction with Collaborating Relation-Augmented Attention

2 code implementations COLING 2020 Yang Li, Tao Shen, Guodong Long, Jing Jiang, Tianyi Zhou, Chengqi Zhang

Then, facilitated by the proposed base model, we introduce collaborating relation features shared among relations in the hierarchies to promote the relation-augmenting process and balance the training data for long-tail relations.

Relation Relation Extraction +1

Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach

1 code implementation 2 Nov 2022 Kaiwen Yang, Yanchao Sun, Jiahao Su, Fengxiang He, Xinmei Tian, Furong Huang, Tianyi Zhou, DaCheng Tao

In experiments, we show that our method consistently brings non-trivial improvements to the three aforementioned learning tasks in both efficiency and final performance, whether or not combined with strong pre-defined augmentations, e.g., on medical images where domain knowledge is unavailable and existing augmentation techniques perform poorly.

Data Augmentation Representation Learning

CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum

1 code implementation NeurIPS 2021 Shuang Ao, Tianyi Zhou, Guodong Long, Qinghua Lu, Liming Zhu, Jing Jiang

Next, a bottom-up traversal of the tree trains the RL agent from easier sub-tasks with denser rewards on bottom layers to harder ones on top layers, and collects its cost on each sub-task to train the planner in the next episode.

Continuous Control reinforcement-learning +1

TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack

1 code implementation 27 Oct 2022 Yu Cao, Dianqi Li, Meng Fang, Tianyi Zhou, Jun Gao, Yibing Zhan, DaCheng Tao

We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers.

Adversarial Attack Question Answering +1

Structured Cooperative Learning with Graphical Model Priors

1 code implementation 16 Jun 2023 Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao

We propose "Structured Cooperative Learning (SCooL)", in which a cooperation graph across devices is generated by a graphical model prior to automatically coordinate mutual learning between devices.

Stochastic Block Model Variational Inference

HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View

2 code implementations 27 Nov 2023 Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha

It seamlessly blends the visual features from the input image within a pretrained text-to-2D-image stable diffusion model with a test-time optimization process for a careful bias-variance trade-off, which uses an Inverse Perspective Mapping (IPM) homography transformation to provide subtle cues for aerial-view synthesis.

Novel View Synthesis

Stream Clipper: Scalable Submodular Maximization on Stream

no code implementations 1 Jun 2016 Tianyi Zhou, Jeff Bilmes

We propose a streaming submodular maximization algorithm "stream clipper" that performs as well as the offline greedy algorithm on document/video summarization in practice.

Video Summarization

Scaling Submodular Maximization via Pruned Submodularity Graphs

no code implementations 1 Jun 2016 Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin

We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization.

Video Summarization

Divide-and-Conquer Learning by Anchoring a Conical Hull

no code implementations NeurIPS 2014 Tianyi Zhou, Jeff Bilmes, Carlos Guestrin

We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the $k$ extremal rays spanning the conical hull of a data point set.

Clustering

Unmixing Incoherent Structures of Big Data by Randomized or Greedy Decomposition

no code implementations 2 Sep 2013 Tianyi Zhou, DaCheng Tao

Learning big data by matrix decomposition always suffers from expensive computation, mixing of complicated structures and noise.

Diverse Ensemble Evolution: Curriculum Data-Model Marriage

no code implementations NeurIPS 2018 Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes

We study a new method ("Diverse Ensemble Evolution (DivE$^2$)") to train an ensemble of machine learning models that assigns data to models at each training epoch based on each model's current expertise and an intra- and inter-model diversity reward.

MahiNet: A Neural Network for Many-Class Few-Shot Learning with Class Hierarchy

no code implementations ICLR 2019 Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

It addresses the "many-class" problem by exploring the class hierarchy, e.g., the coarse-class label that covers a subset of fine classes, which helps to narrow down the candidates for the fine class and is cheaper to obtain.

Few-Shot Learning General Classification

Jumpout: Improved Dropout for Deep Neural Networks with Rectified Linear Units

no code implementations ICLR 2019 Shengjie Wang, Tianyi Zhou, Jeff Bilmes

In this paper, we discuss three novel observations about dropout to better understand the generalization of DNNs with rectified linear unit (ReLU) activations: 1) dropout is a smoothing technique that encourages each local linear model of a DNN to be trained on data points from nearby regions; 2) a constant dropout rate can result in effective neural-deactivation rates that are significantly different for layers with different fractions of activated neurons; and 3) the rescaling factor of dropout causes an inconsistency to occur between the normalization during training and testing conditions when batch normalization is also used.

Minimax Curriculum Learning: Machine Teaching with Desirable Difficulties and Scheduled Diversity

no code implementations ICLR 2018 Tianyi Zhou, Jeff Bilmes

We introduce and study minimax curriculum learning (MCL), a new method for adaptively selecting a sequence of training subsets for a succession of stages in machine learning.

Clustering

Self-Attention Enhanced Selective Gate with Entity-Aware Embedding for Distantly Supervised Relation Extraction

no code implementations 27 Nov 2019 Yang Li, Guodong Long, Tao Shen, Tianyi Zhou, Lina Yao, Huan Huo, Jing Jiang

Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision.

Entity Embeddings Relation +3

Collaborative Inference for Efficient Remote Monitoring

no code implementations 12 Feb 2020 Chi Zhang, Yong Sheng Soh, Ling Feng, Tianyi Zhou, Qianxiao Li

While current machine learning models have impressive performance over a wide range of applications, their large size and complexity render them unsuitable for tasks such as remote monitoring on edge devices with limited storage and computational power.

Collaborative Inference

Conditional Self-Attention for Query-based Summarization

no code implementations 18 Feb 2020 Yujia Xie, Tianyi Zhou, Yi Mao, Weizhu Chen

Thereby, the contextual dependencies modeled by CSA will be highly relevant to the query.

Time-Consistent Self-Supervision for Semi-Supervised Learning

no code implementations ICML 2020 Tianyi Zhou, Shengjie Wang, Jeff Bilmes

In this paper, we study the dynamics of neural net outputs in SSL and show that selecting and using first the unlabeled samples with more consistent outputs over the course of training (i.e., "time-consistency") can improve the final test accuracy and save computation.

Attribute Propagation Network for Graph Zero-shot Learning

no code implementations 24 Sep 2020 Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

To address this challenging task, most ZSL methods relate unseen test classes to seen (training) classes via a pre-defined set of attributes that can describe all classes in the same semantic space, so the knowledge learned on the training classes can be adapted to unseen classes.

Attribute Meta-Learning +1

Robust Curriculum Learning: from clean label detection to noisy label self-correction

no code implementations ICLR 2021 Tianyi Zhou, Shengjie Wang, Jeff Bilmes

Neural net training can easily overfit to noisy labels and end up with poor generalization performance.

Extract Local Inference Chains of Deep Neural Nets

no code implementations 1 Jan 2021 Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

In this paper, we introduce an efficient method to extract the local inference chains by optimizing a differentiable sparse scoring for the filters and layers to preserve the outputs on given data from a local region.

Interpretable Machine Learning Network Pruning

MASP: Model-Agnostic Sample Propagation for Few-shot learning

no code implementations 1 Jan 2021 Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang

Few-shot learning aims to train a classifier given only a few samples per class that are highly insufficient to describe the whole data distribution.

Few-Shot Learning

Curriculum Learning by Dynamic Instance Hardness

no code implementations NeurIPS 2020 Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes

Compared to existing CL methods: (1) DIH is more stable over time than using only instantaneous hardness, which is noisy due to stochastic training and DNN's non-smoothness; (2) DIHCL is computationally inexpensive since it uses only a byproduct of back-propagation and thus does not require extra inference.

Isometric Propagation Network for Generalized Zero-shot Learning

no code implementations ICLR 2021 Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang

To resolve this problem, we propose Isometric Propagation Network (IPN), which learns to strengthen the relation between classes within each space and align the class dependency in the two spaces.

Generalized Zero-Shot Learning

Vote for Nearest Neighbors Meta-Pruning of Self-Supervised Networks

no code implementations 29 Sep 2021 Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Liming Zhu, Chengqi Zhang

Can we find a better initialization for a new task, e.g., a much smaller network closer to the final pruned model, by exploiting its similar tasks?

Pareto Policy Pool for Model-based Offline Reinforcement Learning

no code implementations ICLR 2022 Yijun Yang, Jing Jiang, Tianyi Zhou, Jie Ma, Yuhui Shi

Model-based offline RL instead trains an environment model using a dataset of pre-collected experiences so online RL methods can learn in an offline manner by solely interacting with the model.

D4RL Offline RL +2

EAT-C: Environment-Adversarial sub-Task Curriculum for Efficient Reinforcement Learning

no code implementations 29 Sep 2021 Shuang Ao, Tianyi Zhou, Jing Jiang, Guodong Long, Xuan Song, Chengqi Zhang

They are complementary in acquiring more informative feedback for RL: the planning policy provides dense reward of finishing easier sub-tasks while the environment policy modifies these sub-tasks to be adequately challenging and diverse so the RL agent can quickly adapt to different tasks/environments.

reinforcement-learning Reinforcement Learning (RL)

Identity-Disentangled Adversarial Augmentation for Self-supervised Learning

no code implementations 29 Sep 2021 Kaiwen Yang, Tianyi Zhou, Xinmei Tian, DaCheng Tao

We then adversarially perturb $G(x)$ in the VAE's bottleneck space and add it back to the original $R(x)$ as an augmentation, which is therefore sufficiently challenging for contrastive learning and meanwhile preserves the sample identity intact.

Contrastive Learning Data Augmentation +1

Diverse Client Selection for Federated Learning via Submodular Maximization

no code implementations ICLR 2022 Ravikumar Balakrishnan, Tian Li, Tianyi Zhou, Nageen Himayat, Virginia Smith, Jeff Bilmes

In every communication round of federated learning, a random subset of clients communicate their model updates back to the server which then aggregates them all.

Fairness Federated Learning

Class-Disentanglement and Applications in Adversarial Detection and Defense

no code implementations NeurIPS 2021 Kaiwen Yang, Tianyi Zhou, Yonggang Zhang, Xinmei Tian, DaCheng Tao

In this paper, we propose "class-disentanglement" that trains a variational autoencoder $G(\cdot)$ to extract this class-dependent information as $x - G(x)$ via a trade-off between reconstructing $x$ by $G(x)$ and classifying $x$ by $D(x-G(x))$, where the former competes with the latter in decomposing $x$ so the latter retains only necessary information for classification in $x-G(x)$.

Adversarial Defense Disentanglement

Dynamic Instance Hardness

no code implementations 25 Sep 2019 Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes

The advantages of DIHCL, compared to other curriculum learning approaches, are: (1) DIHCL does not require additional inference steps over the data not selected by DIHCL in each epoch, (2) the dynamic instance hardness, compared to static instance hardness (e.g., instantaneous loss), is more stable as it integrates information over the entire training history up to the present time.

FedNoiL: A Simple Two-Level Sampling Method for Federated Learning with Noisy Labels

no code implementations 20 May 2022 Zhuowei Wang, Tianyi Zhou, Guodong Long, Bo Han, Jing Jiang

Federated learning (FL) aims at training a global model on the server side while the training data are collected and located at the local devices.

Federated Learning Learning with noisy labels

Learning To Collaborate in Decentralized Learning of Personalized Models

no code implementations CVPR 2022 Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao

Decentralized learning (DL) can exploit the images distributed over devices on a network topology to train a global model but is not designed to train personalized models for different tasks or optimize the topology.

Federated Learning Image Classification

Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks

no code implementations 27 Jan 2023 Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

To address these challenges, we create a small model for a new task from the pruned models of similar tasks.

How Many Demonstrations Do You Need for In-context Learning?

no code implementations 14 Mar 2023 Jiuhai Chen, Lichang Chen, Chen Zhu, Tianyi Zhou

Moreover, ICL (with and w/o CoT) using only one correct demo significantly outperforms all-demo ICL adopted by most previous works, indicating the weakness of LLMs in finding correct demo(s) for input queries, which is difficult to evaluate on the biased datasets.

In-Context Learning

Solving Regularized Exp, Cosh and Sinh Regression Problems

no code implementations 28 Mar 2023 Zhihang Li, Zhao Song, Tianyi Zhou

In this paper, we make use of the input sparsity and propose an algorithm that uses $\log ( \|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega} )$ time per iteration to solve the problem.

regression

When do you need Chain-of-Thought Prompting for ChatGPT?

no code implementations 6 Apr 2023 Jiuhai Chen, Lichang Chen, Heng Huang, Tianyi Zhou

However, it is not clear whether CoT is still effective on more recent instruction finetuned (IFT) LLMs such as ChatGPT.

Arithmetic Reasoning Memorization

Does Continual Learning Equally Forget All Parameters?

no code implementations 9 Apr 2023 Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

In this paper, we study which modules in neural networks are more prone to forgetting by investigating their training dynamics during CL.

Attribute Continual Learning

The Closeness of In-Context Learning and Weight Shifting for Softmax Regression

no code implementations 26 Apr 2023 Shuai Li, Zhao Song, Yu Xia, Tong Yu, Tianyi Zhou

Large language models (LLMs) are known for their exceptional performance in natural language processing, making them highly effective in many human life-related or even job-related tasks.

In-Context Learning regression

Large Language Models are Strong Zero-Shot Retriever

no code implementations 27 Apr 2023 Tao Shen, Guodong Long, Xiubo Geng, Chongyang Tao, Tianyi Zhou, Daxin Jiang

In this work, we propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios.

Language Modelling Large Language Model +1

Graph-guided Personalization for Federated Recommendation

no code implementations 13 May 2023 Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Bo Yang

Federated Recommendation is a new service architecture providing recommendations without sharing user data with the server.

When Federated Recommendation Meets Cold-Start Problem: Separating Item Attributes and User Interactions

no code implementations 22 May 2023 Chunxu Zhang, Guodong Long, Tianyi Zhou, Zijian Zhang, Peng Yan, Bo Yang

However, this separation of the recommendation model and users' private data poses a challenge in providing quality service, particularly when it comes to new items, namely cold-start recommendations in federated settings.

Attribute Federated Learning +1

Spatial-temporal Prompt Learning for Federated Weather Forecasting

no code implementations 23 May 2023 Shengchao Chen, Guodong Long, Tao Shen, Tianyi Zhou, Jing Jiang

Federated weather forecasting is a promising collaborative learning framework for analyzing meteorological data across participants from different countries and regions, thus embodying a global-scale real-time weather data predictive analytics platform to tackle climate change.

Time Series Weather Forecasting

Learning UI-to-Code Reverse Generator Using Visual Critic Without Rendering

no code implementations 24 May 2023 Davit Soselia, Khalid Saifullah, Tianyi Zhou

We evaluate the UI-to-Code performance using a combination of automated metrics such as MSE, BLEU, IoU, and a novel htmlBLEU score.

Code Generation reinforcement-learning

Condensed Prototype Replay for Class Incremental Learning

no code implementations 25 May 2023 Jiangtao Kong, Zhenyu Zong, Tianyi Zhou, Huajie Shao

In this paper, we propose YONO, in which You Only Need to replay One condensed prototype per class, and which for the first time can even outperform memory-costly exemplar-replay methods.

Class Incremental Learning Incremental Learning

Ada-NAV: Adaptive Trajectory Length-Based Sample Efficient Policy Learning for Robotic Navigation

no code implementations 9 Jun 2023 Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Amrit Singh Bedi, Dinesh Manocha

Trajectory length stands as a crucial hyperparameter within reinforcement learning (RL) algorithms, significantly contributing to the sample inefficiency in robotics applications.

Policy Gradient Methods reinforcement-learning +1

Taming Small-sample Bias in Low-budget Active Learning

no code implementations 19 Jun 2023 Linxin Song, Jieyu Zhang, Xiaotian Lu, Tianyi Zhou

Instead of tuning the coefficient for each query round, which is sensitive and time-consuming, we propose the curriculum Firth bias reduction (CHAIN) that can automatically adjust the coefficient to be adaptive to the training process.

Active Learning

Eigensubspace of Temporal-Difference Dynamics and How It Improves Value Approximation in Reinforcement Learning

no code implementations 29 Jun 2023 Qiang He, Tianyi Zhou, Meng Fang, Setareh Maghsudi

In ERC, we propose a regularizer that guides the approximation error toward the 1-eigensubspace, resulting in a more efficient and stable path of value approximation.

Reinforcement Learning (RL)

MerA: Merging Pretrained Adapters For Few-Shot Learning

no code implementations 30 Aug 2023 Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks.

Few-Shot Learning MRPC

Curriculum Reinforcement Learning via Morphology-Environment Co-Evolution

no code implementations 21 Sep 2023 Shuang Ao, Tianyi Zhou, Guodong Long, Xuan Song, Jing Jiang

Throughout their long history, natural species have learned to survive by evolving physical structures that adapt to environmental changes.

reinforcement-learning Reinforcement Learning (RL)

Superiority of Softmax: Unveiling the Performance Edge Over Linear Attention

no code implementations 18 Oct 2023 Yichuan Deng, Zhao Song, Tianyi Zhou

Large transformer models have achieved state-of-the-art results in numerous natural language processing tasks.

Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

no code implementations 19 Nov 2023 Lianke Qin, Saayan Mitra, Zhao Song, Yuanyuan Yang, Tianyi Zhou

In this paper, we consider a heavy inner product identification problem, which generalizes the Light Bulb problem: Given two sets $A \subset \{-1,+1\}^d$ and $B \subset \{-1,+1\}^d$ with $|A|=|B| = n$, if there are exactly $k$ pairs whose inner product passes a certain threshold, i.e., $\{(a_1, b_1), \cdots, (a_k, b_k)\} \subset A \times B$ such that $\forall i \in [k], \langle a_i, b_i \rangle \geq \rho \cdot d$ for a threshold $\rho \in (0, 1)$, the goal is to identify those $k$ heavy inner products.

ODIN: Disentangled Reward Mitigates Hacking in RLHF

no code implementations 11 Feb 2024 Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro

In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs.

Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic

no code implementations 12 Feb 2024 Jiuxiang Gu, Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Tianyi Zhou

Our research presents a thorough analytical characterization of the features learned by stylized one-hidden layer neural networks and one-layer Transformers in addressing this task.

Mathematical Reasoning

Meta-Task Prompting Elicits Embedding from Large Language Models

no code implementations 28 Feb 2024 Yibin Lei, Di Wu, Tianyi Zhou, Tao Shen, Yu Cao, Chongyang Tao, Andrew Yates

In this work, we introduce a new unsupervised embedding method, Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), for generating high-quality sentence embeddings from Large Language Models (LLMs) without the need for model fine-tuning or task-specific engineering.

Semantic Textual Similarity Sentence +2

Corpus-Steered Query Expansion with Large Language Models

1 code implementation 28 Feb 2024 Yibin Lei, Yu Cao, Tianyi Zhou, Tao Shen, Andrew Yates

Recent studies demonstrate that query expansions generated by large language models (LLMs) can considerably enhance information retrieval systems by generating hypothetical documents that answer the queries as expansions.

Information Retrieval Retrieval

Many-Objective Multi-Solution Transport

no code implementations 6 Mar 2024 Ziyue Li, Tian Li, Virginia Smith, Jeff Bilmes, Tianyi Zhou

Optimizing the performance of many objectives (instantiated by tasks or clients) jointly with a few Pareto stationary solutions (models) is critical in machine learning.

Federated Learning Multi-Task Learning

Simplicity Bias of Transformers to Learn Low Sensitivity Functions

no code implementations 11 Mar 2024 Bhavya Vasudeva, Deqing Fu, Tianyi Zhou, Elliott Kau, Youqi Huang, Vatsal Sharan

Transformers achieve state-of-the-art accuracy and robustness across many tasks, but an understanding of the inductive biases that they have and how those biases are different from other neural network architectures remains elusive.
