no code implementations • ICML 2020 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
In this paper, we study the dynamics of neural net outputs in SSL and show that selecting and using first the unlabeled samples with more consistent outputs over the course of training (i.e., "time-consistency") can improve the final test accuracy and save computation.
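The selection rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact criterion, and it assumes we have stored the model's softmax outputs on the unlabeled set over several epochs.

```python
import numpy as np

def select_time_consistent(prob_history, budget):
    """Rank unlabeled samples by how consistent their predicted class
    distributions have been across epochs (smaller change = more
    time-consistent) and return the `budget` most consistent indices."""
    deltas = np.diff(prob_history, axis=0)                  # (epochs-1, n, C)
    inconsistency = (deltas ** 2).sum(axis=2).mean(axis=0)  # (n,)
    return np.argsort(inconsistency)[:budget]

# Example: 5 epochs of outputs on 1000 unlabeled samples with 10 classes.
history = np.random.dirichlet(np.ones(10), size=(5, 1000))
chosen = select_time_consistent(history, budget=200)
```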
no code implementations • 15 Mar 2023 • Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
Aerial Diffusion leverages a pretrained text-image diffusion model for prior knowledge.
no code implementations • 14 Mar 2023 • Jiuhai Chen, Lichang Chen, Tianyi Zhou
Moreover, ICL (with and without CoT) using only one correct demo significantly outperforms the all-demo ICL adopted by most previous works, indicating the weakness of LLMs in finding correct demo(s) for input queries, which is difficult to evaluate on biased datasets.
no code implementations • 27 Jan 2023 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
To address these challenges, we create a small model for a new task from the pruned models of similar tasks.
no code implementations • 22 Jan 2023 • Zhiwei Li, Guodong Long, Tianyi Zhou
Moreover, a curriculum learning mechanism is applied to the additive personalization of item embeddings, gradually increasing the regularization weights to mitigate the performance degradation caused by large variances among client-specific item embeddings.
no code implementations • 16 Jan 2023 • Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Chengqi Zhang, Bo Yang
Moreover, we provide visualizations and an in-depth analysis of the personalization techniques in item embedding, which shed new light on the design of RecSys in federated settings.
1 code implementation • 2 Nov 2022 • Kaiwen Yang, Yanchao Sun, Jiahao Su, Fengxiang He, Xinmei Tian, Furong Huang, Tianyi Zhou, DaCheng Tao
In experiments, we show that our method consistently brings non-trivial improvements to the three aforementioned learning tasks in both efficiency and final performance, whether or not combined with strong pre-defined augmentations, e.g., on medical images where domain knowledge is unavailable and existing augmentation techniques perform poorly.
1 code implementation • 27 Oct 2022 • Yu Cao, Dianqi Li, Meng Fang, Tianyi Zhou, Jun Gao, Yibing Zhan, DaCheng Tao
We present Twin Answer Sentences Attack (TASA), an adversarial attack method for question answering (QA) models that produces fluent and grammatical adversarial contexts while maintaining gold answers.
no code implementations • 21 Sep 2022 • Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, Jing Jiang
To prevent these issues from hindering the deployment of FL systems, we propose a lightweight framework where clients jointly learn to fuse the representations generated by multiple fixed pre-trained models rather than training a large-scale model from scratch.
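A minimal PyTorch sketch of the client-side idea, under the assumption that all pre-trained backbones emit features of the same dimension; the class and parameter names below (`RepresentationFusion`, `feat_dim`) are illustrative, not the paper's API.

```python
import torch
import torch.nn as nn

class RepresentationFusion(nn.Module):
    """Client-side module that fuses features from several frozen,
    pre-trained backbones via learned weights, so only this small module
    (not a large model) needs to be trained and communicated."""

    def __init__(self, backbones, feat_dim, num_classes):
        super().__init__()
        self.backbones = nn.ModuleList(backbones)
        for b in self.backbones:            # keep pre-trained models fixed
            for p in b.parameters():
                p.requires_grad = False
        self.fusion_logits = nn.Parameter(torch.zeros(len(backbones)))
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        feats = torch.stack([b(x) for b in self.backbones], dim=0)  # (M, B, D)
        w = torch.softmax(self.fusion_logits, dim=0).view(-1, 1, 1)
        fused = (w * feats).sum(dim=0)                               # (B, D)
        return self.head(fused)
```

Only the fusion weights and the small head are trained and exchanged, which is what keeps the framework lightweight compared to training a large model from scratch.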
1 code implementation • Findings (NAACL) 2022 • Yibin Lei, Yu Cao, Dianqi Li, Tianyi Zhou, Meng Fang, Mykola Pechenizkiy
Generating high-quality textual adversarial examples is critical for investigating the pitfalls of natural language processing (NLP) models and further promoting their robustness.
no code implementations • 20 May 2022 • Zhuowei Wang, Tianyi Zhou, Guodong Long, Bo Han, Jing Jiang
Federated learning (FL) aims to train a global model on the server side while the training data are collected and kept on local devices.
no code implementations • ACL 2022 • Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou
Transformer-based models generally allocate the same amount of computation for each token in a given sequence.
1 code implementation • 2 Mar 2022 • Fengwen Chen, Guodong Long, Zonghan Wu, Tianyi Zhou, Jing Jiang
We propose a novel structured federated learning (SFL) framework to learn both the global and personalized models simultaneously using client-wise relation graphs and clients' private data.
no code implementations • 13 Feb 2022 • Jie Ma, Guodong Long, Tianyi Zhou, Jing Jiang, Chengqi Zhang
Knowledge sharing and model personalization are essential components to tackle the non-IID challenge in federated learning (FL).
no code implementations • CVPR 2022 • Shuangtong Li, Tianyi Zhou, Xinmei Tian, DaCheng Tao
Decentralized learning (DL) can exploit the images distributed over devices on a network topology to train a global model but is not designed to train personalized models for different tasks or optimize the topology.
no code implementations • NeurIPS 2021 • Shengjie Wang, Tianyi Zhou, Chandrashekhar Lavania, Jeff A. Bilmes
Robust submodular partitioning promotes the diversity of every block in the partition.
1 code implementation • NeurIPS 2021 • Shuang Ao, Tianyi Zhou, Guodong Long, Qinghua Lu, Liming Zhu, Jing Jiang
Next, a bottom-up traversal of the tree trains the RL agent from easier sub-tasks with denser rewards on bottom layers to harder ones on top layers, and collects its cost on each sub-task to train the planner in the next episode.
no code implementations • NeurIPS 2021 • Kaiwen Yang, Tianyi Zhou, Yonggang Zhang, Xinmei Tian, DaCheng Tao
In this paper, we propose "class-disentanglement" that trains a variational autoencoder $G(\cdot)$ to extract this class-dependent information as $x - G(x)$ via a trade-off between reconstructing $x$ by $G(x)$ and classifying $x$ by $D(x-G(x))$, where the former competes with the latter in decomposing $x$ so the latter retains only necessary information for classification in $x-G(x)$.
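The trade-off described above can be written directly as a two-term loss. The sketch below omits the VAE's KL regularizer and any weighting details from the paper, and `lam` is an assumed hyperparameter name.

```python
import torch
import torch.nn.functional as F

def class_disentanglement_loss(G, D, x, y, lam=1.0):
    """Two competing terms: the VAE G reconstructs x, while the classifier D
    must predict y from the residual x - G(x); `lam` trades them off."""
    x_hat = G(x)                         # class-independent content
    recon = F.mse_loss(x_hat, x)         # push G(x) toward x ...
    logits = D(x - x_hat)                # ... while the residual
    cls = F.cross_entropy(logits, y)     # must stay class-informative
    return recon + lam * cls
```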
no code implementations • 29 Sep 2021 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Liming Zhu, Chengqi Zhang
Can we find a better initialization for a new task, e.g., a much smaller network closer to the final pruned model, by exploiting its similar tasks?
no code implementations • ICLR 2022 • Ravikumar Balakrishnan, Tian Li, Tianyi Zhou, Nageen Himayat, Virginia Smith, Jeff Bilmes
In every communication round of federated learning, a random subset of clients communicate their model updates back to the server which then aggregates them all.
no code implementations • 29 Sep 2021 • Kaiwen Yang, Tianyi Zhou, Xinmei Tian, DaCheng Tao
We then adversarially perturb $G(x)$ in the VAE's bottleneck space and add it back to the original $R(x)$ as an augmentation, which is therefore sufficiently challenging for contrastive learning while preserving the sample identity.
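A hedged sketch of this augmentation, assuming an encoder/decoder pair for the VAE and a user-supplied scalar contrastive loss; the sign-gradient update, step sizes, and latent clipping are illustrative choices rather than the paper's exact procedure.

```python
import torch

def perturb_in_bottleneck(encoder, decoder, contrastive_loss, x,
                          eps=0.1, steps=1, lr=0.05):
    """Perturb the class-related part G(x) in the latent space to make the
    contrastive loss harder, then add it back to the untouched residual."""
    with torch.no_grad():
        z = encoder(x)
        g_x = decoder(z)
        residual = x - g_x                 # class-independent part, kept intact
    z_adv = z.clone().requires_grad_(True)
    for _ in range(steps):
        loss = contrastive_loss(decoder(z_adv) + residual, x)
        grad, = torch.autograd.grad(loss, z_adv)
        z_adv = (z_adv + lr * grad.sign()).detach().requires_grad_(True)
    # clip the latent perturbation so the sample identity is preserved
    z_adv = z + torch.clamp(z_adv.detach() - z, -eps, eps)
    with torch.no_grad():
        return decoder(z_adv) + residual
```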
no code implementations • 29 Sep 2021 • Shuang Ao, Tianyi Zhou, Jing Jiang, Guodong Long, Xuan Song, Chengqi Zhang
They are complementary in acquiring more informative feedback for RL: the planning policy provides dense rewards for finishing easier sub-tasks, while the environment policy modifies these sub-tasks to be adequately challenging and diverse so that the RL agent can quickly adapt to different tasks/environments.
no code implementations • ICLR 2022 • Yijun Yang, Jing Jiang, Tianyi Zhou, Jie Ma, Yuhui Shi
Model-based offline RL instead trains an environment model using a dataset of pre-collected experiences so online RL methods can learn in an offline manner by solely interacting with the model.
1 code implementation • Findings (EMNLP) 2021 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
Aspect-level sentiment classification (ALSC) aims at identifying the sentiment polarity of a specified aspect in a sentence.
Aspect-Based Sentiment Analysis (ABSA)
Representation Learning
1 code implementation • 19 Aug 2021 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
By comparison, a mixture of multiple global models can capture the heterogeneity across clients by assigning each client to one of several global models (i.e., centers) in FL.
1 code implementation • ICLR 2021 • Yuchen Jin, Tianyi Zhou, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy
This mutual-training process between BO and the loss-prediction model allows us to limit the training steps invested in the BO search.
2 code implementations • 1 May 2021 • Yue Tan, Guodong Long, Lu Liu, Tianyi Zhou, Qinghua Lu, Jing Jiang, Chengqi Zhang
Heterogeneity across clients in federated learning (FL) usually hinders the optimization convergence and generalization performance when the aggregation of clients' knowledge occurs in the gradient space.
no code implementations • ICLR 2021 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang
To resolve this problem, we propose Isometric Propagation Network (IPN), which learns to strengthen the relation between classes within each space and align the class dependency in the two spaces.
no code implementations • ICLR 2021 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
Neural net training can easily overfit to noisy labels and end up with poor generalization performance.
no code implementations • 1 Jan 2021 • Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we introduce an efficient method to extract local inference chains by optimizing a differentiable sparse scoring of the filters and layers so as to preserve the network's outputs on given data from a local region.
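One way to picture this optimization is the sketch below, which assumes a hypothetical `model_forward(data, gates=...)` that multiplies each layer's filters by the given gates; the sigmoid gates, L1 penalty, and output-matching loss are illustrative stand-ins for the paper's differentiable sparse scoring.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sparse_filter_scores(model_forward, filters_per_layer, local_data,
                         l1=1e-3, steps=200):
    """Learn a gate per filter such that the gated network reproduces the
    full network's outputs on data from the local region, while an L1
    penalty switches off filters that are not on the local inference chain."""
    scores = [nn.Parameter(torch.zeros(c)) for c in filters_per_layer]
    opt = torch.optim.Adam(scores, lr=1e-2)
    with torch.no_grad():
        target = model_forward(local_data, gates=None)   # unpruned outputs
    for _ in range(steps):
        gates = [torch.sigmoid(s) for s in scores]
        out = model_forward(local_data, gates=gates)     # gated forward pass
        loss = F.mse_loss(out, target) + l1 * sum(g.sum() for g in gates)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return [torch.sigmoid(s).detach() for s in scores]
```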
no code implementations • 1 Jan 2021 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Xuanyi Dong, Chengqi Zhang
Few-shot learning aims to train a classifier given only a few samples per class that are highly insufficient to describe the whole data distribution.
no code implementations • NeurIPS 2020 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
Compared to existing CL methods: (1) DIH is more stable over time than using only instantaneous hardness, which is noisy due to stochastic training and DNN's non-smoothness; (2) DIHCL is computationally inexpensive since it uses only a byproduct of back-propagation and thus does not require extra inference.
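A small NumPy sketch of the sampling idea: keep an exponential moving average of each example's training loss (already available as a byproduct of back-propagation) and draw harder examples more often. The EMA coefficient and the probability normalization are assumptions made for illustration.

```python
import numpy as np

class DIHSampler:
    """Maintain dynamic instance hardness as a running average of each
    example's instantaneous hardness and sample harder examples more often."""

    def __init__(self, n, gamma=0.9):
        self.dih = np.zeros(n)
        self.gamma = gamma

    def update(self, indices, losses):
        # the running average keeps the signal stable across noisy epochs
        self.dih[indices] = (self.gamma * self.dih[indices]
                             + (1 - self.gamma) * losses)

    def sample(self, k, rng=None):
        rng = rng or np.random.default_rng()
        p = self.dih + 1e-8
        return rng.choice(len(self.dih), size=k, replace=False, p=p / p.sum())
```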
2 code implementations • COLING 2020 • Yang Li, Tao Shen, Guodong Long, Jing Jiang, Tianyi Zhou, Chengqi Zhang
Then, facilitated by the proposed base model, we introduce collaborating relation features shared among relations in the hierarchies to promote the relation-augmenting process and balance the training data for long-tail relations.
no code implementations • 24 Sep 2020 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
To address this challenging task, most ZSL methods relate unseen test classes to seen (training) classes via a pre-defined set of attributes that can describe all classes in the same semantic space, so the knowledge learned on the training classes can be adapted to unseen classes.
1 code implementation • 28 Jun 2020 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
We study the many-class few-shot (MCFS) problem in both supervised learning and meta-learning settings.
3 code implementations • 3 May 2020 • Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang
However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users.
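A toy sketch of the multi-center aggregation step, treating each client's model as a flattened weight vector; the nearest-center assignment and per-cluster averaging below are a simplification of the paper's objective.

```python
import numpy as np

def multicenter_aggregate(client_weights, centers):
    """Assign each client's model (a flattened weight vector) to its nearest
    global model ("center") and update every center as the mean of the
    clients assigned to it."""
    W = np.stack(client_weights)                         # (n_clients, d)
    C = np.stack(centers)                                # (k, d)
    dists = ((W[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)                        # cluster id per client
    new_centers = []
    for j in range(len(centers)):
        members = W[assign == j]
        new_centers.append(members.mean(axis=0) if len(members) else C[j])
    return new_centers, assign
```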
1 code implementation • 30 Apr 2020 • Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
In experiments, we achieve state-of-the-art performance on three benchmarks and a zero-shot dataset for link prediction, notably with inference costs reduced by 1-2 orders of magnitude compared to a textual encoding method.
Ranked #4 on Link Prediction on UMLS
3 code implementations • ICLR 2022 • Wensi Tang, Guodong Long, Lu Liu, Tianyi Zhou, Michael Blumenstein, Jing Jiang
In particular, it is a set of kernel sizes that efficiently covers the best RF size across different datasets by consisting of multiple prime numbers chosen according to the length of the time series.
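A small sketch of how such a kernel-size set could be generated; the exact rule here (primes up to half the series length, plus sizes 1 and 2) is an assumption for illustration rather than the paper's precise recipe.

```python
def omni_scale_kernel_sizes(series_length):
    """Return a kernel-size set made of 1, 2, and all primes up to half the
    series length, so combinations of these sizes can compose receptive
    fields of almost any length."""
    limit = max(2, series_length // 2)
    sizes = [1, 2]
    for n in range(3, limit + 1, 2):
        if all(n % p for p in range(3, int(n ** 0.5) + 1, 2)):
            sizes.append(n)
    return sizes

print(omni_scale_kernel_sizes(100))  # [1, 2, 3, 5, 7, ..., 47]
```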
no code implementations • 18 Feb 2020 • Yujia Xie, Tianyi Zhou, Yi Mao, Weizhu Chen
Thereby, the contextual dependencies modeled by CSA will be highly relevant to the query.
no code implementations • 12 Feb 2020 • Chi Zhang, Yong Sheng Soh, Ling Feng, Tianyi Zhou, Qianxiao Li
While current machine learning models have impressive performance over a wide range of applications, their large size and complexity render them unsuitable for tasks such as remote monitoring on edge devices with limited storage and computational power.
1 code implementation • NeurIPS 2019 • Meng Fang, Tianyi Zhou, Yali Du, Lei Han, Zhengyou Zhang
This "Goal-and-Curiosity-driven Curriculum Learning" leads to "Curriculum-guided HER (CHER)", which adaptively and dynamically controls the exploration-exploitation trade-off during the learning process via hindsight experience selection.
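A hedged sketch of the selection step, scoring candidate hindsight goals by a weighted mix of proximity to the desired goal (exploitation) and diversity with respect to goals already selected (curiosity); the greedy scheme and the single weight `w`, annealed over training, are simplifications of CHER.

```python
import numpy as np

def select_hindsight_goals(achieved_goals, desired_goal, k, w):
    """Greedily pick k hindsight goals trading off proximity to the desired
    goal against diversity among the selected goals."""
    G = np.asarray(achieved_goals)
    proximity = -np.linalg.norm(G - desired_goal, axis=1)
    selected = [int(np.argmax(proximity))]
    while len(selected) < k:
        diversity = np.min(
            np.linalg.norm(G[:, None, :] - G[selected][None, :, :], axis=2),
            axis=1)
        scores = w * proximity + (1 - w) * diversity
        scores[selected] = -np.inf                # do not pick twice
        selected.append(int(np.argmax(scores)))
    return selected
```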
no code implementations • 27 Nov 2019 • Yang Li, Guodong Long, Tao Shen, Tianyi Zhou, Lina Yao, Huan Huo, Jing Jiang
Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision.
no code implementations • 25 Sep 2019 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
The advantages of DIHCL, compared to other curriculum learning approaches, are: (1) DIHCL does not require additional inference steps over the data not selected by DIHCL in each epoch, (2) the dynamic instance hardness, compared to static instance hardness (e.g., instantaneous loss), is more stable as it integrates information over the entire training history up to the present time.
1 code implementation • NeurIPS 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
It can significantly improve tasks that suffer from insufficient training data, e.g., few-shot learning.
2 code implementations • 10 May 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang
The resulting graph of prototypes can be continually re-used and updated for new tasks and classes.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In particular, we study how to attribute a DNN's bias to its input features.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In this paper, we discuss three novel observations about dropout to better understand the generalization of DNNs with rectified linear unit (ReLU) activations: 1) dropout is a smoothing technique that encourages each local linear model of a DNN to be trained on data points from nearby regions; 2) a constant dropout rate can result in effective neural-deactivation rates that are significantly different for layers with different fractions of activated neurons; and 3) the rescaling factor of dropout causes an inconsistency to occur between the normalization during training and testing conditions when batch normalization is also used.
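Observation (3) can be checked numerically: with inverted dropout the mean of an activation is preserved, but its second moment at training time differs from test time, so batch normalization sees mismatched statistics. The example below uses synthetic Gaussian activations.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(100_000) + 1.0        # positive-mean activations
p = 0.5                                       # dropout rate

# Inverted dropout at training time: zero out and rescale by 1/(1-p).
mask = rng.random(a.shape) >= p
a_train = a * mask / (1 - p)

# At test time dropout is disabled, so BatchNorm sees different statistics
# from the ones it tracked during training (the inconsistency noted above).
print("train mean/var:", a_train.mean(), a_train.var())   # ~1.0 / ~3.0
print("test  mean/var:", a.mean(), a.var())               # ~1.0 / ~1.0
```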
no code implementations • ICLR 2019 • Lu Liu, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
It addresses the "many-class" problem by exploring the class hierarchy, e.g., the coarse-class label that covers a subset of fine classes, which helps to narrow down the candidates for the fine class and is cheaper to obtain.
no code implementations • NeurIPS 2018 • Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
We study a new method ("Diverse Ensemble Evolution (DivE$^2$)") to train an ensemble of machine learning models that assigns data to models at each training epoch based on each model's current expertise and an intra- and inter-model diversity reward.
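A loose sketch of the assignment step only: each point prefers the model with the highest current expertise on it, discounted by a min-distance surrogate for intra-model diversity. The real DivE$^2$ objective is submodular and also includes an inter-model diversity term, so this is an illustration rather than the algorithm.

```python
import numpy as np

def dive2_assign(expertise, features, n_models, lam=0.5):
    """Greedily assign each point to the model that is currently best at it,
    while a min-distance bonus keeps each model's assigned points diverse.
    expertise: (n_models, n_points) scores; features: (n_points, d)."""
    assigned = {m: [] for m in range(n_models)}
    for i in np.argsort(-expertise.max(axis=0)):          # confident points first
        gains = []
        for m in range(n_models):
            if assigned[m]:
                d = np.linalg.norm(features[assigned[m]] - features[i],
                                   axis=1).min()
            else:
                d = features.std()                        # empty model: neutral bonus
            gains.append(expertise[m, i] + lam * d)
        assigned[int(np.argmax(gains))].append(int(i))
    return assigned
```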
2 code implementations • NAACL 2019 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
Neural networks equipped with self-attention have parallelizable computation, light-weight structure, and the ability to capture both long-range and local dependencies.
1 code implementation • ICLR 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang
In this paper, we propose a model, called "bi-directional block self-attention network (Bi-BloSAN)", for RNN/CNN-free sequence encoding.
1 code implementation • 31 Jan 2018 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang
In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other.
Ranked #56 on Natural Language Inference on SNLI
no code implementations • ICLR 2018 • Tianyi Zhou, Jeff Bilmes
We introduce and study minimax curriculum learning (MCL), a new method for adaptively selecting a sequence of training subsets for a succession of stages in machine learning.
1 code implementation • 14 Sep 2017 • Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, Chengqi Zhang
Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively.
Ranked #69 on Natural Language Inference on SNLI
no code implementations • 1 Jun 2016 • Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin
We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization.
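The core trick can be sketched as follows, assuming the ground set is a list of integer item ids and `gain(e, S)` is a user-supplied function returning the marginal gain of adding e to the current selection; the specific sampling schedule and the guarantees proved in the paper are not captured here.

```python
import numpy as np

def greedy_with_sparsification(gain, ground_set, k, keep_prob=0.2, rng=None):
    """Greedy submodular maximization that, at every step, evaluates marginal
    gains only on a random subset of the remaining candidates instead of the
    whole ground set, reducing the cost of each greedy iteration."""
    rng = rng or np.random.default_rng()
    S = []
    remaining = list(ground_set)
    for _ in range(k):
        m = max(1, int(keep_prob * len(remaining)))
        candidates = rng.choice(remaining, size=m, replace=False)
        best = max(candidates, key=lambda e: gain(e, S))
        S.append(int(best))
        remaining.remove(best)
    return S
```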
no code implementations • 1 Jun 2016 • Tianyi Zhou, Jeff Bilmes
We propose a streaming submodular maximization algorithm "stream clipper" that performs as well as the offline greedy algorithm on document/video summarization in practice.
no code implementations • NeurIPS 2014 • Tianyi Zhou, Jeff Bilmes, Carlos Guestrin
We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the $k$ extremal rays spanning the conical hull of a data point set.
no code implementations • 2 Sep 2013 • Tianyi Zhou, DaCheng Tao
Learning from big data via matrix decomposition typically suffers from expensive computation and from complicated structures mixed with noise.