1 code implementation • 12 Oct 2023 • Xiangyan Liu, Rongxue Li, Wei Ji, Tao Lin
The reasoning capabilities of Large Language Models (LLMs) are widely acknowledged in recent research, inspiring studies on tool learning and autonomous agents.
no code implementations • 9 Oct 2023 • Yongxin Guo, Xiaoying Tang, Tao Lin
To this end, this paper presents a comprehensive investigation into current clustered FL methods and proposes a four-tier framework, namely HCFL, to encompass and extend existing approaches.
no code implementations • 16 Jul 2023 • Haobo Song, Soumajit Majumder, Tao Lin
Implicit models such as Deep Equilibrium Models (DEQs) have garnered significant attention in the community for their ability to train infinite layer models with elegant solution-finding procedures and constant memory footprint.
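A minimal sketch of the DEQ idea under simple assumptions (plain fixed-point iteration with a one-step, Jacobian-free backward approximation; the layer `f` and the matching shapes of `z` and `x` are assumptions for illustration, not the paper's implementation):

```python
import torch

def deq_forward(f, x, iters=50, tol=1e-4):
    """Deep equilibrium forward pass: find the fixed point z* = f(z*, x)
    by fixed-point iteration, so the 'infinite-layer' output is computed
    with a constant memory footprint (no per-layer activations stored)."""
    z = torch.zeros_like(x)
    with torch.no_grad():                       # solver runs without autograd
        for _ in range(iters):
            z_next = f(z, x)
            if (z_next - z).norm() < tol:
                z = z_next
                break
            z = z_next
    return f(z, x)                              # one differentiable step at the fixed point
```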
1 code implementation • 6 Jun 2023 • Hao Zhao, Yuejiang Liu, Alexandre Alahi, Tao Lin
Test-Time Adaptation (TTA) has recently emerged as a promising approach for tackling the robustness challenge under distribution shifts.
no code implementations • 1 Jun 2023 • Jiamian Wang, Zongliang Wu, Yulun Zhang, Xin Yuan, Tao Lin, Zhiqiang Tao
In this work, we tackle this challenge by marrying prompt tuning with FL for snapshot compressive imaging for the first time and propose a federated hardware-prompt learning (FedHP) method.
no code implementations • 18 May 2023 • Yichen Zhu, Jian Yuan, Bo Jiang, Tao Lin, Haiming Jin, Xinbing Wang, Chenghu Zhou
We focus on the case where the underlying joint distribution of complete features and label is invariant, but the missing pattern, i.e., the mask distribution, may shift agnostically between training and testing.
1 code implementation • ICCV 2023 • Zexi Li, Xinyi Shang, Rui He, Tao Lin, Chao Wu
Recent advances in neural collapse have shown that the classifiers and feature prototypes under perfect training scenarios collapse into an optimal structure called simplex equiangular tight frame (ETF).
1 code implementation • 14 Feb 2023 • Zexi Li, Tao Lin, Xinyi Shang, Chao Wu
In federated learning (FL), weighted aggregation of local models is conducted to generate a global model, and the aggregation weights are normalized (they sum to 1) and proportional to the local data sizes.
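For context, a minimal FedAvg-style sketch of this standard aggregation rule (illustrative names; not the reweighting that the paper itself proposes):

```python
import torch

def aggregate(client_states, client_sizes):
    """FedAvg-style aggregation: weights are normalized (sum to 1)
    and proportional to the local data sizes."""
    total = sum(client_sizes)
    weights = [n / total for n in client_sizes]
    global_state = {}
    for key in client_states[0]:
        global_state[key] = sum(w * state[key]
                                for w, state in zip(weights, client_states))
    return global_state
```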
no code implementations • 7 Feb 2023 • YiLing Chen, Tao Lin
We show that, under natural assumptions, (1) the sender can find a signaling scheme that guarantees itself an expected utility almost as good as its optimal utility in the classic model, no matter what approximately best-responding strategy the receiver uses; (2) on the other hand, there is no signaling scheme that gives the sender much more utility than its optimal utility in the classic model, even if the receiver uses the approximately best-responding strategy that is best for the sender.
no code implementations • 29 Jan 2023 • Yongxin Guo, Xiaoying Tang, Tao Lin
In this paper, we identify the learning challenges posed by the simultaneous occurrence of diverse distribution shifts and propose a clustering principle to overcome these challenges.
no code implementations • 3 Jan 2023 • Yue Liu, Tao Lin, Anastasia Koloskova, Sebastian U. Stich
Gradient tracking (GT) is an algorithm designed for solving decentralized optimization problems over a network (such as training a machine learning model).
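A minimal NumPy sketch of a textbook gradient-tracking update (the mixing matrix `W` and the local gradient oracle are assumptions; this is the generic algorithm, not necessarily the exact variant analyzed in the paper):

```python
import numpy as np

def gradient_tracking_step(X, Y, grads_prev, local_grad, W, lr):
    """One gradient-tracking iteration over n agents.
    X[i]: model of agent i; Y[i]: its gradient tracker, initialized
    to local_grad(i, X[i]); W: doubly stochastic mixing matrix."""
    X_new = W @ X - lr * Y                      # gossip models, step along tracker
    grads_new = np.stack([local_grad(i, X_new[i]) for i in range(X.shape[0])])
    Y_new = W @ Y + grads_new - grads_prev      # track the network-average gradient
    return X_new, Y_new, grads_new
```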
no code implementations • 13 Dec 2022 • Tao Lin
While many classical notions of learnability (e.g., PAC learnability) are distribution-free, utilizing the specific structures of an input distribution may improve learning performance.
1 code implementation • 26 May 2022 • Yongxin Guo, Xiaoying Tang, Tao Lin
As a remedy, we propose FedBR, a novel unified algorithm that reduces the local learning bias on features and classifiers to tackle these challenges.
1 code implementation • 22 May 2022 • Liangze Jiang, Tao Lin
Personalized FL additionally adapts the global model to different clients, achieving promising results on consistent local training and test distributions.
no code implementations • 3 May 2022 • Daniel M. Ziegler, Seraphina Nix, Lawrence Chan, Tim Bauman, Peter Schmidt-Nielsen, Tao Lin, Adam Scherlis, Noa Nabeshima, Ben Weinstein-Raun, Daniel de Haas, Buck Shlegeris, Nate Thomas
We found that adversarial training increased robustness to the adversarial attacks that we trained on -- doubling the time for our contractors to find adversarial examples both with our tool (from 13 to 26 minutes) and without (from 20 to 44 minutes) -- without affecting in-distribution performance.
1 code implementation • 15 Feb 2022 • Jie Su, Zhenyu Wen, Tao Lin, Yu Guan
To address this issue, in this work we propose a Behaviour Pattern Disentanglement (BPD) framework, which disentangles behaviour patterns from irrelevant noise such as personal styles or environmental noise.
no code implementations • NeurIPS 2021 • Anastasia Koloskova, Tao Lin, Sebastian U. Stich
We consider decentralized machine learning over a network where the training data is distributed across $n$ agents, each of which can compute stochastic model updates on their local data.
no code implementations • 25 Dec 2021 • Yongxin Guo, Tao Lin, Xiaoying Tang
Federated Learning (FL) is a learning paradigm that protects privacy by keeping client data on edge devices.
no code implementations • 21 Nov 2021 • Jian Peng, Xian Sun, Min Deng, Chao Tao, Bo Tang, Wenbo Li, Guohua Wu, Qing Zhu, Yu Liu, Tao Lin, Haifeng Li
This paper presents a learning model with an active forgetting mechanism for artificial neural networks.
1 code implementation • 8 Oct 2021 • Xiaotie Deng, Xinyan Hu, Tao Lin, Weiqiang Zheng
Specifically, the results depend on the number of bidders with the highest value: if this number is at least three, the bidding dynamics almost surely converge to a Nash equilibrium of the auction, both in time-average and in last-iterate.
1 code implementation • NeurIPS 2021 • Thijs Vogels, Lie He, Anastasia Koloskova, Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi
A key challenge, primarily in decentralized deep learning, remains the handling of differences between the workers' local data distributions.
no code implementations • 5 Oct 2021 • Yichen Zhu, Bo Jiang, Haiming Jin, Mengtian Zhang, Feng Gao, Jianqiang Huang, Tao Lin, Xinbing Wang
An important task in such applications is to predict the future values of a NETS based on its historical values and the underlying graph.
no code implementations • 28 Aug 2021 • Fei Mi, Tao Lin, Boi Faltings
In this paper, we consider scenarios that require learning new classes or data distributions quickly and incrementally over time, as it often occurs in real-world dynamic environments.
no code implementations • 18 Aug 2021 • Haoran Peng, He Huang, Li Xu, Tianjiao Li, Jun Liu, Hossein Rahmani, Qiuhong Ke, Zhicheng Guo, Cong Wu, Rongchang Li, Mang Ye, Jiahao Wang, Jiaxu Zhang, Yuanzhong Liu, Tao He, Fuwei Zhang, Xianbin Liu, Tao Lin
In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop in conjunction with ICCV 2021.
no code implementations • 2 Jul 2021 • Manon Revel, Tao Lin, Daniel Halpern
We analyze the optimal size of a congress in a representative democracy.
no code implementations • 12 Apr 2021 • Tao Lin
In addition, this paper presents research on a data retrieval solution to prevent hacking by adversaries in the field of adversarial machine learning.
no code implementations • 9 Feb 2021 • Lingjing Kong, Tao Lin, Anastasia Koloskova, Martin Jaggi, Sebastian U. Stich
Decentralized training of deep learning models enables on-device learning over networks, as well as efficient scaling to large compute clusters.
1 code implementation • 9 Feb 2021 • Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi
In this paper, we investigate and identify the limitation of several decentralized optimization algorithms for different degrees of data heterogeneity.
no code implementations • 1 Jan 2021 • Tao Lin, Lingjing Kong, Anastasia Koloskova, Martin Jaggi, Sebastian U Stich
Decentralized training of deep learning models enables on-device learning over networks, as well as efficient scaling to large compute clusters.
no code implementations • 22 Nov 2020 • Jiawei Zhu, Chao Tao, Hanhan Deng, Ling Zhao, Pu Wang, Tao Lin, Haifeng Li
Traffic forecasting is a fundamental and challenging task in the field of intelligent transportation.
no code implementations • NeurIPS 2020 • Xiaotie Deng, Ron Lavi, Tao Lin, Qi Qi, Wenwei Wang, Xiang Yan
Empirical Revenue Maximization (ERM) is one of the most important price learning algorithms in auction design: as the literature shows, it can learn approximately optimal reserve prices for revenue-maximizing auctioneers in both repeated auctions and uniform-price auctions.
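For context, a minimal sketch of ERM for a single posted reserve price learned from i.i.d. value samples (a textbook formulation for illustration, not the specific auction settings studied in the paper):

```python
def erm_reserve(samples):
    """Pick the reserve price maximizing the empirical revenue
    r * (#samples with value >= r) / n over candidate reserves;
    it suffices to consider the sample values as candidates."""
    n = len(samples)
    candidates = sorted(set(samples))
    def emp_revenue(r):
        return r * sum(v >= r for v in samples) / n
    return max(candidates, key=emp_revenue)

# Example: erm_reserve([0.2, 0.5, 0.9, 1.0]) returns the sample value
# whose induced posted price maximizes empirical revenue.
```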
2 code implementations • 10 Sep 2020 • Kevin Fauvel, Tao Lin, Véronique Masson, Élisa Fromont, Alexandre Termier
Then, we illustrate how XCM reconciles performance and explainability on a synthetic dataset and show that XCM enables a more precise identification of the regions of the input data that are important for predictions, compared to the current deep learning MTS classifier that also provides faithful explainability.
no code implementations • NeurIPS 2020 • Hu Fu, Tao Lin
In non-truthful auctions, agents' utility for a strategy depends on the strategies of the opponents and also the prior distribution over their private types; the set of Bayes Nash equilibria generally has an intricate dependence on the prior.
1 code implementation • NeurIPS 2020 • Chen Liu, Mathieu Salzmann, Tao Lin, Ryota Tomioka, Sabine Süsstrunk
We analyze the influence of adversarial training on the loss landscape of machine learning models.
no code implementations • ICLR 2020 • Tao Lin, Sebastian U. Stich, Luis Barba, Daniil Dmitriev, Martin Jaggi
Deep neural networks often have millions of parameters.
1 code implementation • NeurIPS 2020 • Tao Lin, Lingjing Kong, Sebastian U. Stich, Martin Jaggi
In most current training schemes, the central model is refined by averaging the parameters of the server model and the updated parameters from the client side.
no code implementations • ICML 2020 • Tao Lin, Lingjing Kong, Sebastian U. Stich, Martin Jaggi
Deep learning networks are typically trained by Stochastic Gradient Descent (SGD) methods that iteratively improve the model parameters by estimating a gradient on a very small fraction of the training data.
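A minimal PyTorch-style sketch of that training loop (the model, loss_fn, and data_loader are assumed for illustration):

```python
import torch

def train_sgd(model, loss_fn, data_loader, lr=0.1, epochs=1):
    """Mini-batch SGD: each step estimates the gradient on a small
    fraction of the training data and updates the parameters."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()   # gradient from one mini-batch
            opt.step()                        # parameter update
    return model
```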
no code implementations • EMNLP 2020 • Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning.
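A minimal sketch of the general idea of learning binary masks over frozen pretrained weights with a straight-through estimator (illustrative only; not the paper's exact implementation):

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Keeps the pretrained weight frozen and learns a real-valued score
    per weight; the forward pass applies a hard 0/1 mask, with gradients
    flowing back to the scores via a straight-through estimator."""
    def __init__(self, pretrained_weight, threshold=0.0):
        super().__init__()
        self.weight = nn.Parameter(pretrained_weight, requires_grad=False)
        self.scores = nn.Parameter(torch.zeros_like(pretrained_weight))
        self.threshold = threshold

    def forward(self, x):
        hard_mask = (self.scores > self.threshold).float()
        soft = torch.sigmoid(self.scores)
        mask = hard_mask + soft - soft.detach()   # hard value, soft gradient
        return x @ (self.weight * mask).t()
```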
no code implementations • 18 Jan 2020 • Yuhui Zhao, Ning Yang, Tao Lin, Philip S. Yu
First, existing works often assume an underlying information diffusion model, which is impractical in the real world due to the complexity of information diffusion.
1 code implementation • 19 Dec 2019 • Jian Peng, Bo Tang, Hao Jiang, Zhuo Li, Yinjie Lei, Tao Lin, Haifeng Li
It is due to two facts: first, as the model learns more tasks, the intersection of the low-error parameter subspaces for these tasks becomes smaller or may even cease to exist; second, when the model learns a new task, the cumulative error keeps increasing as the model tries to protect the parameter configuration of previous tasks from interference.
1 code implementation • ICLR 2020 • Anastasia Koloskova, Tao Lin, Sebastian U. Stich, Martin Jaggi
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks, as well as for efficient scaling to large compute clusters.
3 code implementations • 28 May 2019 • Tian Guo, Tao Lin, Nino Antulov-Fantulin
In this paper, we explore the structure of LSTM recurrent neural networks to learn variable-wise hidden states, with the aim to capture different dynamics in multi-variable time series and distinguish the contribution of variables to the prediction.
9 code implementations • 12 Nov 2018 • Ling Zhao, Yujiao Song, Chao Zhang, Yu Liu, Pu Wang, Tao Lin, Min Deng, Haifeng Li
However, traffic forecasting has always been considered an open scientific issue, owing to the constraints of the urban road network's topological structure and the law of dynamic change over time, namely spatial dependence and temporal dependence.
no code implementations • 27 Sep 2018 • Tian Guo, Tao Lin
In learning a predictive model over multivariate time series consisting of target and exogenous variables, the forecasting performance and interpretability of the model are both essential for deployment and uncovering knowledge behind the data.
2 code implementations • ICLR 2020 • Tao Lin, Sebastian U. Stich, Kumar Kshitij Patel, Martin Jaggi
Mini-batch stochastic gradient methods (SGD) are the state of the art for distributed training of deep neural networks.
no code implementations • 17 Jun 2018 • Tian Guo, Tao Lin
In this paper, we propose multi-variable LSTM capable of accurate forecasting and variable importance interpretation for time series with exogenous variables.
no code implementations • 14 Apr 2018 • Tian Guo, Tao Lin, Yao Lu
In this paper, we propose an interpretable LSTM recurrent neural network, i.e., a multi-variable LSTM, for time series with exogenous variables.
no code implementations • NeurIPS 2018 • Mario Drumond, Tao Lin, Martin Jaggi, Babak Falsafi
We identify block floating point (BFP) as a promising alternative representation since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic.
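A minimal NumPy sketch of block floating point quantization (a generic formulation for illustration; block size and mantissa width are assumptions):

```python
import numpy as np

def bfp_quantize(block, mantissa_bits=8):
    """Block floating point: all values in a block share one exponent,
    and each value is stored as a fixed-point mantissa."""
    max_abs = np.max(np.abs(block))
    if max_abs == 0:
        return np.zeros_like(block)
    shared_exp = np.floor(np.log2(max_abs))              # one exponent per block
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))    # fixed-point step size
    mantissas = np.clip(np.round(block / scale),
                        -(2 ** (mantissa_bits - 1)),
                        2 ** (mantissa_bits - 1) - 1)
    return mantissas * scale                             # dequantized values
```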
no code implementations • 8 Nov 2017 • Huiting Liu, Tao Lin, Hanfei Sun, Weijian Lin, Chih-Wei Chang, Teng Zhong, Alexander Rudnicky
RubyStar is a dialog system designed to create "human-like" conversation by combining different response generation strategies.