Search Results for author: Kenji Kawaguchi

Found 47 papers, 14 papers with code

Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

1 code implementation1 Apr 2022 Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Kenji Kawaguchi, Ankit Vani, Aaron Courville

Specifically, we show that the temperature $\tau$ of the Softmax operation controls for the SEM representation's expressivity, allowing us to derive a tighter downstream classifier generalization bound than that for classifiers using unnormalized representations.

Classification Self-Supervised Learning

EIGNN: Efficient Infinite-Depth Graph Neural Networks

1 code implementation NeurIPS 2021 Juncheng Liu, Kenji Kawaguchi, Bryan Hooi, Yiwei Wang, Xiaokui Xiao

Motivated by this limitation, we propose a GNN model with infinite depth, which we call Efficient Infinite-Depth Graph Neural Networks (EIGNN), to efficiently capture very long-range dependencies.

Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization

no code implementations2 Feb 2022 Dianbo Liu, Alex Lamb, Xu Ji, Pascal Notsawo, Mike Mozer, Yoshua Bengio, Kenji Kawaguchi

Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit.

Quantization reinforcement-learning +1

Multi-Task Learning as a Bargaining Game

1 code implementation2 Feb 2022 Aviv Navon, Aviv Shamsian, Idan Achituve, Haggai Maron, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya

In this paper, we propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.

Multi-Task Learning

ExpertNet: A Symbiosis of Classification and Clustering

no code implementations17 Jan 2022 Shivin Srivastava, Kenji Kawaguchi, Vaibhav Rajan

We theoretically analyze the effect of clustering on its generalization gap, and empirically show that clustered latent representations from ExpertNet lead to disentangling the intrinsic structure and improvement in classification performance.

Classification Representation Learning

Training Free Graph Neural Networks for Graph Matching

1 code implementation14 Jan 2022 Zhiyuan Liu, Yixin Cao, Fuli Feng, Xiang Wang, Jie Tang, Kenji Kawaguchi, Tat-Seng Chua

We present a framework of Training Free Graph Matching (TFGM) to boost the performance of Graph Neural Networks (GNNs) based graph matching, providing a fast promising solution without training (training-free).

Entity Alignment Graph Matching +1

Noether Networks: Meta-Learning Useful Conserved Quantities

no code implementations NeurIPS 2021 Ferran Alet, Dylan Doblar, Allan Zhou, Joshua Tenenbaum, Kenji Kawaguchi, Chelsea Finn

Progress in machine learning (ML) stems from a combination of data availability, computational resources, and an appropriate encoding of inductive biases.

Meta-Learning Translation

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

no code implementations NeurIPS 2021 Clement Gehring, Kenji Kawaguchi, Jiaoyang Huang, Leslie Kaelbling

Estimating the per-state expected cumulative rewards is a critical aspect of reinforcement learning approaches, however the experience is obtained, but standard deep neural-network function-approximation methods are often inefficient in this setting.

Model-based Reinforcement Learning reinforcement-learning

Combined Scaling for Open-Vocabulary Image Classification

no code implementations19 Nov 2021 Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

Second, while increasing the dataset size and the model size has been the defacto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well-understood.

Ranked #2 on Zero-Shot Transfer Image Classification on ImageNet (using extra training data)

Classification Contrastive Learning +3

When Do Extended Physics-Informed Neural Networks (XPINNs) Improve Generalization?

no code implementations20 Sep 2021 Zheyuan Hu, Ameya D. Jagtap, George Em Karniadakis, Kenji Kawaguchi

Specifically, for general multi-layer PINNs and XPINNs, we first provide a prior generalization bound via the complexity of the target functions in the PDE problem, and a posterior generalization bound via the posterior matrix norms of the networks after optimization.

Meta-learning PINN loss functions

no code implementations12 Jul 2021 Apostolos F Psaros, Kenji Kawaguchi, George Em Karniadakis

In the computational examples, the meta-learned losses are employed at test time for addressing regression and PDE task distributions.


Discrete-Valued Neural Communication

no code implementations NeurIPS 2021 Dianbo Liu, Alex Lamb, Kenji Kawaguchi, Anirudh Goyal, Chen Sun, Michael Curtis Mozer, Yoshua Bengio

Deep learning has advanced from fully connected architectures to structured models organized into components, e. g., the transformer composed of positional elements, modular architectures divided into slots, and graph neural nets made up of nodes.

Quantization Systematic Generalization

Understanding Dynamics of Nonlinear Representation Learning and Its Application

no code implementations28 Jun 2021 Kenji Kawaguchi, Linjun Zhang, Zhun Deng

Representation learning allows us to automatically discover suitable representations from raw sensory data.

Representation Learning

Adversarial Training Helps Transfer Learning via Better Representations

no code implementations NeurIPS 2021 Zhun Deng, Linjun Zhang, Kailas Vodrahalli, Kenji Kawaguchi, James Zou

Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.

Transfer Learning

MemStream: Memory-Based Streaming Anomaly Detection

1 code implementation7 Jun 2021 Siddharth Bhatia, Arjit Jain, Shivin Srivastava, Kenji Kawaguchi, Bryan Hooi

Given a stream of entries over time in a multi-dimensional data setting where concept drift is present, how can we detect anomalous activities?

Denoising Unsupervised Anomaly Detection

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions

2 code implementations20 May 2021 Ameya D. Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Em Karniadakis

We propose a new type of neural networks, Kronecker neural networks (KNNs), that form a general framework for neural networks with adaptive activation functions.

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

no code implementations10 May 2021 Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi

Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution.

A Recipe for Global Convergence Guarantee in Deep Neural Networks

no code implementations12 Apr 2021 Kenji Kawaguchi, Qingyun Sun

Existing global convergence guarantees of (stochastic) gradient descent do not apply to practical deep networks in the practical regime of deep learning beyond the neural tangent kernel (NTK) regime.

Dont Just Divide; Polarize and Conquer!

1 code implementation23 Feb 2021 Shivin Srivastava, Siddharth Bhatia, Lingxiao Huang, Lim Jun Heng, Kenji Kawaguchi, Vaibhav Rajan

In data containing heterogeneous subpopulations, classification performance benefits from incorporating the knowledge of cluster structure in the classifier.

Classification General Classification

On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers

no code implementations15 Feb 2021 Kenji Kawaguchi

In this paper, we analyze the gradient dynamics of deep equilibrium models with nonlinearity only on weight matrices and non-convex objective functions of weights for regression and classification.

When and How Mixup Improves Calibration

no code implementations11 Feb 2021 Linjun Zhang, Zhun Deng, Kenji Kawaguchi, James Zou

In addition, we study how Mixup improves calibration in semi-supervised learning.

Data Augmentation

Dynamics of Deep Equilibrium Linear Models

no code implementations ICLR 2021 Kenji Kawaguchi

A deep equilibrium linear model is implicitly defined through an equilibrium point of an infinite sequence of computation.

Towards Domain-Agnostic Contrastive Learning

no code implementations9 Nov 2020 Vikas Verma, Minh-Thang Luong, Kenji Kawaguchi, Hieu Pham, Quoc V. Le

Despite recent success, most contrastive self-supervised learning methods are domain-specific, relying heavily on data augmentation techniques that require knowledge about a particular domain, such as image cropping and rotation.

Contrastive Learning Data Augmentation +3

How Does Mixup Help With Robustness and Generalization?

no code implementations ICLR 2021 Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou

For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss.

Data Augmentation

GraphMix: Improved Training of GNNs for Semi-Supervised Learning

1 code implementation25 Sep 2019 Vikas Verma, Meng Qu, Kenji Kawaguchi, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang

We present GraphMix, a regularization method for Graph Neural Network based semi-supervised object classification, whereby we propose to train a fully-connected network jointly with the graph neural network via parameter sharing and interpolation-based regularization.

Generalization Bounds Graph Attention +1

Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes

no code implementations5 Aug 2019 Kenji Kawaguchi, Jiaoyang Huang

The theory developed in this paper only requires the practical degrees of over-parameterization unlike previous theories.

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

2 code implementations9 Jul 2019 Kenji Kawaguchi, Haihao Lu

The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss.

Stochastic Optimization

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning

no code implementations7 Apr 2019 Kenji Kawaguchi, Jiaoyang Huang, Leslie Pack Kaelbling

Furthermore, as special cases of our general results, this article improves or complements several state-of-the-art theoretical results on deep neural networks, deep residual networks, and overparameterized deep neural networks with a unified proof technique and novel geometric insights.

Representation Learning

Interpolation Consistency Training for Semi-Supervised Learning

4 code implementations9 Mar 2019 Vikas Verma, Kenji Kawaguchi, Alex Lamb, Juho Kannala, Yoshua Bengio, David Lopez-Paz

We introduce Interpolation Consistency Training (ICT), a simple and computation efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm.

General Classification Semi-Supervised Image Classification

Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit

no code implementations12 Jan 2019 Jascha Sohl-Dickstein, Kenji Kawaguchi

Recent work has noted that all bad local minima can be removed from neural network loss landscapes, by adding a single unit with a particular parameterization.

Elimination of All Bad Local Minima in Deep Learning

no code implementations2 Jan 2019 Kenji Kawaguchi, Leslie Pack Kaelbling

At every local minimum of any deep neural network with these added neurons, the set of parameters of the original neural network (without added neurons) is guaranteed to be a global minimum of the original neural network.

General Classification Multi-class Classification

Effect of Depth and Width on Local Minima in Deep Learning

no code implementations20 Nov 2018 Kenji Kawaguchi, Jiaoyang Huang, Leslie Pack Kaelbling

In this paper, we analyze the effects of depth and width on the quality of local minima, without strong over-parameterization and simplification assumptions in the literature.

Depth with Nonlinearity Creates No Bad Local Minima in ResNets

no code implementations21 Oct 2018 Kenji Kawaguchi, Yoshua Bengio

In this paper, we prove that depth with nonlinearity creates no bad local minima in a type of arbitrarily deep ResNets with arbitrary nonlinear activation functions, in the sense that the values of all local minima are no worse than the global minimum value of corresponding classical machine-learning models, and are guaranteed to further improve via residual representations.

Generalization in Machine Learning via Analytical Learning Theory

2 code implementations21 Feb 2018 Kenji Kawaguchi, Yoshua Bengio, Vikas Verma, Leslie Pack Kaelbling

This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions.

Learning Theory One-Shot Learning +1

Theory of Deep Learning III: explaining the non-overfitting puzzle

no code implementations30 Dec 2017 Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, Hrushikesh Mhaskar

In this note, we show that the dynamics associated to gradient descent minimization of nonlinear networks is topologically equivalent, near the asymptotically stable minima of the empirical error, to linear gradient system in a quadratic potential with a degenerate (for square loss) or almost degenerate (for logistic or crossentropy loss) Hessian.

General Classification

Generalization in Deep Learning

no code implementations16 Oct 2017 Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio

This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature.

Deep Semi-Random Features for Nonlinear Function Approximation

1 code implementation28 Feb 2017 Kenji Kawaguchi, Bo Xie, Vikas Verma, Le Song

For deep models, with no unrealistic assumptions, we prove universal approximation ability, a lower bound on approximation error, a partial optimization guarantee, and a generalization bound.

Depth Creates No Bad Local Minima

no code implementations27 Feb 2017 Haihao Lu, Kenji Kawaguchi

In deep learning, \textit{depth}, as well as \textit{nonlinearity}, create non-convex loss surfaces.

Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning

no code implementations19 Oct 2016 Qianli Liao, Kenji Kawaguchi, Tomaso Poggio

We systematically explored a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online learning and (2) recurrent learning.

online learning

Global Continuous Optimization with Error Bound and Fast Convergence

no code implementations17 Jul 2016 Kenji Kawaguchi, Yu Maruyama, Xiaoyu Zheng

This paper proposes a new global optimization algorithm, called Locally Oriented Global Optimization (LOGO), to aim for both fast convergence in practice and finite-time error bound in theory.

Deep Learning without Poor Local Minima

1 code implementation NeurIPS 2016 Kenji Kawaguchi

In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015.

Learning Theory

Bounded Optimal Exploration in MDP

no code implementations5 Apr 2016 Kenji Kawaguchi

Within the framework of probably approximately correct Markov decision processes (PAC-MDP), much theoretical work has focused on methods to attain near optimality after a relatively long period of learning and exploration.

Bayesian Optimization with Exponential Convergence

no code implementations NeurIPS 2015 Kenji Kawaguchi, Leslie Pack Kaelbling, Tomás Lozano-Pérez

This paper presents a Bayesian optimization method with exponential convergence without the need of auxiliary optimization and without the delta-cover sampling.

A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model

no code implementations13 Mar 2013 Kenji Kawaguchi, Mauricio Araya

This is a natural consequence of the fact that the proposed algorithm is greedier than the other algorithms.


Cannot find the paper you are looking for? You can Submit a new open access paper.