Search Results for author: Kenji Kawaguchi

Found 101 papers, 38 papers with code

Bounded Optimal Exploration in MDP

no code implementations 5 Apr 2016 Kenji Kawaguchi

Within the framework of probably approximately correct Markov decision processes (PAC-MDP), much theoretical work has focused on methods to attain near optimality after a relatively long period of learning and exploration.

Bayesian Optimization with Exponential Convergence

no code implementations NeurIPS 2015 Kenji Kawaguchi, Leslie Pack Kaelbling, Tomás Lozano-Pérez

This paper presents a Bayesian optimization method with exponential convergence that requires neither auxiliary optimization nor delta-cover sampling.

Bayesian Optimization

Deep Learning without Poor Local Minima

1 code implementation NeurIPS 2016 Kenji Kawaguchi

In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015.

Learning Theory

Global Continuous Optimization with Error Bound and Fast Convergence

no code implementations 17 Jul 2016 Kenji Kawaguchi, Yu Maruyama, Xiaoyu Zheng

This paper proposes a new global optimization algorithm, called Locally Oriented Global Optimization (LOGO), that aims for both fast convergence in practice and a finite-time error bound in theory.

Management

Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning

no code implementations 19 Oct 2016 Qianli Liao, Kenji Kawaguchi, Tomaso Poggio

We systematically explore a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online learning and (2) recurrent learning.
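
The core trick that makes a normalizer usable online can be sketched as replacing per-batch statistics with running estimates updated one sample at a time. The following is a minimal illustration under that assumption, with made-up hyperparameters; it is not the paper's exact formulation.

```python
import numpy as np

class StreamingNorm:
    """Minimal sketch: normalize with running statistics rather than
    per-batch statistics, so no batch is needed at each step (a rough
    stand-in for the streaming setting; not the paper's algorithm)."""
    def __init__(self, dim, momentum=0.01, eps=1e-5):
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x):
        # x: (dim,) -- a single sample arriving in a stream
        self.mean = (1 - self.momentum) * self.mean + self.momentum * x
        self.var = (1 - self.momentum) * self.var + self.momentum * (x - self.mean) ** 2
        return (x - self.mean) / np.sqrt(self.var + self.eps)
```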

Depth Creates No Bad Local Minima

no code implementations 27 Feb 2017 Haihao Lu, Kenji Kawaguchi

In deep learning, depth, as well as nonlinearity, creates non-convex loss surfaces.

Deep Semi-Random Features for Nonlinear Function Approximation

1 code implementation 28 Feb 2017 Kenji Kawaguchi, Bo Xie, Vikas Verma, Le Song

For deep models, with no unrealistic assumptions, we prove universal approximation ability, a lower bound on approximation error, a partial optimization guarantee, and a generalization bound.

Generalization in Deep Learning

no code implementations 16 Oct 2017 Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio

This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature.

Open-Ended Question Answering

Theory of Deep Learning III: explaining the non-overfitting puzzle

no code implementations 30 Dec 2017 Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, Hrushikesh Mhaskar

In this note, we show that the dynamics associated with gradient descent minimization of nonlinear networks is topologically equivalent, near the asymptotically stable minima of the empirical error, to a linear gradient system in a quadratic potential with a degenerate (for square loss) or almost degenerate (for logistic or cross-entropy loss) Hessian.

General Classification

Generalization in Machine Learning via Analytical Learning Theory

2 code implementations 21 Feb 2018 Kenji Kawaguchi, Yoshua Bengio, Vikas Verma, Leslie Pack Kaelbling

This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions.

BIG-bench Machine Learning · Learning Theory · +2

Depth with Nonlinearity Creates No Bad Local Minima in ResNets

no code implementations 21 Oct 2018 Kenji Kawaguchi, Yoshua Bengio

In this paper, we prove that depth with nonlinearity creates no bad local minima in a type of arbitrarily deep ResNets with arbitrary nonlinear activation functions, in the sense that the values of all local minima are no worse than the global minimum value of corresponding classical machine-learning models, and are guaranteed to further improve via residual representations.

BIG-bench Machine Learning · Open-Ended Question Answering

Effect of Depth and Width on Local Minima in Deep Learning

no code implementations 20 Nov 2018 Kenji Kawaguchi, Jiaoyang Huang, Leslie Pack Kaelbling

In this paper, we analyze the effects of depth and width on the quality of local minima, without strong over-parameterization and simplification assumptions in the literature.

Elimination of All Bad Local Minima in Deep Learning

no code implementations 2 Jan 2019 Kenji Kawaguchi, Leslie Pack Kaelbling

At every local minimum of any deep neural network with these added neurons, the set of parameters of the original neural network (without added neurons) is guaranteed to be a global minimum of the original neural network.

Binary Classification · General Classification · +1

Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit

no code implementations 12 Jan 2019 Jascha Sohl-Dickstein, Kenji Kawaguchi

Recent work has noted that all bad local minima can be removed from neural network loss landscapes, by adding a single unit with a particular parameterization.

Interpolation Consistency Training for Semi-Supervised Learning

4 code implementations 9 Mar 2019 Vikas Verma, Kenji Kawaguchi, Alex Lamb, Juho Kannala, Arno Solin, Yoshua Bengio, David Lopez-Paz

We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm.

General Classification · Semi-Supervised Image Classification
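
A minimal sketch of the interpolation consistency term on unlabeled data, assuming a student `model`, a mean-teacher `teacher`, and two unlabeled batches `u1` and `u2` (names are illustrative, not taken from the authors' code):

```python
import torch
import torch.nn.functional as F

def ict_loss(model, teacher, u1, u2, alpha=1.0):
    """Sketch: the student's prediction on a mixed input should match
    the mix of the (mean-)teacher's predictions on the two endpoints."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed = lam * u1 + (1 - lam) * u2
    with torch.no_grad():
        target = lam * teacher(u1).softmax(-1) + (1 - lam) * teacher(u2).softmax(-1)
    return F.mse_loss(model(mixed).softmax(-1), target)
```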

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning

no code implementations 7 Apr 2019 Kenji Kawaguchi, Jiaoyang Huang, Leslie Pack Kaelbling

Furthermore, as special cases of our general results, this article improves or complements several state-of-the-art theoretical results on deep neural networks, deep residual networks, and overparameterized deep neural networks with a unified proof technique and novel geometric insights.

BIG-bench Machine Learning · Representation Learning

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

2 code implementations 9 Jul 2019 Kenji Kawaguchi, Haihao Lu

The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss.

Stochastic Optimization
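
In place of that unbiased average, Ordered SGD updates on the largest per-sample losses in each mini-batch. A minimal sketch of the top-q selection (the paper's schedule for q is more involved):

```python
import torch
import torch.nn.functional as F

def ordered_sgd_loss(logits, targets, q):
    """Sketch: average only the q largest per-sample losses in the
    mini-batch, a biased estimator that focuses updates on hard examples."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    top_q, _ = torch.topk(per_sample, q)
    return top_q.mean()
```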

Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes

no code implementations 5 Aug 2019 Kenji Kawaguchi, Jiaoyang Huang

Unlike previous theories, the theory developed in this paper requires only practical degrees of over-parameterization.

GraphMix: Improved Training of GNNs for Semi-Supervised Learning

1 code implementation 25 Sep 2019 Vikas Verma, Meng Qu, Kenji Kawaguchi, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang

We present GraphMix, a regularization method for Graph Neural Network-based semi-supervised object classification, in which a fully-connected network is trained jointly with the graph neural network via parameter sharing and interpolation-based regularization.

Generalization Bounds · Graph Attention · +1

How Does Mixup Help With Robustness and Generalization?

no code implementations ICLR 2021 Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou

For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss.

Data Augmentation

Towards Domain-Agnostic Contrastive Learning

no code implementations 9 Nov 2020 Vikas Verma, Minh-Thang Luong, Kenji Kawaguchi, Hieu Pham, Quoc V. Le

Despite recent success, most contrastive self-supervised learning methods are domain-specific, relying heavily on data augmentation techniques that require knowledge about a particular domain, such as image cropping and rotation.

Contrastive Learning · Data Augmentation · +3

Dynamics of Deep Equilibrium Linear Models

no code implementations ICLR 2021 Kenji Kawaguchi

A deep equilibrium linear model is implicitly defined through an equilibrium point of an infinite sequence of computations.

Relation
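
A deep equilibrium layer's output is a fixed point z* = f(z*, x); the sketch below finds it by naive fixed-point iteration, assuming f is a contraction (for the linear model studied here, f(z, x) = A z + B x). Practical implementations use faster root-finding and implicit differentiation.

```python
import torch

def equilibrium_forward(f, x, z0, n_iters=50, tol=1e-4):
    """Sketch: iterate z <- f(z, x) until (approximate) convergence,
    i.e., until z is an equilibrium point of the infinite computation."""
    z = z0
    for _ in range(n_iters):
        z_next = f(z, x)
        if torch.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# e.g., a deep equilibrium *linear* layer: f = lambda z, x: A @ z + B @ x
```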

When and How Mixup Improves Calibration

no code implementations 11 Feb 2021 Linjun Zhang, Zhun Deng, Kenji Kawaguchi, James Zou

In addition, we study how Mixup improves calibration in semi-supervised learning.

Data Augmentation

On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers

no code implementations 15 Feb 2021 Kenji Kawaguchi

In this paper, we analyze the gradient dynamics of deep equilibrium models with nonlinearity only on weight matrices and non-convex objective functions of weights for regression and classification.

Relation

Clustering Aware Classification for Risk Prediction and Subtyping in Clinical Data

1 code implementation 23 Feb 2021 Shivin Srivastava, Siddharth Bhatia, Lingxiao Huang, Lim Jun Heng, Kenji Kawaguchi, Vaibhav Rajan

In data containing heterogeneous subpopulations, classification performance benefits from incorporating the knowledge of cluster structure in the classifier.

Classification · Clustering · +2

A Recipe for Global Convergence Guarantee in Deep Neural Networks

no code implementations 12 Apr 2021 Kenji Kawaguchi, Qingyun Sun

Existing global convergence guarantees of (stochastic) gradient descent do not apply to practical deep networks in the practical regime of deep learning beyond the neural tangent kernel (NTK) regime.

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

no code implementations 10 May 2021 Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi

Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution.

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions

2 code implementations 20 May 2021 Ameya D. Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Em Karniadakis

We propose Kronecker neural networks (KNNs), a new type of neural network that forms a general framework for neural networks with adaptive activation functions.

MemStream: Memory-Based Streaming Anomaly Detection

1 code implementation 7 Jun 2021 Siddharth Bhatia, Arjit Jain, Shivin Srivastava, Kenji Kawaguchi, Bryan Hooi

Given a stream of entries over time in a multi-dimensional data setting where concept drift is present, how can we detect anomalous activities?

Denoising · Unsupervised Anomaly Detection
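
A heavily simplified sketch of a memory-based streaming anomaly score: the distance from an incoming record's feature to its nearest entry in a memory of recent features. MemStream itself learns the features (e.g., with an autoencoder) and updates the memory under a threshold policy, both omitted here.

```python
import torch

def memory_anomaly_score(feature, memory):
    # feature: (d,) encoded incoming record; memory: (N, d) stored features
    dists = torch.cdist(feature.unsqueeze(0), memory).squeeze(0)
    return dists.min().item()  # larger distance => more anomalous
```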

Sketch-Based Anomaly Detection in Streaming Graphs

1 code implementation 8 Jun 2021 Siddharth Bhatia, Mohit Wadhwa, Kenji Kawaguchi, Neil Shah, Philip S. Yu, Bryan Hooi

This higher-order sketch has the useful property of preserving the dense subgraph structure (dense subgraphs in the input turn into dense submatrices in the data structure).

Anomaly Detection · Intrusion Detection

Adversarial Training Helps Transfer Learning via Better Representations

no code implementations NeurIPS 2021 Zhun Deng, Linjun Zhang, Kailas Vodrahalli, Kenji Kawaguchi, James Zou

Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.

Transfer Learning

Understanding Dynamics of Nonlinear Representation Learning and Its Application

no code implementations 28 Jun 2021 Kenji Kawaguchi, Linjun Zhang, Zhun Deng

Representation learning allows us to automatically discover suitable representations from raw sensory data.

Representation Learning

Discrete-Valued Neural Communication

no code implementations NeurIPS 2021 Dianbo Liu, Alex Lamb, Kenji Kawaguchi, Anirudh Goyal, Chen Sun, Michael Curtis Mozer, Yoshua Bengio

Deep learning has advanced from fully connected architectures to structured models organized into components, e.g., the transformer composed of positional elements, modular architectures divided into slots, and graph neural nets made up of nodes.

Quantization · Systematic Generalization

Meta-learning PINN loss functions

no code implementations 12 Jul 2021 Apostolos F Psaros, Kenji Kawaguchi, George Em Karniadakis

In the computational examples, the meta-learned losses are employed at test time for addressing regression and PDE task distributions.

Meta-Learning

When Do Extended Physics-Informed Neural Networks (XPINNs) Improve Generalization?

no code implementations 20 Sep 2021 Zheyuan Hu, Ameya D. Jagtap, George Em Karniadakis, Kenji Kawaguchi

Specifically, for general multi-layer PINNs and XPINNs, we first provide a prior generalization bound via the complexity of the target functions in the PDE problem, and a posterior generalization bound via the posterior matrix norms of the networks after optimization.

Combined Scaling for Zero-shot Transfer Learning

no code implementations 19 Nov 2021 Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

Second, while increasing the dataset size and the model size has been the de facto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well understood.

Classification · Contrastive Learning · +3

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

no code implementations NeurIPS 2021 Clement Gehring, Kenji Kawaguchi, Jiaoyang Huang, Leslie Kaelbling

Estimating per-state expected cumulative rewards is a critical aspect of reinforcement learning approaches, however the experience is obtained; yet standard deep neural-network function-approximation methods are often inefficient in this setting.

Model-based Reinforcement Learning · reinforcement-learning · +1

Noether Networks: Meta-Learning Useful Conserved Quantities

no code implementations NeurIPS 2021 Ferran Alet, Dylan Doblar, Allan Zhou, Joshua Tenenbaum, Kenji Kawaguchi, Chelsea Finn

Progress in machine learning (ML) stems from a combination of data availability, computational resources, and an appropriate encoding of inductive biases.

Meta-Learning · Translation

Training Free Graph Neural Networks for Graph Matching

1 code implementation 14 Jan 2022 Zhiyuan Liu, Yixin Cao, Fuli Feng, Xiang Wang, Jie Tang, Kenji Kawaguchi, Tat-Seng Chua

We present a framework of Training Free Graph Matching (TFGM) to boost the performance of Graph Neural Network (GNN)-based graph matching, providing a fast and promising training-free solution.

Entity Alignment · Graph Matching · +1

ExpertNet: A Symbiosis of Classification and Clustering

no code implementations 17 Jan 2022 Shivin Srivastava, Kenji Kawaguchi, Vaibhav Rajan

We theoretically analyze the effect of clustering on its generalization gap, and empirically show that clustered latent representations from ExpertNet lead to disentangling the intrinsic structure and improvement in classification performance.

Classification · Clustering · +1

Multi-Task Learning as a Bargaining Game

2 code implementations 2 Feb 2022 Aviv Navon, Aviv Shamsian, Idan Achituve, Haggai Maron, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya

In this paper, we propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.

Multi-Task Learning

Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization

no code implementations 2 Feb 2022 Dianbo Liu, Alex Lamb, Xu Ji, Pascal Notsawo, Mike Mozer, Yoshua Bengio, Kenji Kawaguchi

Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit.

Quantization · reinforcement-learning · +2
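
The discretization step that VQ refers to can be sketched as a nearest-codebook lookup; the straight-through gradient trick that makes this trainable is omitted:

```python
import torch

def vector_quantize(z, codebook):
    """Sketch: replace each latent vector with its nearest codebook entry,
    turning continuous representations into discrete codes."""
    # z: (batch, d), codebook: (K, d)
    dists = torch.cdist(z, codebook)   # (batch, K) pairwise distances
    codes = dists.argmin(dim=1)        # discrete code indices
    return codebook[codes], codes
```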

EIGNN: Efficient Infinite-Depth Graph Neural Networks

1 code implementation NeurIPS 2021 Juncheng Liu, Kenji Kawaguchi, Bryan Hooi, Yiwei Wang, Xiaokui Xiao

Motivated by this limitation, we propose a GNN model with infinite depth, which we call Efficient Infinite-Depth Graph Neural Networks (EIGNN), to efficiently capture very long-range dependencies.

Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

1 code implementation 1 Apr 2022 Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Ankit Vani, Michael Noukhovitch, Kenji Kawaguchi, Aaron Courville

Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation.

Classification · Inductive Bias · +1
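
The projection is concrete enough to sketch directly: reshape the representation into L groups of V dimensions and apply a softmax within each group, so each group lies on a probability simplex (a minimal illustration, not the authors' code):

```python
import torch

def simplicial_embedding(z, L, V):
    # z: (batch, L * V) raw representation from the encoder
    b = z.shape[0]
    return torch.softmax(z.reshape(b, L, V), dim=-1).reshape(b, L * V)
```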

Set-based Meta-Interpolation for Few-Task Meta-Learning

no code implementations 20 May 2022 Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

Recently, several task augmentation methods have been proposed to tackle this issue using domain-specific knowledge to design augmentation techniques to densify the meta-training task distribution.

Bilevel Optimization · Image Classification · +6

Robustness Implies Generalization via Data-Dependent Generalization Bounds

no code implementations 27 Jun 2022 Kenji Kawaguchi, Zhun Deng, Kyle Luh, Jiaoyang Huang

This paper proves that robustness implies generalization via data-dependent generalization bounds.

Generalization Bounds

Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation

1 code implementation 26 Aug 2022 Jeffrey Willette, Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

Recent work on mini-batch consistency (MBC) for set functions has brought attention to the need for sequentially processing and aggregating chunks of a partitioned set while guaranteeing the same output for all partitions.

Point Cloud Classification · text-classification · +1
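
A minimal sketch of why a streaming-mean set encoder is mini-batch consistent: any partition of the set into chunks yields exactly the same output as one pass over the full set. Here `phi` is an assumed per-element feature extractor; the paper's encoder is considerably richer.

```python
import torch

def mbc_mean_encode(chunks, phi):
    """Aggregate per-element features with a running mean; the result is
    invariant to how the set was partitioned into chunks."""
    total, count = 0.0, 0
    for chunk in chunks:            # chunk: (n_i, d)
        total = total + phi(chunk).sum(dim=0)
        count += chunk.shape[0]
    return total / count
```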

Self-Distillation for Further Pre-training of Transformers

no code implementations 30 Sep 2022 Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi

Pre-training a large transformer model on a massive amount of unlabeled data and fine-tuning it on labeled datasets for diverse downstream tasks has proven to be a successful strategy for a variety of vision and natural language processing tasks.

text-classification · Text Classification

MGNNI: Multiscale Graph Neural Networks with Implicit Layers

1 code implementation 15 Oct 2022 Juncheng Liu, Bryan Hooi, Kenji Kawaguchi, Xiaokui Xiao

Recently, implicit graph neural networks (GNNs) have been proposed to capture long-range dependencies in underlying graphs.

Graph Classification · Node Classification

GFlowOut: Dropout with Generative Flow Networks

no code implementations 24 Oct 2022 Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio

These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal which can be difficult to approximate with standard variational inference and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation.

Bayesian Inference · Variational Inference

TuneUp: A Simple Improved Training Strategy for Graph Neural Networks

no code implementations 26 Oct 2022 Weihua Hu, Kaidi Cao, Kexin Huang, Edward W Huang, Karthik Subbian, Kenji Kawaguchi, Jure Leskovec

Extensive evaluation of TuneUp on five diverse GNN architectures, three types of prediction tasks, and both transductive and inductive settings shows that TuneUp significantly improves the performance of the base GNN on tail nodes, while often even improving the performance on head nodes.

Data Augmentation

Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning

no code implementations 1 Nov 2022 Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet des Combes

Goal-conditioned reinforcement learning (RL) is a promising direction for training agents that are capable of solving multiple tasks and reaching a diverse set of objectives.

reinforcement-learning · Reinforcement Learning (RL)

Neural Active Learning on Heteroskedastic Distributions

1 code implementation 2 Nov 2022 Savya Khosla, Chew Kin Whye, Jordan T. Ash, Cyril Zhang, Kenji Kawaguchi, Alex Lamb

To this end, we demonstrate the catastrophic failure of these active learning algorithms on heteroskedastic distributions and propose a fine-tuning-based approach to mitigate these failures.

Active Learning

Single-Pass Contrastive Learning Can Work for Both Homophilic and Heterophilic Graph

1 code implementation 20 Nov 2022 Haonan Wang, Jieyu Zhang, Qi Zhu, Wei Huang, Kenji Kawaguchi, Xiaokui Xiao

To answer this question, we theoretically study the concentration property of features obtained by neighborhood aggregation on homophilic and heterophilic graphs, introduce the single-pass augmentation-free graph contrastive learning loss based on the property, and provide performance guarantees for the minimizer of the loss on downstream tasks.

Contrastive Learning

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

1 code implementation 27 Dec 2022 Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi

Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels.

Data Augmentation
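
The interpolation described above is the standard mixup construction; a minimal sketch, with the mixing weight drawn from an assumed Beta(alpha, alpha) distribution:

```python
import torch

def mixup(x, y_onehot, alpha=0.2):
    """Sketch: convex combinations of a batch with a shuffled copy of
    itself, applied to both inputs and one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix
```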

Auxiliary Learning as an Asymmetric Bargaining Game

1 code implementation 31 Jan 2023 Aviv Shamsian, Aviv Navon, Neta Glazer, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya

Auxiliary learning is an effective method for enhancing the generalization capabilities of trained models, particularly when dealing with small datasets.

Auxiliary Learning

An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization

no code implementations 1 Mar 2023 Ravid Shwartz-Ziv, Randall Balestriero, Kenji Kawaguchi, Tim G. J. Rudner, Yann Lecun

In this paper, we provide an information-theoretic perspective on Variance-Invariance-Covariance Regularization (VICReg) for self-supervised learning.

Self-Supervised Learning · Transfer Learning

Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks

2 code implementations 8 Apr 2023 Yuzhen Mao, Zhun Deng, Huaxiu Yao, Ting Ye, Kenji Kawaguchi, James Zou

As machine learning has been deployed ubiquitously across applications in modern data science, algorithmic fairness has become a great concern.

Fairness · Open-Ended Question Answering · +1

Self-Evaluation Guided Beam Search for Reasoning

no code implementations NeurIPS 2023 Yuxi Xie, Kenji Kawaguchi, Yiran Zhao, Xu Zhao, Min-Yen Kan, Junxian He, Qizhe Xie

Stochastic beam search balances exploitation and exploration of the search space with temperature-controlled randomness.

Arithmetic Reasoning · GSM8K · +3
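
One reading of "temperature-controlled randomness" is to sample the next beam from a softmax over candidate scores instead of taking a deterministic top-k; a sketch under that assumption, not necessarily the paper's exact procedure:

```python
import torch

def stochastic_beam_step(candidate_scores, beam_width, temperature=0.5):
    # Lower temperature -> near-greedy selection (exploitation);
    # higher temperature -> closer to uniform sampling (exploration).
    probs = torch.softmax(candidate_scores / temperature, dim=0)
    return torch.multinomial(probs, beam_width, replacement=False)
```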

Boosting Visual-Language Models by Exploiting Hard Samples

1 code implementation 9 May 2023 Haonan Wang, Minbin Huang, Runhui Huang, Lanqing Hong, Hang Xu, Tianyang Hu, Xiaodan Liang, Zhenguo Li, Hong Cheng, Kenji Kawaguchi

In this work, we present HELIP, a cost-effective strategy tailored to enhance the performance of existing CLIP models without the need for training a model from scratch or collecting additional data.

Retrieval · Zero-Shot Learning

Automatic Model Selection with Large Language Models for Reasoning

1 code implementation 23 May 2023 James Xu Zhao, Yuxi Xie, Kenji Kawaguchi, Junxian He, Michael Qizhe Xie

Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths.

Arithmetic Reasoning · GSM8K · +4

Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

1 code implementation NeurIPS 2023 Minki Kang, Seanie Lee, Jinheon Baek, Kenji Kawaguchi, Sung Ju Hwang

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge.

Memorization · StrategyQA

How Does Information Bottleneck Help Deep Learning?

1 code implementation 30 May 2023 Kenji Kawaguchi, Zhun Deng, Xu Ji, Jiaoyang Huang

In this paper, we provide the first rigorous learning theory for justifying the benefit of information bottleneck in deep learning by mathematically relating information bottleneck to generalization errors.

Generalization Bounds · Learning Theory

Fast Diffusion Model

1 code implementation 12 Jun 2023 Zike Wu, Pan Zhou, Kenji Kawaguchi, Hanwang Zhang

In this paper, we propose a Fast Diffusion Model (FDM) to significantly speed up DMs from a stochastic optimization perspective for both faster training and sampling.

Image Generation

Multi-View Class Incremental Learning

no code implementations 16 Jun 2023 Depeng Li, Tianqi Wang, Junwei Chen, Kenji Kawaguchi, Cheng Lian, Zhigang Zeng

Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.

Class Incremental Learning · Decision Making · +3

Tackling the Curse of Dimensionality with Physics-Informed Neural Networks

no code implementations 23 Jul 2023 Zheyuan Hu, Khemraj Shukla, George Em Karniadakis, Kenji Kawaguchi

We demonstrate in diverse tests that the proposed method can solve many notoriously hard high-dimensional PDEs, including the Hamilton-Jacobi-Bellman (HJB) and the Schrödinger equations in tens of thousands of dimensions, very fast on a single GPU using the mesh-free PINN approach.
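
For context, the mesh-free PINN loss penalizes the PDE residual at sampled collocation points, with derivatives obtained by autograd; a minimal sketch for a 1D heat equation u_t = nu * u_xx (the paper's high-dimensional method builds on, but goes well beyond, this basic form):

```python
import torch

def pinn_residual_loss(u_net, x, t, nu=0.01):
    # u_net maps (x, t) -> u(x, t); x, t: (N, 1) sampled collocation points
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = u_net(torch.cat([x, t], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    # mean squared residual of u_t = nu * u_xx
    return ((u_t - nu * u_xx) ** 2).mean()
```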

A Dual-Perspective Approach to Evaluating Feature Attribution Methods

no code implementations 17 Aug 2023 Yawei Li, Yang Zhang, Kenji Kawaguchi, Ashkan Khakzar, Bernd Bischl, Mina Rezaei

We apply these metrics to mainstream attribution methods, offering a novel lens through which to analyze and compare feature attribution methods.

On Copyright Risks of Text-to-Image Diffusion Models

no code implementations 15 Sep 2023 Yang Zhang, Teoh Tze Tzun, Lim Wei Hern, Haonan Wang, Kenji Kawaguchi

Specifically, we introduce a data generation pipeline to systematically produce data for studying copyright in diffusion models.

Drug Discovery with Dynamic Goal-aware Fragments

no code implementations 2 Oct 2023 Seul Lee, Seanie Lee, Kenji Kawaguchi, Sung Ju Hwang

Additionally, the existing fragment-based generative models cannot update the fragment vocabulary with goal-aware fragments newly discovered during the generation.

Drug Discovery

Self-Supervised Dataset Distillation for Transfer Learning

2 code implementations 10 Oct 2023 Dong Bok Lee, Seanie Lee, Joonho Ko, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

To achieve this, we also introduce the MSE between representations of the inner model and the self-supervised target model on the original full dataset for outer optimization.

Bilevel Optimization · Meta-Learning · +3

Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules

1 code implementation NeurIPS 2023 Zhiyuan Liu, Yaorui Shi, An Zhang, Enzhi Zhang, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua

Our results show that a subgraph-level tokenizer and a sufficiently expressive decoder with remask decoding have a large impact on the encoder's representation learning.

Representation Learning · Self-Supervised Learning

ChOiRe: Characterizing and Predicting Human Opinions with Chain of Opinion Reasoning

no code implementations 14 Nov 2023 Xuan Long Do, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen

Aligning language models (LMs) with human opinion is challenging yet vital to enhance their grasp of human values, preferences, and beliefs.

Bias-Variance Trade-off in Physics-Informed Neural Networks with Randomized Smoothing for High-Dimensional PDEs

no code implementations 26 Nov 2023 Zheyuan Hu, Zhouhao Yang, Yezhen Wang, George Em Karniadakis, Kenji Kawaguchi

To optimize the bias-variance trade-off, we combine the two approaches in a hybrid method that balances the rapid convergence of the biased version with the high accuracy of the unbiased version.

Computational Efficiency

VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models

1 code implementation 29 Nov 2023 Xiang Li, Qianli Shen, Kenji Kawaguchi

The booming use of text-to-image generative models has raised concerns about their high risk of producing copyright-infringing content.

Prompt Optimization via Adversarial In-Context Learning

no code implementations 5 Dec 2023 Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He

We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompts for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier.

Arithmetic Reasoning · Data-to-Text Generation · +2

Hutchinson Trace Estimation for High-Dimensional and High-Order Physics-Informed Neural Networks

1 code implementation 22 Dec 2023 Zheyuan Hu, Zekun Shi, George Em Karniadakis, Kenji Kawaguchi

We further showcase HTE's convergence to the original PINN loss and its unbiased behavior under specific conditions.
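
The Hutchinson estimator itself is standard: tr(A) = E[v^T A v] for random sign vectors v, so a trace (e.g., of a Hessian appearing in a high-order PINN loss) can be estimated from matrix-vector products alone. A minimal sketch:

```python
import torch

def hutchinson_trace(matvec, dim, n_samples=64):
    """Estimate tr(A) given only v -> A @ v, using Rademacher probes."""
    est = 0.0
    for _ in range(n_samples):
        v = torch.randint(0, 2, (dim,), dtype=torch.float32) * 2 - 1  # +/-1 entries
        est = est + v @ matvec(v)
    return est / n_samples
```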

Can AI Be as Creative as Humans?

no code implementations 3 Jan 2024 Haonan Wang, James Zou, Michael Mozer, Anirudh Goyal, Alex Lamb, Linjun Zhang, Weijie J Su, Zhun Deng, Michael Qizhe Xie, Hannah Brown, Kenji Kawaguchi

With the rise of advanced generative AI models capable of tasks once reserved for human creativity, the study of AI's creative potential becomes imperative for its responsible development and application.

Simple Hierarchical Planning with Diffusion

no code implementations 5 Jan 2024 Chang Chen, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets.

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

no code implementations 7 Jan 2024 Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi

This study explores the vulnerabilities associated with copyright protection in DMs by introducing a backdoor data poisoning attack (SilentBadDiffusion) against text-to-image diffusion models.

Data Poisoning · Image Inpainting

Towards 3D Molecule-Text Interpretation in Language Models

1 code implementation 25 Jan 2024 Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian

Through 3D molecule-text alignment and 3D molecule-centric instruction tuning, 3D-MoLM integrates a 3D molecular encoder with a language model (LM).

Instruction Following · Language Modelling · +3

Score-Based Physics-Informed Neural Networks for High-Dimensional Fokker-Planck Equations

no code implementations 12 Feb 2024 Zheyuan Hu, Zhongqiang Zhang, George Em Karniadakis, Kenji Kawaguchi

The score function, defined as the gradient of the log-likelihood (LL), plays a fundamental role in inferring the LL and the PDF, and enables fast SDE sampling.
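
For reference, the definition the abstract relies on, written out:

```latex
% Score function: the gradient of the log-likelihood with respect to the state x
s(x) = \nabla_x \log p(x)
```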

Unsupervised Concept Discovery Mitigates Spurious Correlations

no code implementations 20 Feb 2024 Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi

Models prone to spurious correlations in training data often produce brittle predictions and introduce unintended biases.

Representation Learning

The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling

no code implementations 23 Feb 2024 Jiajun Ma, Shuchen Xue, Tianyang Hu, Wenjia Wang, Zhaoqiang Liu, Zhenguo Li, Zhi-Ming Ma, Kenji Kawaguchi

Surprisingly, the improvement persists when we increase the number of sampling steps and can even surpass the best result from EDM-2 (1.58) with only 39 NFEs (1.57).

Image Generation

AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging

1 code implementation 29 Feb 2024 Yiran Zhao, Wenxuan Zhang, Huiming Wang, Kenji Kawaguchi, Lidong Bing

In this paper, we acknowledge the mutual reliance between task ability and language ability and direct our attention toward the gap between the target language and the source language on tasks.

Cross-Lingual Transfer

How do Large Language Models Handle Multilingualism?

no code implementations 29 Feb 2024 Yiran Zhao, Wenxuan Zhang, Guizhen Chen, Kenji Kawaguchi, Lidong Bing

We introduce a framework that depicts LLMs' processing of multilingual inputs: In the first several layers, LLMs understand the question, converting multilingual inputs into English to facilitate the task-solving phase.

Accelerating Greedy Coordinate Gradient via Probe Sampling

1 code implementation 2 Mar 2024 Yiran Zhao, Wenyue Zheng, Tianle Cai, Xuan Long Do, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh

Safety of Large Language Models (LLMs) has become a central issue given their rapid progress and wide applications.

Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models

1 code implementation 11 Mar 2024 Yang Zhang, Teoh Tze Tzun, Lim Wei Hern, Tiviatis Sim, Kenji Kawaguchi

Recent advancements in diffusion models have notably improved the perceptual quality of generated images in text-to-image synthesis tasks.

Image Generation

Towards Robust Out-of-Distribution Generalization Bounds via Sharpness

no code implementations 11 Mar 2024 Yingtian Zou, Kenji Kawaguchi, Yingnan Liu, Jiashuo Liu, Mong-Li Lee, Wynne Hsu

To bridge this gap between optimization and OOD generalization, we study how sharpness affects a model's tolerance to data change under domain shift, which is usually captured by "robustness" in generalization.

Generalization Bounds Out-of-Distribution Generalization
