Search Results for author: Kenji Kawaguchi

Found 101 papers, 38 papers with code

Bounded Optimal Exploration in MDP

no code implementations 5 Apr 2016 Kenji Kawaguchi

Within the framework of probably approximately correct Markov decision processes (PAC-MDP), much theoretical work has focused on methods to attain near optimality after a relatively long period of learning and exploration.

Bayesian Optimization with Exponential Convergence

no code implementations NeurIPS 2015 Kenji Kawaguchi, Leslie Pack Kaelbling, Tomás Lozano-Pérez

This paper presents a Bayesian optimization method with exponential convergence that requires neither auxiliary optimization nor delta-cover sampling.

Bayesian Optimization

Deep Learning without Poor Local Minima

1 code implementation NeurIPS 2016 Kenji Kawaguchi

In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015.

Learning Theory

Global Continuous Optimization with Error Bound and Fast Convergence

no code implementations 17 Jul 2016 Kenji Kawaguchi, Yu Maruyama, Xiaoyu Zheng

This paper proposes a new global optimization algorithm, called Locally Oriented Global Optimization (LOGO), that aims for both fast convergence in practice and a finite-time error bound in theory.

Management

Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning

no code implementations 19 Oct 2016 Qianli Liao, Kenji Kawaguchi, Tomaso Poggio

We systematically explore a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online learning and (2) recurrent learning.
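
The core trick that makes a normalizer usable online can be sketched as replacing per-batch statistics with running estimates updated one sample at a time. The following is a minimal illustration under that assumption, with made-up hyperparameters; it is not the paper's exact formulation.

```python
import numpy as np

class StreamingNorm:
    """Minimal sketch: normalize with running statistics rather than
    per-batch statistics, so no batch is needed at each step (a rough
    stand-in for the streaming setting; not the paper's algorithm)."""
    def __init__(self, dim, momentum=0.01, eps=1e-5):
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x):
        # x: (dim,) -- a single sample arriving in a stream
        self.mean = (1 - self.momentum) * self.mean + self.momentum * x
        self.var = (1 - self.momentum) * self.var + self.momentum * (x - self.mean) ** 2
        return (x - self.mean) / np.sqrt(self.var + self.eps)
```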

Depth Creates No Bad Local Minima

no code implementations 27 Feb 2017 Haihao Lu, Kenji Kawaguchi

In deep learning, depth, as well as nonlinearity, creates non-convex loss surfaces.

Deep Semi-Random Features for Nonlinear Function Approximation

1 code implementation 28 Feb 2017 Kenji Kawaguchi, Bo Xie, Vikas Verma, Le Song

For deep models, with no unrealistic assumptions, we prove universal approximation ability, a lower bound on approximation error, a partial optimization guarantee, and a generalization bound.

Generalization in Deep Learning

no code implementations 16 Oct 2017 Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio

This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature.

Open-Ended Question Answering

Theory of Deep Learning III: explaining the non-overfitting puzzle

no code implementations 30 Dec 2017 Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, Hrushikesh Mhaskar

In this note, we show that the dynamics associated with gradient descent minimization of nonlinear networks is topologically equivalent, near the asymptotically stable minima of the empirical error, to a linear gradient system in a quadratic potential with a degenerate (for square loss) or almost degenerate (for logistic or cross-entropy loss) Hessian.

General Classification

Generalization in Machine Learning via Analytical Learning Theory

2 code implementations 21 Feb 2018 Kenji Kawaguchi, Yoshua Bengio, Vikas Verma, Leslie Pack Kaelbling

This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions.

BIG-bench Machine Learning · Learning Theory · +2

Depth with Nonlinearity Creates No Bad Local Minima in ResNets

no code implementations 21 Oct 2018 Kenji Kawaguchi, Yoshua Bengio

In this paper, we prove that depth with nonlinearity creates no bad local minima in a type of arbitrarily deep ResNets with arbitrary nonlinear activation functions, in the sense that the values of all local minima are no worse than the global minimum value of corresponding classical machine-learning models, and are guaranteed to further improve via residual representations.

BIG-bench Machine Learning · Open-Ended Question Answering

Effect of Depth and Width on Local Minima in Deep Learning

no code implementations 20 Nov 2018 Kenji Kawaguchi, Jiaoyang Huang, Leslie Pack Kaelbling

In this paper, we analyze the effects of depth and width on the quality of local minima, without strong over-parameterization and simplification assumptions in the literature.

Elimination of All Bad Local Minima in Deep Learning

no code implementations 2 Jan 2019 Kenji Kawaguchi, Leslie Pack Kaelbling

At every local minimum of any deep neural network with these added neurons, the set of parameters of the original neural network (without added neurons) is guaranteed to be a global minimum of the original neural network.

Binary Classification · General Classification · +1

Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit

no code implementations 12 Jan 2019 Jascha Sohl-Dickstein, Kenji Kawaguchi

Recent work has noted that all bad local minima can be removed from neural network loss landscapes, by adding a single unit with a particular parameterization.

Interpolation Consistency Training for Semi-Supervised Learning

4 code implementations 9 Mar 2019 Vikas Verma, Kenji Kawaguchi, Alex Lamb, Juho Kannala, Arno Solin, Yoshua Bengio, David Lopez-Paz

We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm.

General Classification · Semi-Supervised Image Classification
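
A minimal sketch of the interpolation consistency term on unlabeled data, assuming a student `model`, a mean-teacher `teacher`, and two unlabeled batches `u1` and `u2` (names are illustrative, not taken from the authors' code):

```python
import torch
import torch.nn.functional as F

def ict_loss(model, teacher, u1, u2, alpha=1.0):
    """Sketch: the student's prediction on a mixed input should match
    the mix of the (mean-)teacher's predictions on the two endpoints."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed = lam * u1 + (1 - lam) * u2
    with torch.no_grad():
        target = lam * teacher(u1).softmax(-1) + (1 - lam) * teacher(u2).softmax(-1)
    return F.mse_loss(model(mixed).softmax(-1), target)
```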

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning

no code implementations 7 Apr 2019 Kenji Kawaguchi, Jiaoyang Huang, Leslie Pack Kaelbling

Furthermore, as special cases of our general results, this article improves or complements several state-of-the-art theoretical results on deep neural networks, deep residual networks, and overparameterized deep neural networks with a unified proof technique and novel geometric insights.

BIG-bench Machine Learning · Representation Learning

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

2 code implementations 9 Jul 2019 Kenji Kawaguchi, Haihao Lu

The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss.

Stochastic Optimization
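
In place of that unbiased average, Ordered SGD updates on the largest per-sample losses in each mini-batch. A minimal sketch of the top-q selection (the paper's schedule for q is more involved):

```python
import torch
import torch.nn.functional as F

def ordered_sgd_loss(logits, targets, q):
    """Sketch: average only the q largest per-sample losses in the
    mini-batch, a biased estimator that focuses updates on hard examples."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    top_q, _ = torch.topk(per_sample, q)
    return top_q.mean()
```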

Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes

no code implementations 5 Aug 2019 Kenji Kawaguchi, Jiaoyang Huang

Unlike previous theories, the theory developed in this paper requires only practical degrees of over-parameterization.

GraphMix: Improved Training of GNNs for Semi-Supervised Learning

1 code implementation 25 Sep 2019 Vikas Verma, Meng Qu, Kenji Kawaguchi, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang

We present GraphMix, a regularization method for Graph Neural Network-based semi-supervised object classification, in which a fully-connected network is trained jointly with the graph neural network via parameter sharing and interpolation-based regularization.

Generalization Bounds · Graph Attention · +1

How Does Mixup Help With Robustness and Generalization?

no code implementations ICLR 2021 Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou

For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss.

Data Augmentation

Towards Domain-Agnostic Contrastive Learning

no code implementations 9 Nov 2020 Vikas Verma, Minh-Thang Luong, Kenji Kawaguchi, Hieu Pham, Quoc V. Le

Despite recent success, most contrastive self-supervised learning methods are domain-specific, relying heavily on data augmentation techniques that require knowledge about a particular domain, such as image cropping and rotation.

Contrastive Learning · Data Augmentation · +3

Dynamics of Deep Equilibrium Linear Models

no code implementations ICLR 2021 Kenji Kawaguchi

A deep equilibrium linear model is implicitly defined through an equilibrium point of an infinite sequence of computations.

Relation
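
A deep equilibrium layer's output is a fixed point z* = f(z*, x); the sketch below finds it by naive fixed-point iteration, assuming f is a contraction (for the linear model studied here, f(z, x) = A z + B x). Practical implementations use faster root-finding and implicit differentiation.

```python
import torch

def equilibrium_forward(f, x, z0, n_iters=50, tol=1e-4):
    """Sketch: iterate z <- f(z, x) until (approximate) convergence,
    i.e., until z is an equilibrium point of the infinite computation."""
    z = z0
    for _ in range(n_iters):
        z_next = f(z, x)
        if torch.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# e.g., a deep equilibrium *linear* layer: f = lambda z, x: A @ z + B @ x
```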

When and How Mixup Improves Calibration

no code implementations 11 Feb 2021 Linjun Zhang, Zhun Deng, Kenji Kawaguchi, James Zou

In addition, we study how Mixup improves calibration in semi-supervised learning.

Data Augmentation

On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers

no code implementations 15 Feb 2021 Kenji Kawaguchi

In this paper, we analyze the gradient dynamics of deep equilibrium models with nonlinearity only on weight matrices and non-convex objective functions of weights for regression and classification.

Relation

Clustering Aware Classification for Risk Prediction and Subtyping in Clinical Data

1 code implementation 23 Feb 2021 Shivin Srivastava, Siddharth Bhatia, Lingxiao Huang, Lim Jun Heng, Kenji Kawaguchi, Vaibhav Rajan

In data containing heterogeneous subpopulations, classification performance benefits from incorporating the knowledge of cluster structure in the classifier.

Classification · Clustering · +2

A Recipe for Global Convergence Guarantee in Deep Neural Networks

no code implementations 12 Apr 2021 Kenji Kawaguchi, Qingyun Sun

Existing global convergence guarantees of (stochastic) gradient descent do not apply to practical deep networks in the practical regime of deep learning beyond the neural tangent kernel (NTK) regime.

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

no code implementations 10 May 2021 Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi

Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution.

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions

2 code implementations 20 May 2021 Ameya D. Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Em Karniadakis

We propose Kronecker neural networks (KNNs), a new type of neural network that forms a general framework for neural networks with adaptive activation functions.

MemStream: Memory-Based Streaming Anomaly Detection

1 code implementation 7 Jun 2021 Siddharth Bhatia, Arjit Jain, Shivin Srivastava, Kenji Kawaguchi, Bryan Hooi

Given a stream of entries over time in a multi-dimensional data setting where concept drift is present, how can we detect anomalous activities?

Denoising · Unsupervised Anomaly Detection
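
A heavily simplified sketch of a memory-based streaming anomaly score: the distance from an incoming record's feature to its nearest entry in a memory of recent features. MemStream itself learns the features (e.g., with an autoencoder) and updates the memory under a threshold policy, both omitted here.

```python
import torch

def memory_anomaly_score(feature, memory):
    # feature: (d,) encoded incoming record; memory: (N, d) stored features
    dists = torch.cdist(feature.unsqueeze(0), memory).squeeze(0)
    return dists.min().item()  # larger distance => more anomalous
```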

Sketch-Based Anomaly Detection in Streaming Graphs

1 code implementation 8 Jun 2021 Siddharth Bhatia, Mohit Wadhwa, Kenji Kawaguchi, Neil Shah, Philip S. Yu, Bryan Hooi

This higher-order sketch has the useful property of preserving the dense subgraph structure (dense subgraphs in the input turn into dense submatrices in the data structure).

Anomaly Detection · Intrusion Detection

Adversarial Training Helps Transfer Learning via Better Representations

no code implementations NeurIPS 2021 Zhun Deng, Linjun Zhang, Kailas Vodrahalli, Kenji Kawaguchi, James Zou

Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.

Transfer Learning

Understanding Dynamics of Nonlinear Representation Learning and Its Application

no code implementations 28 Jun 2021 Kenji Kawaguchi, Linjun Zhang, Zhun Deng

Representation learning allows us to automatically discover suitable representations from raw sensory data.

Representation Learning

Discrete-Valued Neural Communication

no code implementations NeurIPS 2021 Dianbo Liu, Alex Lamb, Kenji Kawaguchi, Anirudh Goyal, Chen Sun, Michael Curtis Mozer, Yoshua Bengio

Deep learning has advanced from fully connected architectures to structured models organized into components, e.g., the transformer composed of positional elements, modular architectures divided into slots, and graph neural nets made up of nodes.

Quantization · Systematic Generalization

Meta-learning PINN loss functions

no code implementations 12 Jul 2021 Apostolos F Psaros, Kenji Kawaguchi, George Em Karniadakis

In the computational examples, the meta-learned losses are employed at test time for addressing regression and PDE task distributions.

Meta-Learning

When Do Extended Physics-Informed Neural Networks (XPINNs) Improve Generalization?

no code implementations 20 Sep 2021 Zheyuan Hu, Ameya D. Jagtap, George Em Karniadakis, Kenji Kawaguchi

Specifically, for general multi-layer PINNs and XPINNs, we first provide a prior generalization bound via the complexity of the target functions in the PDE problem, and a posterior generalization bound via the posterior matrix norms of the networks after optimization.

Combined Scaling for Zero-shot Transfer Learning

no code implementations 19 Nov 2021 Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

Second, while increasing the dataset size and the model size has been the de facto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well understood.

Classification · Contrastive Learning · +3

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

no code implementations NeurIPS 2021 Clement Gehring, Kenji Kawaguchi, Jiaoyang Huang, Leslie Kaelbling

Estimating per-state expected cumulative rewards is a critical aspect of reinforcement learning approaches, however the experience is obtained; yet standard deep neural-network function-approximation methods are often inefficient in this setting.

Model-based Reinforcement Learning · reinforcement-learning · +1

Noether Networks: Meta-Learning Useful Conserved Quantities

no code implementations NeurIPS 2021 Ferran Alet, Dylan Doblar, Allan Zhou, Joshua Tenenbaum, Kenji Kawaguchi, Chelsea Finn

Progress in machine learning (ML) stems from a combination of data availability, computational resources, and an appropriate encoding of inductive biases.

Meta-Learning · Translation

Training Free Graph Neural Networks for Graph Matching

1 code implementation 14 Jan 2022 Zhiyuan Liu, Yixin Cao, Fuli Feng, Xiang Wang, Jie Tang, Kenji Kawaguchi, Tat-Seng Chua

We present a framework of Training Free Graph Matching (TFGM) to boost the performance of Graph Neural Network (GNN)-based graph matching, providing a fast and promising training-free solution.

Entity Alignment · Graph Matching · +1

ExpertNet: A Symbiosis of Classification and Clustering

no code implementations 17 Jan 2022 Shivin Srivastava, Kenji Kawaguchi, Vaibhav Rajan

We theoretically analyze the effect of clustering on its generalization gap, and empirically show that clustered latent representations from ExpertNet lead to disentangling the intrinsic structure and improvement in classification performance.

Classification · Clustering · +1

Multi-Task Learning as a Bargaining Game

2 code implementations 2 Feb 2022 Aviv Navon, Aviv Shamsian, Idan Achituve, Haggai Maron, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya

In this paper, we propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.

Multi-Task Learning

Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization

no code implementations 2 Feb 2022 Dianbo Liu, Alex Lamb, Xu Ji, Pascal Notsawo, Mike Mozer, Yoshua Bengio, Kenji Kawaguchi

Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit.

Quantization · reinforcement-learning · +2
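
The discretization step that VQ refers to can be sketched as a nearest-codebook lookup; the straight-through gradient trick that makes this trainable is omitted:

```python
import torch

def vector_quantize(z, codebook):
    """Sketch: replace each latent vector with its nearest codebook entry,
    turning continuous representations into discrete codes."""
    # z: (batch, d), codebook: (K, d)
    dists = torch.cdist(z, codebook)   # (batch, K) pairwise distances
    codes = dists.argmin(dim=1)        # discrete code indices
    return codebook[codes], codes
```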

EIGNN: Efficient Infinite-Depth Graph Neural Networks

1 code implementation NeurIPS 2021 Juncheng Liu, Kenji Kawaguchi, Bryan Hooi, Yiwei Wang, Xiaokui Xiao

Motivated by this limitation, we propose a GNN model with infinite depth, which we call Efficient Infinite-Depth Graph Neural Networks (EIGNN), to efficiently capture very long-range dependencies.

Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

1 code implementation 1 Apr 2022 Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Ankit Vani, Michael Noukhovitch, Kenji Kawaguchi, Aaron Courville

Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation.

Classification · Inductive Bias · +1
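
The projection is concrete enough to sketch directly: reshape the representation into L groups of V dimensions and apply a softmax within each group, so each group lies on a probability simplex (a minimal illustration, not the authors' code):

```python
import torch

def simplicial_embedding(z, L, V):
    # z: (batch, L * V) raw representation from the encoder
    b = z.shape[0]
    return torch.softmax(z.reshape(b, L, V), dim=-1).reshape(b, L * V)
```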

Set-based Meta-Interpolation for Few-Task Meta-Learning

no code implementations 20 May 2022 Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

Recently, several task augmentation methods have been proposed to tackle this issue using domain-specific knowledge to design augmentation techniques to densify the meta-training task distribution.

Bilevel Optimization · Image Classification · +6

Robustness Implies Generalization via Data-Dependent Generalization Bounds

no code implementations 27 Jun 2022 Kenji Kawaguchi, Zhun Deng, Kyle Luh, Jiaoyang Huang

This paper proves that robustness implies generalization via data-dependent generalization bounds.

Generalization Bounds

Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation

1 code implementation 26 Aug 2022 Jeffrey Willette, Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

Recent work on mini-batch consistency (MBC) for set functions has brought attention to the need for sequentially processing and aggregating chunks of a partitioned set while guaranteeing the same output for all partitions.

Point Cloud Classification · text-classification · +1
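
A minimal sketch of why a streaming-mean set encoder is mini-batch consistent: any partition of the set into chunks yields exactly the same output as one pass over the full set. Here `phi` is an assumed per-element feature extractor; the paper's encoder is considerably richer.

```python
import torch

def mbc_mean_encode(chunks, phi):
    """Aggregate per-element features with a running mean; the result is
    invariant to how the set was partitioned into chunks."""
    total, count = 0.0, 0
    for chunk in chunks:            # chunk: (n_i, d)
        total = total + phi(chunk).sum(dim=0)
        count += chunk.shape[0]
    return total / count
```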

Self-Distillation for Further Pre-training of Transformers

no code implementations 30 Sep 2022 Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi

Pre-training a large transformer model on a massive amount of unlabeled data and fine-tuning it on labeled datasets for diverse downstream tasks has proven to be a successful strategy for a variety of vision and natural language processing tasks.

text-classification · Text Classification

MGNNI: Multiscale Graph Neural Networks with Implicit Layers

1 code implementation 15 Oct 2022 Juncheng Liu, Bryan Hooi, Kenji Kawaguchi, Xiaokui Xiao

Recently, implicit graph neural networks (GNNs) have been proposed to capture long-range dependencies in underlying graphs.

Graph Classification · Node Classification

GFlowOut: Dropout with Generative Flow Networks

no code implementations 24 Oct 2022 Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio

These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal which can be difficult to approximate with standard variational inference and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation.

Bayesian Inference · Variational Inference

TuneUp: A Simple Improved Training Strategy for Graph Neural Networks

no code implementations 26 Oct 2022 Weihua Hu, Kaidi Cao, Kexin Huang, Edward W Huang, Karthik Subbian, Kenji Kawaguchi, Jure Leskovec

Extensive evaluation of TuneUp on five diverse GNN architectures, three types of prediction tasks, and both transductive and inductive settings shows that TuneUp significantly improves the performance of the base GNN on tail nodes, while often even improving the performance on head nodes.

Data Augmentation

Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning

no code implementations 1 Nov 2022 Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet des Combes

Goal-conditioned reinforcement learning (RL) is a promising direction for training agents that are capable of solving multiple tasks and reaching a diverse set of objectives.

reinforcement-learning · Reinforcement Learning (RL)

Neural Active Learning on Heteroskedastic Distributions

1 code implementation 2 Nov 2022 Savya Khosla, Chew Kin Whye, Jordan T. Ash, Cyril Zhang, Kenji Kawaguchi, Alex Lamb

To this end, we demonstrate the catastrophic failure of these active learning algorithms on heteroskedastic distributions and propose a fine-tuning-based approach to mitigate these failures.

Active Learning

Single-Pass Contrastive Learning Can Work for Both Homophilic and Heterophilic Graph

1 code implementation 20 Nov 2022 Haonan Wang, Jieyu Zhang, Qi Zhu, Wei Huang, Kenji Kawaguchi, Xiaokui Xiao

To answer this question, we theoretically study the concentration property of features obtained by neighborhood aggregation on homophilic and heterophilic graphs, introduce the single-pass augmentation-free graph contrastive learning loss based on the property, and provide performance guarantees for the minimizer of the loss on downstream tasks.

Contrastive Learning

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

1 code implementation 27 Dec 2022 Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi

Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels.

Data Augmentation
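
The interpolation described above is the standard mixup construction; a minimal sketch, with the mixing weight drawn from an assumed Beta(alpha, alpha) distribution:

```python
import torch

def mixup(x, y_onehot, alpha=0.2):
    """Sketch: convex combinations of a batch with a shuffled copy of
    itself, applied to both inputs and one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix
```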

Auxiliary Learning as an Asymmetric Bargaining Game

1 code implementation 31 Jan 2023 Aviv Shamsian, Aviv Navon, Neta Glazer, Kenji Kawaguchi, Gal Chechik, Ethan Fetaya

Auxiliary learning is an effective method for enhancing the generalization capabilities of trained models, particularly when dealing with small datasets.

Auxiliary Learning

An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization

no code implementations 1 Mar 2023 Ravid Shwartz-Ziv, Randall Balestriero, Kenji Kawaguchi, Tim G. J. Rudner, Yann Lecun

In this paper, we provide an information-theoretic perspective on Variance-Invariance-Covariance Regularization (VICReg) for self-supervised learning.

Self-Supervised Learning · Transfer Learning

Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks

2 code implementations 8 Apr 2023 Yuzhen Mao, Zhun Deng, Huaxiu Yao, Ting Ye, Kenji Kawaguchi, James Zou

As machine learning has been deployed ubiquitously across applications in modern data science, algorithmic fairness has become a great concern.

Fairness · Open-Ended Question Answering · +1

Self-Evaluation Guided Beam Search for Reasoning

no code implementations NeurIPS 2023 Yuxi Xie, Kenji Kawaguchi, Yiran Zhao, Xu Zhao, Min-Yen Kan, Junxian He, Qizhe Xie

Stochastic beam search balances exploitation and exploration of the search space with temperature-controlled randomness.

Arithmetic Reasoning · GSM8K · +3
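
One reading of "temperature-controlled randomness" is to sample the next beam from a softmax over candidate scores instead of taking a deterministic top-k; a sketch under that assumption, not necessarily the paper's exact procedure:

```python
import torch

def stochastic_beam_step(candidate_scores, beam_width, temperature=0.5):
    # Lower temperature -> near-greedy selection (exploitation);
    # higher temperature -> closer to uniform sampling (exploration).
    probs = torch.softmax(candidate_scores / temperature, dim=0)
    return torch.multinomial(probs, beam_width, replacement=False)
```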

Boosting Visual-Language Models by Exploiting Hard Samples

1 code implementation 9 May 2023 Haonan Wang, Minbin Huang, Runhui Huang, Lanqing Hong, Hang Xu, Tianyang Hu, Xiaodan Liang, Zhenguo Li, Hong Cheng, Kenji Kawaguchi

In this work, we present HELIP, a cost-effective strategy tailored to enhance the performance of existing CLIP models without the need for training a model from scratch or collecting additional data.

Retrieval · Zero-Shot Learning

Automatic Model Selection with Large Language Models for Reasoning

1 code implementation 23 May 2023 James Xu Zhao, Yuxi Xie, Kenji Kawaguchi, Junxian He, Michael Qizhe Xie

Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths.

Arithmetic Reasoning · GSM8K · +4

Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

1 code implementation NeurIPS 2023 Minki Kang, Seanie Lee, Jinheon Baek, Kenji Kawaguchi, Sung Ju Hwang

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge.

Memorization · StrategyQA

How Does Information Bottleneck Help Deep Learning?

1 code implementation 30 May 2023 Kenji Kawaguchi, Zhun Deng, Xu Ji, Jiaoyang Huang

In this paper, we provide the first rigorous learning theory for justifying the benefit of information bottleneck in deep learning by mathematically relating information bottleneck to generalization errors.

Generalization Bounds · Learning Theory

Fast Diffusion Model

1 code implementation 12 Jun 2023 Zike Wu, Pan Zhou, Kenji Kawaguchi, Hanwang Zhang

In this paper, we propose a Fast Diffusion Model (FDM) to significantly speed up DMs from a stochastic optimization perspective for both faster training and sampling.

Image Generation

Multi-View Class Incremental Learning

no code implementations 16 Jun 2023 Depeng Li, Tianqi Wang, Junwei Chen, Kenji Kawaguchi, Cheng Lian, Zhigang Zeng

Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.

Class Incremental Learning · Decision Making · +3

Tackling the Curse of Dimensionality with Physics-Informed Neural Networks

no code implementations 23 Jul 2023 Zheyuan Hu, Khemraj Shukla, George Em Karniadakis, Kenji Kawaguchi

We demonstrate in diverse tests that the proposed method can solve many notoriously hard high-dimensional PDEs, including the Hamilton-Jacobi-Bellman (HJB) and the Schrödinger equations in tens of thousands of dimensions, very fast on a single GPU using the mesh-free PINN approach.
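
For context, the mesh-free PINN loss penalizes the PDE residual at sampled collocation points, with derivatives obtained by autograd; a minimal sketch for a 1D heat equation u_t = nu * u_xx (the paper's high-dimensional method builds on, but goes well beyond, this basic form):

```python
import torch

def pinn_residual_loss(u_net, x, t, nu=0.01):
    # u_net maps (x, t) -> u(x, t); x, t: (N, 1) sampled collocation points
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = u_net(torch.cat([x, t], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    # mean squared residual of u_t = nu * u_xx
    return ((u_t - nu * u_xx) ** 2).mean()
```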

A Dual-Perspective Approach to Evaluating Feature Attribution Methods

no code implementations 17 Aug 2023 Yawei Li, Yang Zhang, Kenji Kawaguchi, Ashkan Khakzar, Bernd Bischl, Mina Rezaei

We apply these metrics to mainstream attribution methods, offering a novel lens through which to analyze and compare feature attribution methods.

On Copyright Risks of Text-to-Image Diffusion Models

no code implementations 15 Sep 2023 Yang Zhang, Teoh Tze Tzun, Lim Wei Hern, Haonan Wang, Kenji Kawaguchi

Specifically, we introduce a data generation pipeline to systematically produce data for studying copyright in diffusion models.

Drug Discovery with Dynamic Goal-aware Fragments

no code implementations 2 Oct 2023 Seul Lee, Seanie Lee, Kenji Kawaguchi, Sung Ju Hwang

Additionally, the existing fragment-based generative models cannot update the fragment vocabulary with goal-aware fragments newly discovered during the generation.

Drug Discovery

Self-Supervised Dataset Distillation for Transfer Learning

2 code implementations 10 Oct 2023 Dong Bok Lee, Seanie Lee, Joonho Ko, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

To achieve this, we also introduce the MSE between representations of the inner model and the self-supervised target model on the original full dataset for outer optimization.

Bilevel Optimization · Meta-Learning · +3

Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules

1 code implementation NeurIPS 2023 Zhiyuan Liu, Yaorui Shi, An Zhang, Enzhi Zhang, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua

Our results show that a subgraph-level tokenizer and a sufficiently expressive decoder with remask decoding have a large impact on the encoder's representation learning.

Representation Learning · Self-Supervised Learning

ChOiRe: Characterizing and Predicting Human Opinions with Chain of Opinion Reasoning

no code implementations 14 Nov 2023 Xuan Long Do, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen

Aligning language models (LMs) with human opinion is challenging yet vital to enhance their grasp of human values, preferences, and beliefs.

Bias-Variance Trade-off in Physics-Informed Neural Networks with Randomized Smoothing for High-Dimensional PDEs

no code implementations 26 Nov 2023 Zheyuan Hu, Zhouhao Yang, Yezhen Wang, George Em Karniadakis, Kenji Kawaguchi

To optimize the bias-variance trade-off, we combine the two approaches in a hybrid method that balances the rapid convergence of the biased version with the high accuracy of the unbiased version.

Computational Efficiency

VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models

1 code implementation 29 Nov 2023 Xiang Li, Qianli Shen, Kenji Kawaguchi

The booming use of text-to-image generative models has raised concerns about their high risk of producing copyright-infringing content.

Prompt Optimization via Adversarial In-Context Learning

no code implementations 5 Dec 2023 Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He

We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompts for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier.

Arithmetic Reasoning · Data-to-Text Generation · +2

Hutchinson Trace Estimation for High-Dimensional and High-Order Physics-Informed Neural Networks

1 code implementation 22 Dec 2023 Zheyuan Hu, Zekun Shi, George Em Karniadakis, Kenji Kawaguchi

We further showcase HTE's convergence to the original PINN loss and its unbiased behavior under specific conditions.
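
The Hutchinson estimator itself is standard: tr(A) = E[v^T A v] for random sign vectors v, so a trace (e.g., of a Hessian appearing in a high-order PINN loss) can be estimated from matrix-vector products alone. A minimal sketch:

```python
import torch

def hutchinson_trace(matvec, dim, n_samples=64):
    """Estimate tr(A) given only v -> A @ v, using Rademacher probes."""
    est = 0.0
    for _ in range(n_samples):
        v = torch.randint(0, 2, (dim,), dtype=torch.float32) * 2 - 1  # +/-1 entries
        est = est + v @ matvec(v)
    return est / n_samples
```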

Can AI Be as Creative as Humans?

no code implementations 3 Jan 2024 Haonan Wang, James Zou, Michael Mozer, Anirudh Goyal, Alex Lamb, Linjun Zhang, Weijie J Su, Zhun Deng, Michael Qizhe Xie, Hannah Brown, Kenji Kawaguchi

With the rise of advanced generative AI models capable of tasks once reserved for human creativity, the study of AI's creative potential becomes imperative for its responsible development and application.

Simple Hierarchical Planning with Diffusion

no code implementations 5 Jan 2024 Chang Chen, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets.

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

no code implementations 7 Jan 2024 Haonan Wang, Qianli Shen, Yao Tong, Yang Zhang, Kenji Kawaguchi

This study explores the vulnerabilities associated with copyright protection in DMs by introducing a backdoor data poisoning attack (SilentBadDiffusion) against text-to-image diffusion models.

Data Poisoning · Image Inpainting

Towards 3D Molecule-Text Interpretation in Language Models

1 code implementation 25 Jan 2024 Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian

Through 3D molecule-text alignment and 3D molecule-centric instruction tuning, 3D-MoLM integrates a 3D molecular encoder with a language model (LM).

Instruction Following · Language Modelling · +3

Score-Based Physics-Informed Neural Networks for High-Dimensional Fokker-Planck Equations

no code implementations 12 Feb 2024 Zheyuan Hu, Zhongqiang Zhang, George Em Karniadakis, Kenji Kawaguchi

The score function, defined as the gradient of the log-likelihood (LL), plays a fundamental role in inferring the LL and the PDF, and enables fast SDE sampling.
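
For reference, the definition the abstract relies on, written out:

```latex
% Score function: the gradient of the log-likelihood with respect to the state x
s(x) = \nabla_x \log p(x)
```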

Unsupervised Concept Discovery Mitigates Spurious Correlations

no code implementations 20 Feb 2024 Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi

Models prone to spurious correlations in training data often produce brittle predictions and introduce unintended biases.

Representation Learning

The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling

no code implementations 23 Feb 2024 Jiajun Ma, Shuchen Xue, Tianyang Hu, Wenjia Wang, Zhaoqiang Liu, Zhenguo Li, Zhi-Ming Ma, Kenji Kawaguchi

Surprisingly, the improvement persists when we increase the number of sampling steps and can even surpass the best result from EDM-2 (1.58) with only 39 NFEs (1.57).

Image Generation

AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging

1 code implementation 29 Feb 2024 Yiran Zhao, Wenxuan Zhang, Huiming Wang, Kenji Kawaguchi, Lidong Bing

In this paper, we acknowledge the mutual reliance between task ability and language ability and direct our attention toward the gap between the target language and the source language on tasks.

Cross-Lingual Transfer

How do Large Language Models Handle Multilingualism?

no code implementations 29 Feb 2024 Yiran Zhao, Wenxuan Zhang, Guizhen Chen, Kenji Kawaguchi, Lidong Bing

We introduce a framework that depicts LLMs' processing of multilingual inputs: In the first several layers, LLMs understand the question, converting multilingual inputs into English to facilitate the task-solving phase.

Accelerating Greedy Coordinate Gradient via Probe Sampling

1 code implementation 2 Mar 2024 Yiran Zhao, Wenyue Zheng, Tianle Cai, Xuan Long Do, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh

Safety of Large Language Models (LLMs) has become a central issue given their rapid progress and wide applications.

Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models

1 code implementation 11 Mar 2024 Yang Zhang, Teoh Tze Tzun, Lim Wei Hern, Tiviatis Sim, Kenji Kawaguchi

Recent advancements in diffusion models have notably improved the perceptual quality of generated images in text-to-image synthesis tasks.

Image Generation

Towards Robust Out-of-Distribution Generalization Bounds via Sharpness

no code implementations 11 Mar 2024 Yingtian Zou, Kenji Kawaguchi, Yingnan Liu, Jiashuo Liu, Mong-Li Lee, Wynne Hsu

To bridge this gap between optimization and OOD generalization, we study how sharpness affects a model's tolerance to data change under domain shift, which is usually captured by "robustness" in generalization.

Generalization Bounds Out-of-Distribution Generalization
