Search Results for author: Juho Lee

Found 72 papers, 33 papers with code

Stochastic Optimal Control for Diffusion Bridges in Function Spaces

no code implementations 31 May 2024 Byoungwoo Park, JungWon Choi, Sungbin Lim, Juho Lee

In this paper, we present a theory of stochastic optimal control (SOC) tailored to infinite-dimensional spaces, aiming to extend diffusion-based algorithms to function spaces.

Time Series

Learning diverse attacks on large language models for robust red-teaming and safety tuning

no code implementations 28 May 2024 Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain

Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe and responsible deployment of large language models (LLMs).

Diversity Language Modelling

Fast Ensembling with Diffusion Schrödinger Bridge

1 code implementation 24 Apr 2024 Hyunsu Kim, Jongmin Yoon, Juho Lee

The Deep Ensemble (DE) approach is a straightforward technique for enhancing the performance of deep neural networks by training them from different initial points so that they converge to various local optima.
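As a minimal illustration of the deep-ensemble idea described in this abstract (a toy numpy sketch with logistic-regression "members", not the paper's method), the snippet below trains several members from different random initial points and averages their predictive probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data: two Gaussian blobs.
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_member(X, y, seed, steps=200, lr=0.1):
    """Gradient descent from a random initial point (one ensemble member)."""
    r = np.random.default_rng(seed)
    w, b = r.normal(size=X.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# A deep ensemble averages the predictive probabilities of members
# trained from different initializations.
members = [train_member(X, y, seed) for seed in range(4)]
probs = np.stack([sigmoid(X @ w + b) for w, b in members])
ensemble_prob = probs.mean(axis=0)
ensemble_acc = np.mean((ensemble_prob > 0.5) == y)
```

The averaged probabilities tend to be better calibrated than any single member, which is the property the diffusion-bridge approach above aims to obtain more cheaply.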

Lipsum-FT: Robust Fine-Tuning of Zero-Shot Models Using Random Text Guidance

1 code implementation 1 Apr 2024 Giung Nam, Byeongho Heo, Juho Lee

Large-scale contrastive vision-language pre-trained models provide zero-shot models that achieve competitive performance across a range of image classification tasks without requiring training on downstream data.

Image Classification Language Modelling

Enhancing Transfer Learning with Flexible Nonparametric Posterior Sampling

no code implementations 12 Mar 2024 Hyungi Lee, Giung Nam, Edwin Fong, Juho Lee

The nonparametric learning (NPL) method is a recent approach that employs a nonparametric prior for posterior sampling, efficiently accounting for model misspecification; this makes it suitable for transfer learning scenarios that may involve distribution shift between upstream and downstream tasks.

Transfer Learning

Joint-Embedding Masked Autoencoder for Self-supervised Learning of Dynamic Functional Connectivity from the Human Brain

no code implementations 11 Mar 2024 JungWon Choi, Hyungi Lee, Byung-Hoon Kim, Juho Lee

Although generative self-supervised learning techniques, especially masked autoencoders, have shown promising results in representation learning in various domains, their application to dynamic graphs for dynamic functional connectivity remains underexplored, facing challenges in capturing high-level semantic representations.

Representation Learning Self-Supervised Learning

A Scalable and Transferable Time Series Prediction Framework for Demand Forecasting

no code implementations 29 Feb 2024 Young-Jin Park, Donghyun Kim, Frédéric Odermatt, Juho Lee, Kyung-Min Kim

Time series forecasting is one of the most essential and ubiquitous tasks in many business problems, including demand forecasting and logistics optimization.

Time Series Time Series Forecasting +1

Sequential Flow Straightening for Generative Modeling

no code implementations 9 Feb 2024 Jongmin Yoon, Juho Lee

Straightening the probability flow of continuous-time generative models, such as diffusion models or flow-based models, is key to fast sampling with numerical solvers; existing methods learn a linear path by directly generating the probability path of the joint distribution between the noise and data distributions.
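The "linear path" mentioned in this abstract can be illustrated with a tiny numpy sketch (toy tensors, not the paper's method): a straight path couples noise with data by linear interpolation, and the velocity regression target along such a path is constant, which is what makes very few solver steps suffice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Couple noise x0 ~ N(0, I) with stand-in "data" samples x1.
x0 = rng.normal(size=(8, 2))           # noise samples
x1 = rng.normal(loc=3.0, size=(8, 2))  # toy data samples

def linear_path(x0, x1, t):
    """Straight probability path between noise (t=0) and data (t=1)."""
    return (1.0 - t) * x0 + t * x1

# On a straight path the velocity target is independent of t.
velocity_target = x1 - x0
```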

A Generative Self-Supervised Framework using Functional Connectivity in fMRI Data

no code implementations 4 Dec 2023 JungWon Choi, Seongho Keum, Eunggu Yun, Byung-Hoon Kim, Juho Lee

Deep neural networks trained on Functional Connectivity (FC) networks extracted from functional Magnetic Resonance Imaging (fMRI) data have gained popularity due to the increasing availability of data and advances in model architectures, including Graph Neural Networks (GNNs).

Graph Neural Network Self-Supervised Learning

Slot-Mixup with Subsampling: A Simple Regularization for WSI Classification

no code implementations 29 Nov 2023 Seongho Keum, Sanghyun Kim, Soojeong Lee, Juho Lee

Due to the lack of patch-level labels, multiple instance learning (MIL) is a common practice for training a WSI classifier.

Multiple Instance Learning

Self-Supervised Dataset Distillation for Transfer Learning

2 code implementations 10 Oct 2023 Dong Bok Lee, Seanie Lee, Joonho Ko, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

To achieve this, we also introduce the MSE between representations of the inner model and the self-supervised target model on the original full dataset for outer optimization.

Bilevel Optimization Dataset Distillation +4
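The outer objective mentioned in this abstract — an MSE between representations of an inner model and a frozen target model — can be sketched in a few lines of numpy (toy linear encoders standing in for the real networks; all names here are illustrative, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
X_full = rng.normal(size=(32, 8))   # original full dataset (features)
W_inner = rng.normal(size=(8, 4))   # toy "inner" encoder (trained on distilled data)
W_target = rng.normal(size=(8, 4))  # toy frozen self-supervised target encoder

def representation_mse(X, W_a, W_b):
    """MSE between the two encoders' representations of the same inputs."""
    Za, Zb = X @ W_a, X @ W_b
    return float(np.mean((Za - Zb) ** 2))

# Outer loss for the bilevel objective: match representations on the full set.
outer_loss = representation_mse(X_full, W_inner, W_target)
```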

Spear and Shield: Adversarial Attacks and Defense Methods for Model-Based Link Prediction on Continuous-Time Dynamic Graphs

1 code implementation 21 Aug 2023 Dongjin Lee, Juho Lee, Kijung Shin

Specifically, before the training procedure of a victim model, which is a TGNN for link prediction, we inject edge perturbations into the data that are unnoticeable in terms of the four constraints we propose, yet effective enough to cause the victim model to malfunction.

Adversarial Attack Link Prediction

Probabilistic Imputation for Time-series Classification with Missing Data

1 code implementation 13 Aug 2023 SeungHyun Kim, Hyunsu Kim, Eunggu Yun, Hwangrae Lee, Jaehun Lee, Juho Lee

In this paper, we propose a novel probabilistic framework for classification with multivariate time series data with missing values.

Imputation Time Series +1

Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models

1 code implementation 12 Jul 2023 Sanghyun Kim, Seohyeon Jung, Balhae Kim, Moonseok Choi, Jinwoo Shin, Juho Lee

Large-scale image generation models, with impressive quality made possible by the vast amount of data available on the Internet, raise social concerns that these models may generate harmful or copyrighted content.

Image Generation

Traversing Between Modes in Function Space for Fast Ensembling

1 code implementation 20 Jun 2023 Eunggu Yun, Hyungi Lee, Giung Nam, Juho Lee

While this provides a way to efficiently train ensembles, for inference, multiple forward passes should still be executed using all the ensemble parameters, which often becomes a serious bottleneck for real-world deployment.

Regularizing Towards Soft Equivariance Under Mixed Symmetries

no code implementations 1 Jun 2023 Hyunsu Kim, Hyungi Lee, Hongseok Yang, Juho Lee

The key component of our method is what we call an equivariance regularizer for a given type of symmetry, which measures how equivariant a model is with respect to symmetries of that type.

Motion Forecasting

Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning

no code implementations 24 May 2023 Moonseok Choi, Hyungi Lee, Giung Nam, Juho Lee

Given the ever-increasing size of modern neural networks, the significance of sparse architectures has surged due to their accelerated inference speeds and minimal memory demands.

Decoupled Training for Long-Tailed Classification With Stochastic Representations

no code implementations 19 Apr 2023 Giung Nam, Sunguk Jang, Juho Lee

Decoupling representation learning and classifier learning has been shown to be effective in classification with long-tailed data.

Classification Representation Learning

Martingale Posterior Neural Processes

no code implementations 19 Apr 2023 Hyungi Lee, Eunggu Yun, Giung Nam, Edwin Fong, Juho Lee

Based on this result, instead of assuming any form of the latent variables, we equip a NP with a predictive distribution implicitly defined with neural networks and use the corresponding martingale posteriors as the source of uncertainty.

Bayesian Inference Gaussian Processes

Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning

1 code implementation 2 Feb 2023 Francois Caron, Fadhel Ayed, Paul Jung, Hoil Lee, Juho Lee, Hongseok Yang

We consider the optimisation of large and shallow neural networks via gradient flow, where the output of each hidden node is scaled by some positive parameter.

Transfer Learning
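The scaled shallow network described in this abstract — each hidden node's output multiplied by a positive parameter — can be sketched directly (toy random weights; the uniform scalings below correspond to the symmetric case, not the paper's asymmetric setup):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 100                        # number of hidden nodes
w = rng.normal(size=m)         # input weights
v = rng.normal(size=m)         # output weights
lam = np.full(m, 1.0 / m)      # positive per-node scaling parameters

def relu(z):
    return np.maximum(z, 0.0)

def f(x):
    """Shallow network output: sum of node outputs scaled by lambda_i."""
    return float(np.sum(lam * v * relu(w * x)))
```

Making `lam` asymmetric across nodes (rather than the uniform `1/m` above) is the regime whose convergence and feature-learning behaviour the paper analyses.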

On Divergence Measures for Bayesian Pseudocoresets

1 code implementation 12 Oct 2022 Balhae Kim, JungWon Choi, Seanie Lee, Yoonho Lee, Jung-Woo Ha, Juho Lee

Finally, we propose a novel Bayesian pseudocoreset algorithm based on minimizing forward KL divergence.

Bayesian Inference Dataset Distillation +1
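The forward KL divergence this abstract refers to, KL(p || q), can be computed for discrete distributions in a few lines (toy histograms here, not the posteriors the paper works with):

```python
import numpy as np

def forward_kl(p, q):
    """Forward KL divergence KL(p || q) between discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
kl_pq = forward_kl(p, q)
```

Note the asymmetry: minimizing the forward direction KL(p || q) encourages q to cover the mass of p, which is the behaviour the paper contrasts with reverse-KL objectives.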

Exploring The Role of Mean Teachers in Self-supervised Masked Auto-Encoders

1 code implementation 5 Oct 2022 Youngwan Lee, Jeffrey Willette, Jonghee Kim, Juho Lee, Sung Ju Hwang

Masked image modeling (MIM) has become a popular strategy for self-supervised learning (SSL) of visual representations with Vision Transformers.

Classification Instance Segmentation +4

Self-Distillation for Further Pre-training of Transformers

no code implementations 30 Sep 2022 Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi

Pre-training a large transformer model on a massive amount of unlabeled data and fine-tuning it on labeled datasets for diverse downstream tasks has proven to be a successful strategy for a variety of vision and natural language processing tasks.

Text Classification

Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation

1 code implementation 26 Aug 2022 Jeffrey Willette, Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

Recent work on mini-batch consistency (MBC) for set functions has brought attention to the need for sequentially processing and aggregating chunks of a partitioned set while guaranteeing the same output for all partitions.

Point Cloud Classification Text Classification +1

Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation

1 code implementation 30 Jun 2022 Giung Nam, Hyungi Lee, Byeongho Heo, Juho Lee

Ensembles of deep neural networks have demonstrated superior performance, but their heavy computational cost hinders their application in resource-limited environments.

Diversity Image Classification

Set-based Meta-Interpolation for Few-Task Meta-Learning

no code implementations 20 May 2022 Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

Recently, several task augmentation methods have been proposed to tackle this issue using domain-specific knowledge to design augmentation techniques to densify the meta-training task distribution.

Bilevel Optimization Image Classification +6

Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility

1 code implementation 17 May 2022 Hoil Lee, Fadhel Ayed, Paul Jung, Juho Lee, Hongseok Yang, François Caron

Under this model, we show that each layer of the infinite-width neural network can be characterised by two simple quantities: a non-negative scalar parameter and a Lévy measure on the positive reals.

Gaussian Processes Representation Learning

PolarDenseNet: A Deep Learning Model for CSI Feedback in MIMO Systems

no code implementations 2 Feb 2022 Pranav Madadi, Jeongho Jeon, Joonyoung Cho, Caleb Lo, Juho Lee, Jianzhong Zhang

In multiple-input multiple-output (MIMO) systems, high-resolution channel state information (CSI) is required at the base station (BS) to ensure optimal performance, especially in the case of multi-user MIMO (MU-MIMO) systems.

Diversity Matters When Learning From Ensembles

no code implementations NeurIPS 2021 Giung Nam, Jongmin Yoon, Yoonho Lee, Juho Lee

We propose a simple approach for reducing this gap, i.e., making the distilled performance close to that of the full ensemble.

Diversity Image Classification
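The distillation gap this abstract refers to arises when a single student is trained to match the averaged predictive distribution of an ensemble. A minimal numpy sketch of that objective (toy logits, not the paper's method):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Teacher: averaged predictive distribution of three ensemble members.
member_logits = np.array([[[2.0, 0.5, -1.0]],
                          [[1.5, 1.0, -0.5]],
                          [[2.5, 0.0, -1.5]]])  # (members, batch, classes)
teacher_prob = softmax(member_logits, axis=-1).mean(axis=0)

# Student prediction for the same (single) input.
student_prob = softmax(np.array([[1.8, 0.6, -1.2]]))

# Distillation loss: KL(teacher || student), summed over classes.
distill_loss = float(np.sum(
    teacher_prob * (np.log(teacher_prob) - np.log(student_prob))))
```

Matching only this averaged distribution discards the diversity among members, which is the gap the paper's approach targets.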

Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty

no code implementations 12 Oct 2021 Jeffrey Willette, Hae Beom Lee, Juho Lee, Sung Ju Hwang

Numerous recent works utilize bi-Lipschitz regularization of neural network layers to preserve relative distances between data instances in the feature spaces of each layer.

Meta-Learning Out of Distribution (OOD) Detection

Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning

no code implementations ICLR 2022 Seanie Lee, Hae Beom Lee, Juho Lee, Sung Ju Hwang

Thanks to the gradients aligned between tasks by our method, the model becomes less vulnerable to negative transfer and catastrophic forgetting.

Continual Learning Multi-Task Learning +1

Task Conditioned Stochastic Subsampling

no code implementations 29 Sep 2021 Andreis Bruno, Seanie Lee, A. Tuan Nguyen, Juho Lee, Eunho Yang, Sung Ju Hwang

Deep Learning algorithms are designed to operate on huge volumes of high dimensional data such as images.

Image Classification Image Reconstruction

Meta Learning Low Rank Covariance Factors for Energy Based Deterministic Uncertainty

no code implementations ICLR 2022 Jeffrey Ryan Willette, Hae Beom Lee, Juho Lee, Sung Ju Hwang

Numerous recent works utilize bi-Lipschitz regularization of neural network layers to preserve relative distances between data instances in the feature spaces of each layer.

Meta-Learning Out of Distribution (OOD) Detection

Assumption-Free Survival Analysis Under Local Smoothness Prior

no code implementations 29 Sep 2021 Seungjae Jung, Min-Kyu Kim, Juho Lee, Young-Jin Park, Nahyeon Park, Kyung-Min Kim

Survival analysis appears in various fields such as medicine, economics, engineering, and business.

Survival Analysis

Scale Mixtures of Neural Network Gaussian Processes

1 code implementation ICLR 2022 Hyungi Lee, Eunggu Yun, Hongseok Yang, Juho Lee

We show that simply introducing a scale prior on the last-layer parameters can turn infinitely-wide neural networks of any architecture into a richer class of stochastic processes.

Gaussian Processes

Adversarial purification with Score-based generative models

1 code implementation 11 Jun 2021 Jongmin Yoon, Sung Ju Hwang, Juho Lee

Recently, an Energy-Based Model (EBM) trained with Markov-Chain Monte-Carlo (MCMC) has been highlighted as a purification model, where an attacked image is purified by running a long Markov-chain using the gradients of the EBM.

Adversarial Purification Denoising
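Purification by running a chain along the gradients of a density, as this abstract describes, can be illustrated with a toy Langevin sketch (a known Gaussian score stands in for the learned EBM/score model):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.zeros(2)  # mode of the toy data density

def score(x):
    """Gradient of log N(mu, I) — a stand-in for a learned score/EBM gradient."""
    return -(x - mu)

x = np.array([4.0, -4.0])  # "attacked" point, far from the data mode
eta = 0.1                  # step size
for _ in range(200):
    # Langevin step: follow the score plus a small amount of noise.
    x = x + eta * score(x) + np.sqrt(2 * eta) * 0.01 * rng.normal(size=2)
```

Running the chain pulls the perturbed point back toward the mode; the real method does this with a learned score-based model on images.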

Learning to Pool in Graph Neural Networks for Extrapolation

no code implementations 11 Jun 2021 Jihoon Ko, Taehyung Kwon, Kijung Shin, Juho Lee

However, according to a recent study, a careful choice of pooling functions, which are used for the aggregation and readout operations in GNNs, is crucial for enabling GNNs to extrapolate.

Hybrid Generative-Contrastive Representation Learning

1 code implementation 11 Jun 2021 Saehoon Kim, Sungwoong Kim, Juho Lee

On the other hand, the generative pre-training directly estimates the data distribution, so the representations tend to be robust but not optimal for discriminative tasks.

Contrastive Learning Decoder +1

Learning to Perturb Word Embeddings for Out-of-distribution QA

1 code implementation ACL 2021 Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang

QA models based on pretrained language models have achieved remarkable performance on various benchmark datasets. However, QA models do not generalize well to unseen data that falls outside the training distribution, due to distributional shifts. Data augmentation (DA) techniques which drop/replace words have been shown to be effective in regularizing the model from overfitting to the training data. Yet, they may adversely affect QA tasks, since they incur semantic changes that may lead to wrong answers.

Data Augmentation Domain Generalization +1

SetVAE: Learning Hierarchical Composition for Generative Modeling of Set-Structured Data

2 code implementations CVPR 2021 Jinwoo Kim, Jaehoon Yoo, Juho Lee, Seunghoon Hong

Generative modeling of set-structured data, such as point clouds, requires reasoning over local and global structures at various scales.

Point Cloud Generation

Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding

no code implementations NeurIPS 2021 Bruno Andreis, Jeffrey Willette, Juho Lee, Sung Ju Hwang

The proposed method adheres to the required symmetries of invariance and equivariance as well as maintaining MBC for any partition of the input set.
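Mini-batch consistency (MBC), the property this abstract refers to, means a set encoder gives the same output whether the set is processed whole or in chunks whose partial aggregates are combined. A toy numpy sketch (a per-element linear map with a sum aggregator, not the paper's slot encoder):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))  # toy per-element feature map

def encode_chunk(chunk):
    """Partial aggregate for a chunk of the set (sum is order-invariant)."""
    return (chunk @ W).sum(axis=0)

X = rng.normal(size=(10, 3))  # a set of 10 elements

full = encode_chunk(X)                              # process the whole set
parts = encode_chunk(X[:4]) + encode_chunk(X[4:])   # process in two chunks
```

Any partition of `X` yields the same combined encoding, and reordering the elements leaves it unchanged — the invariance and consistency properties the paper's encoder is designed to preserve with a more expressive architecture.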

Improving Uncertainty Calibration via Prior Augmented Data

no code implementations 22 Feb 2021 Jeffrey Willette, Juho Lee, Sung Ju Hwang

Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.

A Multi-Mode Modulator for Multi-Domain Few-Shot Classification

1 code implementation ICCV 2021 Yanbin Liu, Juho Lee, Linchao Zhu, Ling Chen, Humphrey Shi, Yi Yang

Most existing few-shot classification methods only consider generalization on one dataset (i.e., single-domain), failing to transfer across various seen and unseen domains.

Classification Domain Generalization

Improving Neural Network Accuracy and Calibration Under Distributional Shift with Prior Augmented Data

no code implementations 1 Jan 2021 Jeffrey Ryan Willette, Juho Lee, Sung Ju Hwang

We demonstrate the effectiveness of our method and validate its performance on both classification and regression problems by applying it to the training of recent state-of-the-art neural network models.

Adaptive Strategy for Resetting a Non-stationary Markov Chain during Learning via Joint Stochastic Approximation

no code implementations AABI Symposium 2021 Hyunsu Kim, Juho Lee, Hongseok Yang

The non-stationary kernel problem refers to the degraded performance of the algorithm due to the constant change of the transition kernel of the chain throughout the run of the algorithm.

Amortized Probabilistic Detection of Communities in Graphs

2 code implementations 29 Oct 2020 Yueqi Wang, Yoonho Lee, Pallab Basu, Juho Lee, Yee Whye Teh, Liam Paninski, Ari Pakman

While graph neural networks (GNNs) have been successful in encoding graph structures, existing GNN-based methods for community detection are limited by requiring knowledge of the number of communities in advance, in addition to lacking a proper probabilistic formulation to handle uncertainty.

Clustering Community Detection

Neural Complexity Measures

1 code implementation NeurIPS 2020 Yoonho Lee, Juho Lee, Sung Ju Hwang, Eunho Yang, Seungjin Choi

While various complexity measures for deep neural networks exist, specifying an appropriate measure capable of predicting and explaining generalization in deep networks has proven challenging.

Meta-Learning regression

Bootstrapping Neural Processes

1 code implementation NeurIPS 2020 Juho Lee, Yoonho Lee, Jungtaek Kim, Eunho Yang, Sung Ju Hwang, Yee Whye Teh

While this "data-driven" way of learning stochastic processes has proven to handle various types of data, NPs still rely on an assumption that uncertainty in stochastic processes is modeled by a single latent variable, which potentially limits the flexibility.

Set Based Stochastic Subsampling

no code implementations 25 Jun 2020 Bruno Andreis, Seanie Lee, A. Tuan Nguyen, Juho Lee, Eunho Yang, Sung Ju Hwang

Deep models are designed to operate on huge volumes of high dimensional data such as images.

feature selection Image Classification +2

Graph Embedding VAE: A Permutation Invariant Model of Graph Structure

no code implementations 17 Oct 2019 Tony Duan, Juho Lee

Generative models of graph structure have applications in biology and social sciences.

Graph Embedding Graph Generation

Deep Amortized Clustering

no code implementations ICLR 2020 Juho Lee, Yoonho Lee, Yee Whye Teh

We propose deep amortized clustering (DAC), a neural architecture which learns to cluster datasets efficiently using a few forward passes.

Clustering

A unified construction for series representations and finite approximations of completely random measures

no code implementations 26 May 2019 Juho Lee, Xenia Miscouridou, François Caron

In particular, we show that one can get novel series representations for the generalized gamma process and the stable beta process.

Clustering Density Estimation +1

Beyond the Chinese Restaurant and Pitman-Yor processes: Statistical Models with Double Power-law Behavior

1 code implementation 13 Feb 2019 Fadhel Ayed, Juho Lee, François Caron

Bayesian nonparametric approaches, in particular the Pitman-Yor process and the associated two-parameter Chinese Restaurant process, have been successfully used in applications where the data exhibit a power-law behavior.

A Bayesian model for sparse graphs with flexible degree distribution and overlapping community structure

1 code implementation 3 Oct 2018 Juho Lee, Lancelot F. James, Seungjin Choi, François Caron

We consider a non-projective class of inhomogeneous random graph models with interpretable parameters and a number of interesting asymptotic properties.

Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks

9 code implementations 1 Oct 2018 Juho Lee, Yoonho Lee, Jungtaek Kim, Adam R. Kosiorek, Seungjin Choi, Yee Whye Teh

Many machine learning tasks such as multiple instance learning, 3D shape recognition, and few-shot image classification are defined on sets of instances.

3D Shape Recognition Decoder +2
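Tasks defined on sets require outputs that do not depend on element order. A miniature numpy sketch in the spirit of the paper's attention-based pooling (a random query standing in for a learned seed vector; this is an illustration, not the Set Transformer itself):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 4
seed = rng.normal(size=(1, d))  # learned query vector (random here)
X = rng.normal(size=(6, d))     # a set of 6 elements

def attention_pool(seed, X):
    """Pool a set by letting a seed vector attend over its elements."""
    scores = softmax(seed @ X.T / np.sqrt(X.shape[1]))  # (1, n) weights
    return scores @ X                                    # (1, d) summary

pooled = attention_pool(seed, X)
pooled_shuffled = attention_pool(seed, X[rng.permutation(len(X))])
```

Because the attention weights permute along with the elements, the pooled summary is invariant to the set's ordering.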

Adaptive Network Sparsification via Dependent Variational Beta-Bernoulli Dropout

no code implementations 27 Sep 2018 Juho Lee, Saehoon Kim, Jaehong Yoon, Hae Beom Lee, Eunho Yang, Sung Ju Hwang

With such input-independent dropout, each neuron is evolved to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss.

Deep Mixed Effect Model using Gaussian Processes: A Personalized and Reliable Prediction for Healthcare

2 code implementations 5 Jun 2018 Ingyo Chung, Saehoon Kim, Juho Lee, Kwang Joon Kim, Sung Ju Hwang, Eunho Yang

We present a personalized and reliable prediction model for healthcare, which can provide individually tailored medical services such as diagnosis, disease treatment, and prevention.

Gaussian Processes Time Series +1

Adaptive Network Sparsification with Dependent Variational Beta-Bernoulli Dropout

1 code implementation 28 May 2018 Juho Lee, Saehoon Kim, Jaehong Yoon, Hae Beom Lee, Eunho Yang, Sung Ju Hwang

With such input-independent dropout, each neuron is evolved to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss.

DropMax: Adaptive Stochastic Softmax

no code implementations ICLR 2018 Hae Beom Lee, Juho Lee, Eunho Yang, Sung Ju Hwang

Moreover, the learning of dropout probabilities for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes.

Classification General Classification +1

DropMax: Adaptive Variational Softmax

4 code implementations NeurIPS 2018 Hae Beom Lee, Juho Lee, Saehoon Kim, Eunho Yang, Sung Ju Hwang

Moreover, the learning of dropout rates for non-target classes on each instance allows the classifier to focus more on classification against the most confusing classes.

Classification General Classification +1

Bayesian inference on random simple graphs with power law degree distributions

no code implementations ICML 2017 Juho Lee, Creighton Heaukulani, Zoubin Ghahramani, Lancelot F. James, Seungjin Choi

The BFRY random variables are well approximated by gamma random variables in a variational Bayesian inference routine, which we apply to several network datasets for which power law degree distributions are a natural assumption.

Bayesian Inference

Finite-Dimensional BFRY Priors and Variational Bayesian Inference for Power Law Models

no code implementations NeurIPS 2016 Juho Lee, Lancelot F. James, Seungjin Choi

Bayesian nonparametric methods based on the Dirichlet process (DP), gamma process and beta process, have proven effective in capturing aspects of various datasets arising in machine learning.

Bayesian Inference

Tree-Guided MCMC Inference for Normalized Random Measure Mixture Models

no code implementations NeurIPS 2015 Juho Lee, Seungjin Choi

Normalized random measures (NRMs) provide a broad class of discrete random measures that are often used as priors for Bayesian nonparametric models.

Clustering

Bayesian Hierarchical Clustering with Exponential Family: Small-Variance Asymptotics and Reducibility

no code implementations 29 Jan 2015 Juho Lee, Seungjin Choi

Bayesian hierarchical clustering (BHC) is an agglomerative clustering method, where a probabilistic model is defined and its marginal likelihoods are evaluated to decide which clusters to merge.

Clustering
