Search Results for author: Kangwook Lee

Found 46 papers, 28 papers with code

Debiasing Pre-Trained Language Models via Efficient Fine-Tuning

1 code implementation LTEDI (ACL) 2022 Michael Gira, Ruisu Zhang, Kangwook Lee

An explosion in the popularity of transformer-based language models (such as GPT-3, BERT, RoBERTa, and ALBERT) has opened the doors to new machine learning applications involving language modeling, text generation, and more.

Language Modelling Text Generation

Learning to Embed Multi-Modal Contexts for Situated Conversational Agents

no code implementations Findings (NAACL) 2022 Haeju Lee, Oh Joon Kwon, Yunseon Choi, Minho Park, Ran Han, Yoonhyung Kim, Jinhyeon Kim, Youngjune Lee, Haebin Shin, Kangwook Lee, Kee-Eung Kim

The Situated Interactive Multi-Modal Conversations (SIMMC) 2.0 aims to create virtual shopping assistants that can accept complex multi-modal inputs, i.e., visual appearances of objects and user utterances.

coreference-resolution Decoder +4

Dual Operating Modes of In-Context Learning

1 code implementation 29 Feb 2024 Ziqian Lin, Kangwook Lee

We introduce a probabilistic model, with which one can explain the dual operating modes of ICL simultaneously.

In-Context Learning Retrieval

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

2 code implementations 6 Feb 2024 Jongho Park, Jaeseung Park, Zheyang Xiong, Nayoung Lee, Jaewoong Cho, Samet Oymak, Kangwook Lee, Dimitris Papailiopoulos

State-space models (SSMs), such as Mamba (Gu & Dao, 2023), have been proposed as alternatives to Transformer networks in language modeling, by incorporating gating, convolutions, and input-dependent token selection to mitigate the quadratic cost of multi-head attention.

In-Context Learning Language Modelling +1

Can MLLMs Perform Text-to-Image In-Context Learning?

1 code implementation 2 Feb 2024 Yuchen Zeng, Wonjun Kang, Yicong Chen, Hyung Il Koo, Kangwook Lee

The evolution from Large Language Models (LLMs) to Multimodal Large Language Models (MLLMs) has spurred research into extending In-Context Learning (ICL) to its multimodal counterpart.

Image Generation In-Context Learning

Looped Transformers are Better at Learning Learning Algorithms

1 code implementation 21 Nov 2023 Liu Yang, Kangwook Lee, Robert Nowak, Dimitris Papailiopoulos

Transformers have demonstrated effectiveness in solving data-fitting problems in context from various (latent) models, as reported by Garg et al.

Image Clustering Conditioned on Text Criteria

1 code implementation 27 Oct 2023 Sehyun Kwon, Jaeseung Park, Minkyu Kim, Jaewoong Cho, Ernest K. Ryu, Kangwook Lee

Classical clustering methods do not provide users with direct control of the clustering results, and the clustering results may not be consistent with the relevant criterion that a user has in mind.

Clustering Image Clustering

The Expressive Power of Low-Rank Adaptation

1 code implementation 26 Oct 2023 Yuchen Zeng, Kangwook Lee

Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method that leverages low-rank adaptation of weight matrices, has emerged as a prevalent technique for fine-tuning pre-trained models such as large language models and diffusion models.

Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding

no code implementations 12 Jul 2023 Seongjun Yang, Gibbeum Lee, Jaewoong Cho, Dimitris Papailiopoulos, Kangwook Lee

This paper presents "Predictive Pipelined Decoding (PPD)," an approach that speeds up greedy decoding in Large Language Models (LLMs) while maintaining the exact same output as the original decoding.

Teaching Arithmetic to Small Transformers

1 code implementation 7 Jul 2023 Nayoung Lee, Kartik Sreenivasan, Jason D. Lee, Kangwook Lee, Dimitris Papailiopoulos

Even in the complete absence of pretraining, this approach significantly and simultaneously improves accuracy, sample complexity, and convergence speed.

Low-Rank Matrix Completion

Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression

1 code implementation 25 May 2023 Joseph Shenouda, Rahul Parhi, Kangwook Lee, Robert D. Nowak

This representer theorem establishes that shallow vector-valued neural networks are the solutions to data-fitting problems over these infinite-dimensional spaces, where the network widths are bounded by the square of the number of training data.

Multi-Task Learning Neural Network Compression

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

2 code implementations 25 May 2023 Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, MoonKyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee

We focus on diffusion models, defining the fine-tuning task as an RL problem, and updating the pre-trained text-to-image diffusion models using policy gradient to maximize the feedback-trained reward.

reinforcement-learning Reinforcement Learning (RL)

Prompted LLMs as Chatbot Modules for Long Open-domain Conversation

1 code implementation 8 May 2023 Gibbeum Lee, Volker Hartmann, Jongho Park, Dimitris Papailiopoulos, Kangwook Lee

In this paper, we propose MPC (Modular Prompted Chatbot), a new approach for creating high-quality conversational agents without the need for fine-tuning.


Improving Fair Training under Correlation Shifts

no code implementations 5 Feb 2023 Yuji Roh, Kangwook Lee, Steven Euijong Whang, Changho Suh

First, we analytically show that existing in-processing fair algorithms have fundamental limits in accuracy and group fairness.


Optimizing DDPM Sampling with Shortcut Fine-Tuning

1 code implementation 31 Jan 2023 Ying Fan, Kangwook Lee

In this study, we propose Shortcut Fine-Tuning (SFT), a new approach for addressing the challenge of fast sampling of pretrained Denoising Diffusion Probabilistic Models (DDPMs).

Denoising Reinforcement Learning (RL)

Looped Transformers as Programmable Computers

1 code implementation 30 Jan 2023 Angeliki Giannou, Shashank Rajput, Jy-yong Sohn, Kangwook Lee, Jason D. Lee, Dimitris Papailiopoulos

We present a framework for using transformer networks as universal computers by programming them with specific weights and placing them in a loop.

In-Context Learning

Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance

1 code implementation 13 Dec 2022 Dohyun Kwon, Ying Fan, Kangwook Lee

Specifically, we prove that the Wasserstein distance is upper bounded by the square root of the objective function up to multiplicative constants and a fixed constant offset.

Audio Synthesis Image Generation

Outlier-Robust Group Inference via Gradient Space Clustering

1 code implementation 13 Oct 2022 Yuchen Zeng, Kristjan Greenewald, Kangwook Lee, Justin Solomon, Mikhail Yurochkin

Traditional machine learning models focus on achieving good performance on the overall training distribution, but they often underperform on minority groups.


Equal Improvability: A New Fairness Notion Considering the Long-term Impact

1 code implementation 13 Oct 2022 Ozgur Guldogan, Yuchen Zeng, Jy-yong Sohn, Ramtin Pedarsani, Kangwook Lee

In order to promote long-term fairness, we propose a new fairness notion called Equal Improvability (EI), which equalizes the potential acceptance rate of the rejected samples across different groups assuming a bounded level of effort will be spent by each rejected sample.


PathProx: A Proximal Gradient Algorithm for Weight Decay Regularized Deep Neural Networks

no code implementations 6 Oct 2022 Liu Yang, Jifan Zhang, Joseph Shenouda, Dimitris Papailiopoulos, Kangwook Lee, Robert D. Nowak

Weight decay is one of the most widely used forms of regularization in deep learning, and has been shown to improve generalization and robustness.

LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks

1 code implementation 14 Jun 2022 Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee

LIFT does not make any changes to the model architecture or loss function, and it solely relies on the natural language interface, enabling "no-code machine learning with LMs."

BIG-bench Machine Learning General Classification +2

Breaking Fair Binary Classification with Optimal Flipping Attacks

no code implementations 12 Apr 2022 Changhun Jo, Jy-yong Sohn, Kangwook Lee

Minimizing risk with fairness constraints is one of the popular approaches to learning a fair classifier.

Binary Classification Classification +2

Rare Gems: Finding Lottery Tickets at Initialization

1 code implementation 24 Feb 2022 Kartik Sreenivasan, Jy-yong Sohn, Liu Yang, Matthew Grinde, Alliot Nagle, Hongyi Wang, Eric Xing, Kangwook Lee, Dimitris Papailiopoulos

Frankle & Carbin conjecture that we can avoid this by training "lottery tickets", i.e., special sparse subnetworks found at initialization, that can be trained to high accuracy.

Improved Input Reprogramming for GAN Conditioning

1 code implementation 7 Jan 2022 Tuan Dinh, Daewon Seo, Zhixu Du, Liang Shang, Kangwook Lee

Motivated by real-world scenarios with scarce labeled data, we focus on the input reprogramming approach and carefully analyze the existing algorithm.

Improving Fairness via Federated Learning

2 code implementations 29 Oct 2021 Yuchen Zeng, Hongxu Chen, Kangwook Lee

We then theoretically and empirically show that the performance tradeoff of FedAvg-based fair learning algorithms is strictly worse than that of a fair classifier trained on centralized data.

Fairness Federated Learning

Gradient Inversion with Generative Image Prior

1 code implementation NeurIPS 2021 Jinwoo Jeon, Jaechang Kim, Kangwook Lee, Sewoong Oh, Jungseul Ok

Federated Learning (FL) is a distributed learning framework in which the local data never leaves clients' devices, to preserve privacy, and the server trains models on the data by accessing only the gradients of those local data.

Federated Learning

Coded-InvNet for Resilient Prediction Serving Systems

no code implementations 11 Jun 2021 Tuan Dinh, Kangwook Lee

Inspired by a new coded computation algorithm for invertible functions, we propose Coded-InvNet, a new approach to designing resilient prediction serving systems that can gracefully handle stragglers or node failures.


Permutation-Based SGD: Is Random Optimal?

1 code implementation ICLR 2022 Shashank Rajput, Kangwook Lee, Dimitris Papailiopoulos

However, for general strongly convex functions, random permutations are optimal.

SLM: Learning a Discourse Language Representation with Sentence Unshuffling

no code implementations EMNLP 2020 Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning

We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner.

Language Modelling Sentence

Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification

3 code implementations 29 Oct 2020 Saurabh Agarwal, Hongyi Wang, Kangwook Lee, Shivaram Venkataraman, Dimitris Papailiopoulos

The techniques usually require choosing a static compression ratio, often requiring users to balance the trade-off between model accuracy and per-iteration speedup.


Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information

no code implementations 16 Mar 2020 Changhun Jo, Kangwook Lee

Ahn et al. (2018) first characterized the optimal sample complexity in the presence of graph side information, but the results are limited due to strict, unrealistic assumptions made on the unknown latent preference matrix and the structure of user clusters.

Recommendation Systems

FR-Train: A Mutual Information-Based Approach to Fair and Robust Training

1 code implementation ICML 2020 Yuji Roh, Kangwook Lee, Steven Euijong Whang, Changho Suh

Trustworthy AI is a critical issue in machine learning where, in addition to training a model that is accurate, one must consider both fair and robust training in the presence of data bias and poisoning.

Data Poisoning Fairness

FR-GAN: Fair and Robust Training

no code implementations 25 Sep 2019 Yuji Roh, Kangwook Lee, Gyeong Jo Hwang, Steven Euijong Whang, Changho Suh

We consider the problem of fair and robust model training in the presence of data poisoning.

Attribute Data Poisoning +1

Binary Rating Estimation with Graph Side Information

no code implementations NeurIPS 2018 Kwangjun Ahn, Kangwook Lee, Hyunseung Cha, Changho Suh

Considering a simple correlation model between a rating matrix and a graph, we characterize the sharp threshold on the number of observed entries required to recover the rating matrix (called the optimal sample complexity) as a function of the quality of graph side information (to be detailed).

Hypergraph Spectral Clustering in the Weighted Stochastic Block Model

no code implementations 23 May 2018 Kwangjun Ahn, Kangwook Lee, Changho Suh

Our main contribution lies in performance analysis of the poly-time algorithms under a random hypergraph model, which we name the weighted stochastic block model, in which objects and multi-way measures are modeled as nodes and weights of hyperedges, respectively.

Clustering Stochastic Block Model

Simulated+Unsupervised Learning With Adaptive Data Generation and Bidirectional Mappings

no code implementations ICLR 2018 Kangwook Lee, Hoon Kim, Changho Suh

Recently, Shrivastava et al. (2017) proposed Simulated+Unsupervised (S+U) learning: it first learns a mapping from synthetic data to real data, translates a large amount of labeled synthetic data into data that resemble real data, and then trains a learning model on the translated data.

Gaze Estimation

Community Recovery in Hypergraphs

no code implementations 12 Sep 2017 Kwangjun Ahn, Kangwook Lee, Changho Suh

The objective of the problem is to cluster data points into distinct communities based on a set of measurements, each of which is associated with the values of a certain number of data points.

Clustering Face Clustering +1

Speeding Up Distributed Machine Learning Using Codes

no code implementations 8 Dec 2015 Kangwook Lee, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, Kannan Ramchandran

We focus on two of the most basic building blocks of distributed learning algorithms: matrix multiplication and data shuffling.

BIG-bench Machine Learning
