2 code implementations • 25 May 2023 • Ying Fan, Olivia Watkins, Yuqing Du, Hao liu, MoonKyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee
We focus on diffusion models, defining the fine-tuning task as an RL problem, and updating the pre-trained text-to-image diffusion models using policy gradient to maximize the feedback-trained reward.
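The policy-gradient idea can be illustrated with a minimal sketch far simpler than the paper's diffusion setting: a toy REINFORCE update on a one-dimensional Gaussian sampler with a synthetic reward. All names, the reward, and the hyperparameters here are illustrative assumptions, not the authors' method.

```python
import numpy as np

def reinforce_step(mu, sigma, reward_fn, lr, rng, n=64):
    """One REINFORCE update for a Gaussian 'policy' N(mu, sigma^2)."""
    x = rng.normal(mu, sigma, size=n)   # sample "generations"
    r = reward_fn(x)                    # external reward feedback
    baseline = r.mean()                 # simple variance-reduction baseline
    grad = np.mean((r - baseline) * (x - mu) / sigma**2)
    return mu + lr * grad               # ascend the expected reward

rng = np.random.default_rng(0)
target = 3.0
reward = lambda x: -(x - target) ** 2   # reward peaks at the target
mu = 0.0
for _ in range(500):
    mu = reinforce_step(mu, 1.0, reward, 0.05, rng)
```

After training, the sampler's mean drifts toward the reward-maximizing point, which is the same mechanism (at toy scale) as steering a generative model toward a feedback-trained reward.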
1 code implementation • 6 Feb 2024 • Jongho Park, Jaeseung Park, Zheyang Xiong, Nayoung Lee, Jaewoong Cho, Samet Oymak, Kangwook Lee, Dimitris Papailiopoulos
State-space models (SSMs), such as Mamba Gu & Dao (2023), have been proposed as alternatives to Transformer networks in language modeling, by incorporating gating, convolutions, and input-dependent token selection to mitigate the quadratic cost of multi-head attention.
3 code implementations • 29 Oct 2020 • Saurabh Agarwal, Hongyi Wang, Kangwook Lee, Shivaram Venkataraman, Dimitris Papailiopoulos
These techniques usually require choosing a static compression ratio in advance, forcing users to balance the trade-off between model accuracy and per-iteration speedup.
1 code implementation • 14 Jun 2022 • Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee
LIFT does not make any changes to the model architecture or loss function, and it solely relies on the natural language interface, enabling "no-code machine learning with LMs."
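As a hedged illustration of the natural-language-interface idea, a tabular example can be serialized into a text prompt for an LM. The helper name `row_to_prompt` and the template are hypothetical; the exact prompt format used by LIFT may differ.

```python
def row_to_prompt(features: dict, target_name: str) -> str:
    """Serialize a tabular example as a natural-language question."""
    parts = [f"{k} is {v}" for k, v in features.items()]
    return f"Given that {', '.join(parts)}, what is the {target_name}?"

prompt = row_to_prompt({"sepal length": 5.1, "sepal width": 3.5}, "species")
```

The LM is then fine-tuned on such (prompt, answer) text pairs, leaving its architecture and loss untouched.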
1 code implementation • 7 Jul 2023 • Nayoung Lee, Kartik Sreenivasan, Jason D. Lee, Kangwook Lee, Dimitris Papailiopoulos
Even in the complete absence of pretraining, this approach significantly and simultaneously improves accuracy, sample complexity, and convergence speed.
1 code implementation • 27 Oct 2023 • Sehyun Kwon, Jaeseung Park, Minkyu Kim, Jaewoong Cho, Ernest K. Ryu, Kangwook Lee
Classical clustering methods do not provide users with direct control of the clustering results, and the clustering results may not be consistent with the relevant criterion that a user has in mind.
2 code implementations • NeurIPS 2020 • Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, Dimitris Papailiopoulos
Due to its decentralized nature, Federated Learning (FL) lends itself to adversarial attacks in the form of backdoors during training.
1 code implementation • NeurIPS 2021 • Jinwoo Jeon, Jaechang Kim, Kangwook Lee, Sewoong Oh, Jungseul Ok
Federated Learning (FL) is a distributed learning framework in which local data never leaves clients' devices, preserving privacy, and the server trains models by accessing only the gradients computed on that local data.
1 code implementation • 8 May 2023 • Gibbeum Lee, Volker Hartmann, Jongho Park, Dimitris Papailiopoulos, Kangwook Lee
In this paper, we propose MPC (Modular Prompted Chatbot), a new approach for creating high-quality conversational agents without the need for fine-tuning.
1 code implementation • 31 Jan 2023 • Ying Fan, Kangwook Lee
In this study, we propose Shortcut Fine-Tuning (SFT), a new approach for addressing the challenge of fast sampling of pretrained Denoising Diffusion Probabilistic Models (DDPMs).
1 code implementation • ICLR 2021 • Yuji Roh, Kangwook Lee, Steven Euijong Whang, Changho Suh
We address this problem via the lens of bilevel optimization.
1 code implementation • 13 Dec 2022 • Dohyun Kwon, Ying Fan, Kangwook Lee
Specifically, we prove that the Wasserstein distance is upper bounded by the square root of the objective function up to multiplicative constants and a fixed constant offset.
1 code implementation • ICML 2020 • Yuji Roh, Kangwook Lee, Steven Euijong Whang, Changho Suh
Trustworthy AI is a critical issue in machine learning where, in addition to training a model that is accurate, one must consider both fair and robust training in the presence of data bias and poisoning.
1 code implementation • 24 Feb 2022 • Kartik Sreenivasan, Jy-yong Sohn, Liu Yang, Matthew Grinde, Alliot Nagle, Hongyi Wang, Eric Xing, Kangwook Lee, Dimitris Papailiopoulos
Frankle & Carbin conjecture that we can avoid this by training "lottery tickets", i.e., special sparse subnetworks found at initialization that can be trained to high accuracy.
1 code implementation • 26 Oct 2023 • Yuchen Zeng, Kangwook Lee
Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method that leverages low-rank adaptation of weight matrices, has emerged as a prevalent technique for fine-tuning pre-trained models such as large language models and diffusion models.
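A minimal NumPy sketch of the low-rank update at the heart of LoRA (illustrative shapes and scaling; not the authors' implementation): the frozen weight W is augmented by a trainable product B @ A of rank r.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Forward pass with a frozen weight W plus a low-rank update B @ A.

    A has shape (r, d_in), B has shape (d_out, r); only A and B are trained.
    """
    r = A.shape[0]
    return x @ (W + (alpha / r) * (B @ A)).T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 4, 2
W = rng.normal(size=(d_out, d_in))   # frozen pre-trained weight
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))             # B starts at zero, so the update is a no-op
x = rng.normal(size=(3, d_in))
```

Initializing B to zero means fine-tuning starts exactly at the pre-trained model, a standard LoRA design choice.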
2 code implementations • 29 Oct 2021 • Yuchen Zeng, Hongxu Chen, Kangwook Lee
We then theoretically and empirically show that the performance tradeoff of FedAvg-based fair learning algorithms is strictly worse than that of a fair classifier trained on centralized data.
1 code implementation • 2 Feb 2024 • Yuchen Zeng, Wonjun Kang, Yicong Chen, Hyung Il Koo, Kangwook Lee
The evolution from Large Language Models (LLMs) to Multimodal Large Language Models (MLLMs) has spurred research into extending In-Context Learning (ICL) to its multimodal counterpart.
1 code implementation • 12 Jul 2023 • Jaewoong Cho, Kartik Sreenivasan, Keon Lee, Kyunghoo Mun, Soheun Yi, Jeong-Gwan Lee, Anna Lee, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee
Contrastive learning has gained significant attention as a method for self-supervised learning.
1 code implementation • 13 Oct 2022 • Ozgur Guldogan, Yuchen Zeng, Jy-yong Sohn, Ramtin Pedarsani, Kangwook Lee
In order to promote long-term fairness, we propose a new fairness notion called Equal Improvability (EI), which equalizes the potential acceptance rate of the rejected samples across different groups assuming a bounded level of effort will be spent by each rejected sample.
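A simplified sketch of the EI idea, assuming effort acts directly on a scalar score (a strong simplification of the paper's notion; `improvability` and the sample data are hypothetical):

```python
import numpy as np

def improvability(scores, groups, threshold, effort):
    """Per-group fraction of rejected samples that could pass the
    threshold with at most `effort` of score improvement."""
    rates = {}
    for g in np.unique(groups):
        rejected = scores[(groups == g) & (scores < threshold)]
        if len(rejected) == 0:
            rates[g] = 1.0  # nothing to improve in this group
        else:
            rates[g] = np.mean(rejected + effort >= threshold)
    return rates

scores = np.array([0.2, 0.45, 0.6, 0.1, 0.2, 0.9])
groups = np.array([0, 0, 0, 1, 1, 1])
rates = improvability(scores, groups, threshold=0.5, effort=0.1)
```

Here group 0's rejected samples are more improvable than group 1's; EI asks a classifier to equalize these rates.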
1 code implementation • 21 Nov 2023 • Liu Yang, Kangwook Lee, Robert Nowak, Dimitris Papailiopoulos
Transformers have demonstrated effectiveness in solving data-fitting problems in-context for various (latent) models, as reported by Garg et al.
1 code implementation • 7 Jan 2022 • Tuan Dinh, Daewon Seo, Zhixu Du, Liang Shang, Kangwook Lee
Motivated by real-world scenarios with scarce labeled data, we focus on the input reprogramming approach and carefully analyze the existing algorithm.
1 code implementation • LTEDI (ACL) 2022 • Michael Gira, Ruisu Zhang, Kangwook Lee
An explosion in the popularity of transformer-based language models (such as GPT-3, BERT, RoBERTa, and ALBERT) has opened the doors to new machine learning applications involving language modeling, text generation, and more.
1 code implementation • 30 Jan 2023 • Angeliki Giannou, Shashank Rajput, Jy-yong Sohn, Kangwook Lee, Jason D. Lee, Dimitris Papailiopoulos
We present a framework for using transformer networks as universal computers by programming them with specific weights and placing them in a loop.
1 code implementation • 13 Oct 2022 • Yuchen Zeng, Kristjan Greenewald, Kangwook Lee, Justin Solomon, Mikhail Yurochkin
Traditional machine learning models focus on achieving good performance on the overall training distribution, but they often underperform on minority groups.
no code implementations • 23 May 2018 • Kwangjun Ahn, Kangwook Lee, Changho Suh
Our main contribution lies in performance analysis of the poly-time algorithms under a random hypergraph model, which we name the weighted stochastic block model, in which objects and multi-way measures are modeled as nodes and weights of hyperedges, respectively.
no code implementations • 8 Dec 2015 • Kangwook Lee, Maximilian Lam, Ramtin Pedarsani, Dimitris Papailiopoulos, Kannan Ramchandran
We focus on two of the most basic building blocks of distributed learning algorithms: matrix multiplication and data shuffling.
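The matrix-multiplication building block can be sketched with a toy coded scheme in which a parity block lets the product be recovered from any two of three workers, tolerating one straggler. This is an illustrative (3, 2) construction, not necessarily the paper's exact code.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))
x = rng.normal(size=6)

# Split A row-wise and add a parity block A1 + A2.
A1, A2 = A[:2], A[2:]
workers = [A1 @ x, A2 @ x, (A1 + A2) @ x]   # worker 3 holds the parity

# Suppose the worker holding A2 @ x straggles: recover its share
# from the other two results.
recovered_A2x = workers[2] - workers[0]
result = np.concatenate([workers[0], recovered_A2x])
```

The full product A @ x is reconstructed without waiting for the slow worker, which is the core idea behind coded computation for distributed learning.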
no code implementations • 12 Sep 2017 • Kwangjun Ahn, Kangwook Lee, Changho Suh
The objective of the problem is to cluster data points into distinct communities based on a set of measurements, each of which is associated with the values of a certain number of data points.
no code implementations • NeurIPS 2018 • Kwangjun Ahn, Kangwook Lee, Hyunseung Cha, Changho Suh
Considering a simple correlation model between a rating matrix and a graph, we characterize the sharp threshold on the number of observed entries required to recover the rating matrix (called the optimal sample complexity) as a function of the quality of graph side information (to be detailed).
no code implementations • ICLR 2018 • Kangwook Lee, Hoon Kim, Changho Suh
Recently, Shrivastava et al. (2017) proposed Simulated+Unsupervised (S+U) learning: it first learns a mapping from synthetic data to real data, translates a large amount of labeled synthetic data into data that resembles real data, and then trains a learning model on the translated data.
no code implementations • 16 Mar 2020 • Changhun Jo, Kangwook Lee
Ahn et al. (2018) were the first to characterize the optimal sample complexity in the presence of graph side information, but their results are limited by strict, unrealistic assumptions on the unknown latent preference matrix and the structure of user clusters.
no code implementations • EMNLP 2020 • Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner.
1 code implementation • ICLR 2022 • Shashank Rajput, Kangwook Lee, Dimitris Papailiopoulos
However, for general strongly convex functions, random permutations are optimal.
no code implementations • 11 Jun 2021 • Tuan Dinh, Kangwook Lee
Inspired by a new coded computation algorithm for invertible functions, we propose Coded-InvNet, a new approach to designing resilient prediction serving systems that can gracefully handle stragglers or node failures.
no code implementations • NeurIPS 2021 • Yuji Roh, Kangwook Lee, Steven Euijong Whang, Changho Suh
In this work, we propose a sample selection-based algorithm for fair and robust training.
no code implementations • 25 Sep 2019 • Yuji Roh, Kangwook Lee, Gyeong Jo Hwang, Steven Euijong Whang, Changho Suh
We consider the problem of fair and robust model training in the presence of data poisoning.
no code implementations • 7 Dec 2021 • Youngjune Lee, Oh Joon Kwon, Haeju Lee, Joonyoung Kim, Kangwook Lee, Kee-Eung Kim
For this reason, data-centric approaches are crucial for automating the machine learning operations pipeline.
no code implementations • 7 Jan 2022 • Jy-yong Sohn, Liang Shang, Hongxu Chen, Jaekyun Moon, Dimitris Papailiopoulos, Kangwook Lee
Mixup is a data augmentation method that generates new data points by mixing a pair of input data.
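A minimal mixup sketch for a single pair; the Beta-distributed mixing coefficient follows the standard mixup recipe, and the parameter values are illustrative.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mix a pair of examples: the same convex combination is applied
    to the inputs and to the (one-hot) labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

rng = np.random.default_rng(0)
x_mix, y_mix = mixup(np.array([0.0, 1.0]), np.array([1.0, 0.0]),
                     np.array([1.0, 0.0]), np.array([0.0, 1.0]), rng=rng)
```

Because both one-hot labels sum to one, the mixed label is a valid probability vector regardless of the sampled coefficient.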
no code implementations • 12 Apr 2022 • Changhun Jo, Jy-yong Sohn, Kangwook Lee
Minimizing risk with fairness constraints is one of the popular approaches to learning a fair classifier.
1 code implementation • 23 May 2022 • Tuan Dinh, Jy-yong Sohn, Shashank Rajput, Timothy Ossowski, Yifei Ming, Junjie Hu, Dimitris Papailiopoulos, Kangwook Lee
Word translation without parallel corpora has become feasible, rivaling the performance of supervised methods.
no code implementations • Findings (NAACL) 2022 • Haeju Lee, Oh Joon Kwon, Yunseon Choi, Minho Park, Ran Han, Yoonhyung Kim, Jinhyeon Kim, Youngjune Lee, Haebin Shin, Kangwook Lee, Kee-Eung Kim
The Situated Interactive Multi-Modal Conversations (SIMMC) 2.0 aims to create virtual shopping assistants that can accept complex multi-modal inputs, i.e., visual appearances of objects and user utterances.
Ranked #2 on Response Generation on SIMMC2.0
no code implementations • 6 Oct 2022 • Liu Yang, Jifan Zhang, Joseph Shenouda, Dimitris Papailiopoulos, Kangwook Lee, Robert D. Nowak
Weight decay is one of the most widely used forms of regularization in deep learning, and has been shown to improve generalization and robustness.
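A sketch of weight decay in its decoupled form, where each step shrinks the weights toward zero independently of the loss gradient (hyperparameters are illustrative):

```python
import numpy as np

def sgd_step(w, grad, lr=0.1, weight_decay=0.01):
    """SGD with decoupled weight decay: shrink w toward zero each step."""
    return w - lr * grad - lr * weight_decay * w

# With a zero gradient, weight decay alone shrinks the weights
# geometrically by a factor of (1 - lr * weight_decay) per step.
w = np.ones(3)
for _ in range(10):
    w = sgd_step(w, np.zeros(3))
```

This geometric shrinkage is the regularization effect the entry refers to; with a nonzero gradient, the decay term competes with the loss-driven update.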
no code implementations • 5 Feb 2023 • Yuji Roh, Kangwook Lee, Steven Euijong Whang, Changho Suh
First, we analytically show that existing in-processing fair algorithms have fundamental limits in accuracy and group fairness.
1 code implementation • 25 May 2023 • Joseph Shenouda, Rahul Parhi, Kangwook Lee, Robert D. Nowak
This representer theorem establishes that shallow vector-valued neural networks are the solutions to data-fitting problems over these infinite-dimensional spaces, where the network widths are bounded by the square of the number of training data.
no code implementations • 12 Jul 2023 • Seongjun Yang, Gibbeum Lee, Jaewoong Cho, Dimitris Papailiopoulos, Kangwook Lee
This paper presents "Predictive Pipelined Decoding (PPD)," an approach that speeds up greedy decoding in Large Language Models (LLMs) while maintaining the exact same output as the original decoding.
no code implementations • 15 Jul 2023 • Joonyoung Kim, Kangwook Lee, Haebin Shin, Hurnjoo Lee, Sechun Kang, Byunguk Choi, Dong Shin, Joohyung Lee
As more new features are added to smartphones, it becomes harder for users to find them.
no code implementations • 29 Feb 2024 • Ziqian Lin, Kangwook Lee
We introduce a probabilistic model, with which one can explain the dual operating modes of ICL simultaneously.