Search Results for author: Jaehong Yoon

Found 41 papers, 24 papers with code

Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization

no code implementations11 Apr 2025 Jialu Li, Shoubin Yu, Han Lin, Jaemin Cho, Jaehong Yoon, Mohit Bansal

Video-MSG consists of three steps: in the first two, it creates a Video Sketch, a fine-grained spatio-temporal plan for the final video that specifies background, foreground, and object trajectories in the form of draft video frames.
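
As a rough illustration of the structured noise initialization idea (starting denoising from partially noised draft-frame latents rather than pure Gaussian noise), here is a minimal NumPy sketch; the `draft_latents` tensor, the single `alpha_bar_t` coefficient, and the shapes are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def structured_noise_init(draft_latents: np.ndarray,
                          alpha_bar_t: float,
                          rng: np.random.Generator) -> np.ndarray:
    """Partially noise draft-frame latents so a T2V sampler can start
    denoising from them instead of from pure Gaussian noise.

    draft_latents: (frames, channels, h, w) latent encoding of the Video Sketch.
    alpha_bar_t:   cumulative noise-schedule coefficient at the chosen start step.
    """
    eps = rng.standard_normal(draft_latents.shape)
    # Standard DDPM forward-noising formula: the smaller alpha_bar_t is,
    # the weaker the influence of the draft frames on the final video.
    return np.sqrt(alpha_bar_t) * draft_latents + np.sqrt(1.0 - alpha_bar_t) * eps

# Toy usage: 8 draft frames in a 4x32x32 latent space, starting at 70% signal.
rng = np.random.default_rng(0)
draft = rng.standard_normal((8, 4, 32, 32))
x_t = structured_noise_init(draft, alpha_bar_t=0.7, rng=rng)
print(x_t.shape)  # (8, 4, 32, 32)
```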

Denoising Object +3

RSQ: Learning from Important Tokens Leads to Better Quantized LLMs

1 code implementation3 Mar 2025 Yi-Lin Sung, Prateek Yadav, Jialu Li, Jaehong Yoon, Mohit Bansal

Building on this finding, we propose RSQ (Rotate, Scale, then Quantize), which (1) applies rotations (orthogonal transformations) to the model to mitigate outliers (values with exceptionally large magnitude), (2) scales each token's features based on its importance, and (3) quantizes the model using the GPTQ framework with second-order statistics computed from the scaled tokens.
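
A minimal NumPy sketch of the rotate-scale-quantize flow on a single linear layer, assuming a random orthogonal rotation, per-token importance weights, and a simple round-to-nearest quantizer in place of the full GPTQ solver; all names and shapes here are illustrative.

```python
import numpy as np

def rsq_sketch(W, X, token_importance, n_bits=4, seed=0):
    """Toy Rotate-Scale-Quantize pipeline on one linear layer.

    W: (d_out, d_in) weights, X: (n_tokens, d_in) calibration activations,
    token_importance: (n_tokens,) per-token importance weights.
    """
    rng = np.random.default_rng(seed)
    # (1) Rotate: a random orthogonal transform spreads outlier channels.
    Q, _ = np.linalg.qr(rng.standard_normal((W.shape[1], W.shape[1])))
    W_rot, X_rot = W @ Q, X @ Q
    # (2) Scale: weight calibration tokens by importance before computing
    #     the second-order statistic H = X^T diag(s) X used for quantization.
    s = token_importance / token_importance.sum()
    H = (X_rot * s[:, None]).T @ X_rot  # would feed a GPTQ-style solver
    # (3) Quantize: symmetric per-row round-to-nearest as a stand-in for GPTQ.
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(W_rot).max(axis=1, keepdims=True) / qmax
    W_q = np.clip(np.round(W_rot / scale), -qmax - 1, qmax) * scale
    return W_q, Q, H

rng = np.random.default_rng(1)
W_q, Q, H = rsq_sketch(rng.standard_normal((8, 16)),
                       rng.standard_normal((64, 16)),
                       rng.random(64))
print(W_q.shape, H.shape)  # (8, 16) (16, 16)
```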

Quantization

DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation

no code implementations25 Nov 2024 Zun Wang, Jialu Li, Han Lin, Jaehong Yoon, Mohit Bansal

To address these challenges, we propose DreamRunner, a novel story-to-video generation method: first, we structure the input script using a large language model (LLM) to facilitate both coarse-grained scene planning and fine-grained object-level layout and motion planning.

Large Language Model Motion Planning +4

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

no code implementations22 Nov 2024 Daeun Lee, Jaehong Yoon, Jaemin Cho, Mohit Bansal

In (2) refinement planning, we identify accurately generated objects and then create localized prompts to refine other areas in the video.

Text-to-Video Generation Video Alignment +1

SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation

1 code implementation16 Oct 2024 Jaehong Yoon, Shoubin Yu, Vaidehi Patil, Huaxiu Yao, Mohit Bansal

To address these, we propose SAFREE, a novel, training-free approach for safe T2I and T2V, that does not alter the model's weights.

Denoising Video Generation

Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection

1 code implementation14 Oct 2024 Adyasha Maharana, Jaehong Yoon, Tianlong Chen, Mohit Bansal

This data selector samples a subset of the most important samples from each skill cluster for training.
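
A minimal sketch of the cluster-wise selection step, assuming per-sample importance scores and skill-cluster assignments are already available; the scoring and clustering themselves are not shown.

```python
import numpy as np

def select_per_cluster(importance, cluster_ids, budget_per_cluster):
    """Pick the highest-importance sample indices within each skill cluster."""
    selected = []
    for c in np.unique(cluster_ids):
        idx = np.where(cluster_ids == c)[0]
        top = idx[np.argsort(-importance[idx])[:budget_per_cluster]]
        selected.extend(top.tolist())
    return sorted(selected)

rng = np.random.default_rng(0)
scores = rng.random(20)                 # hypothetical per-sample importance scores
clusters = rng.integers(0, 4, size=20)  # 4 hypothetical skill clusters
print(select_per_cluster(scores, clusters, budget_per_cluster=3))
```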

Glider: Global and Local Instruction-Driven Expert Router

1 code implementation9 Oct 2024 Pingzhi Li, Prateek Yadav, Jaehong Yoon, Jie Peng, Yi-Lin Sung, Mohit Bansal, Tianlong Chen

Our experiments using T5-based models for T0 and FLAN tasks demonstrate that GLIDER achieves substantially improved held-in performance while maintaining strong generalization on held-out tasks.

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos

1 code implementation29 May 2024 Ziyang Wang, Shoubin Yu, Elias Stengel-Eskin, Jaehong Yoon, Feng Cheng, Gedas Bertasius, Mohit Bansal

Specifically, we incorporate multigranularity information into a tree-based representation, allowing VideoTree to extract query-relevant details from long videos in a coarse-to-fine manner.
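
A simplified two-level sketch of coarse-to-fine, query-relevant frame selection, assuming precomputed frame features and a query embedding; the actual VideoTree builds an adaptive multi-level tree with clustering and captioning, which this toy version does not reproduce.

```python
import numpy as np

def coarse_to_fine_select(frame_feats, query_feat, group_size=8,
                          expand_top_k=2, per_group_k=2):
    """Coarse level: contiguous frame groups summarized by their mean feature.
    Fine level: only the most query-relevant groups are expanded and their
    best-matching individual frames are returned."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b) + 1e-8)

    groups = [np.arange(i, min(i + group_size, len(frame_feats)))
              for i in range(0, len(frame_feats), group_size)]
    group_rel = np.array([cos(frame_feats[g].mean(0), query_feat) for g in groups])
    picked = []
    for gi in np.argsort(-group_rel)[:expand_top_k]:   # expand relevant branches only
        g = groups[gi]
        frame_rel = cos(frame_feats[g], query_feat)
        picked.extend(g[np.argsort(-frame_rel)[:per_group_k]].tolist())
    return sorted(picked)

rng = np.random.default_rng(0)
feats = rng.standard_normal((64, 32))   # 64 frames, 32-dim features (toy)
query = rng.standard_normal(32)
print(coarse_to_fine_select(feats, query))
```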

EgoSchema MME +3

RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives

2 code implementations28 May 2024 Jaehong Yoon, Shoubin Yu, Mohit Bansal

(3) RACCooN can also plan how to imagine new objects into a given video, so users simply prompt the model to receive a detailed editing plan for complex video edits.

Attribute Video Editing

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

no code implementations18 Mar 2024 Abhay Zala, Jaemin Cho, Han Lin, Jaehong Yoon, Mohit Bansal

Then, we enable the LLM to continuously adapt the generated environments to progressively improve the skills that the agent is weak at, by providing feedback to the LLM in the form of the agent's performance.
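
A hedged skeleton of the adaptive feedback loop, where `llm_propose_envs` and `train_and_evaluate` are hypothetical stand-ins for the LLM call and the RL training/evaluation stage; the skill names and update rule are purely illustrative.

```python
def llm_propose_envs(skill_success_rates):
    """Hypothetical stand-in for an LLM call: request environment configs
    that emphasize the skills the agent is currently weakest at."""
    weakest = sorted(skill_success_rates, key=skill_success_rates.get)[:2]
    return [{"focus_skill": s, "difficulty": "easy"} for s in weakest]

def train_and_evaluate(agent, env_configs):
    """Placeholder for RL training in the generated envs followed by
    evaluation in the original environment (toy deterministic update)."""
    focus = {c["focus_skill"] for c in env_configs}
    return {s: min(1.0, r + (0.15 if s in focus else 0.02))
            for s, r in agent["skills"].items()}

agent = {"skills": {"mine_wood": 0.8, "craft_tool": 0.3, "fight_mob": 0.1}}
for cycle in range(3):                          # adaptive feedback loop
    envs = llm_propose_envs(agent["skills"])    # LLM adapts envs to weak skills
    agent["skills"] = train_and_evaluate(agent, envs)
    print(cycle, {k: round(v, 2) for k, v in agent["skills"].items()})
```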

Reinforcement Learning (RL) World Knowledge

SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data

no code implementations11 Mar 2024 Jialu Li, Jaemin Cho, Yi-Lin Sung, Jaehong Yoon, Mohit Bansal

In this paper, we introduce SELMA: Skill-Specific Expert Learning and Merging with Auto-Generated Data, a novel paradigm to improve the faithfulness of T2I models by fine-tuning models on automatically generated, multi-skill image-text datasets, with skill-specific expert learning and merging.
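
A minimal sketch of merging skill-specific experts by weighted parameter averaging, one common merging scheme; the `lora_A`/`lora_B` parameter names and uniform weights are assumptions, not necessarily the exact merging strategy used by SELMA.

```python
import numpy as np

def merge_experts(expert_state_dicts, weights=None):
    """Merge skill-specific expert parameters by (weighted) averaging.

    expert_state_dicts: list of {param_name: np.ndarray} with identical keys.
    """
    if weights is None:
        weights = [1.0 / len(expert_state_dicts)] * len(expert_state_dicts)
    merged = {}
    for name in expert_state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, expert_state_dicts))
    return merged

rng = np.random.default_rng(0)
experts = [{"lora_A": rng.standard_normal((4, 16)),
            "lora_B": rng.standard_normal((16, 4))} for _ in range(3)]
merged = merge_experts(experts)
print(merged["lora_A"].shape, merged["lora_B"].shape)  # (4, 16) (16, 4)
```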

In-Context Learning

BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation

no code implementations13 Feb 2024 Daeun Lee, Jaehong Yoon, Sung Ju Hwang

We validate that our method outperforms existing approaches across multiple CTTA scenarios, including disjoint and gradual domain shifts, while requiring ~98% fewer trainable parameters.

Test-time Adaptation

CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion

1 code implementation8 Feb 2024 Shoubin Yu, Jaehong Yoon, Mohit Bansal

Despite impressive advancements in recent multimodal reasoning approaches, they are still limited in flexibility and efficiency, as these models typically process only a few fixed modality inputs and require updates to numerous parameters.

Computational Efficiency Multimodal Reasoning +4

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

1 code implementation19 Jan 2024 Xiyao Wang, YuHang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang

However, current MLLM benchmarks are predominantly designed to evaluate reasoning based on static information about a single image, and the ability of modern MLLMs to extrapolate from image sequences, which is essential for understanding our ever-changing world, has been less investigated.

Language Modeling Language Modelling +2

Continual Learning: Forget-free Winning Subnetworks for Video Representations

2 code implementations19 Dec 2023 Haeyong Kang, Jaehong Yoon, Sung Ju Hwang, Chang D. Yoo

Inspired by the Lottery Ticket Hypothesis (LTH), which highlights the existence of efficient subnetworks within larger dense networks, we consider a high-performing Winning Subnetwork (WSN), selected under appropriate sparsity conditions, for various continual learning tasks.
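
A toy sketch of selecting a winning subnetwork per task via a top-scoring binary mask while freezing weights claimed by earlier tasks; the random scores, layer shape, and sparsity level are illustrative assumptions.

```python
import numpy as np

def winning_subnetwork_mask(scores, sparsity):
    """Binary mask keeping the top-(1 - sparsity) fraction of weights by score."""
    k = int(round((1.0 - sparsity) * scores.size))
    thresh = np.sort(scores.ravel())[::-1][k - 1]
    return (scores >= thresh).astype(np.float32)

rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 64))
scores = rng.random((64, 64))            # importance scores (toy: random, not learned)
used_so_far = np.zeros_like(weights)     # union of masks from earlier tasks

for task in range(3):
    mask = winning_subnetwork_mask(scores, sparsity=0.9)
    task_weights = weights * mask        # forward pass for this task uses masked weights
    # Forget-free rule: weights already claimed by earlier tasks stay frozen, so
    # gradients apply only where (mask == 1) and (used_so_far == 0).
    trainable = mask * (1.0 - used_so_far)
    used_so_far = np.maximum(used_so_far, mask)
    print(f"task {task}: selected={int(mask.sum())}, newly trainable={int(trainable.sum())}")
    scores = rng.random((64, 64))        # new scores would be learned for the next task
```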

class-incremental learning Few-Shot Class-Incremental Learning +1

Multimodal Representation Learning by Alternating Unimodal Adaptation

no code implementations CVPR 2024 Xiaohui Zhang, Jaehong Yoon, Mohit Bansal, Huaxiu Yao

This optimization process is controlled by a gradient modification mechanism to prevent the shared head from losing previously acquired information.
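
One common form of such gradient modification is to project out the component of a new gradient that conflicts with a stored reference direction; the sketch below illustrates that generic idea and is not necessarily the exact mechanism used in the paper.

```python
import numpy as np

def modify_gradient(grad, ref_grad):
    """If the new update conflicts with the direction that served the previous
    modality (negative dot product), remove the conflicting component."""
    denom = ref_grad @ ref_grad
    if denom == 0.0:
        return grad
    coef = grad @ ref_grad / denom
    return grad - min(coef, 0.0) * ref_grad   # only the conflicting part is removed

g_new = np.array([1.0, -2.0])
g_prev = np.array([0.0, 1.0])
print(modify_gradient(g_new, g_prev))  # [1., 0.] : conflict along g_prev removed
```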

Representation Learning

Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models

1 code implementation14 Nov 2023 Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, Se-Young Yun

The dynamic nature of knowledge in an ever-changing world presents challenges for language models trained on static data; models deployed in the real world often need not only to acquire new knowledge but also to overwrite outdated information with updated facts.

Continual Learning Question Answering +1

STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment

no code implementations12 Oct 2023 Jaewoo Lee, Jaehong Yoon, Wonjae Kim, Yunji Kim, Sung Ju Hwang

Continuously learning a variety of audio-video semantics over time is crucial for audio-related reasoning tasks in our ever-evolving world.

Continual Learning Representation Learning +1

ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models

no code implementations4 Oct 2023 Yi-Lin Sung, Jaehong Yoon, Mohit Bansal

We first determine the sparsity ratios of different layers or blocks by leveraging the global importance score, which is efficiently computed based on the zeroth-order approximation of the global model gradients.
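
A minimal sketch of estimating layer importance with a zeroth-order (forward-pass-only) gradient approximation and turning it into layer-wise sparsity ratios; the quadratic toy loss and the allocation rule at the end are illustrative assumptions.

```python
import numpy as np

def zeroth_order_grad(loss_fn, w, eps=1e-3, n_samples=8, seed=0):
    """SPSA-style estimate of dL/dw using only forward evaluations of the loss."""
    rng = np.random.default_rng(seed)
    g = np.zeros_like(w)
    for _ in range(n_samples):
        u = rng.standard_normal(w.shape)
        g += (loss_fn(w + eps * u) - loss_fn(w - eps * u)) / (2 * eps) * u
    return g / n_samples

# Toy "model": two layers whose loss is quadratic in their weights.
rng = np.random.default_rng(1)
layers = {"layer1": rng.standard_normal(50), "layer2": 0.1 * rng.standard_normal(50)}

def loss_fn_for(name):
    others = sum((layers[k] ** 2).sum() for k in layers if k != name)
    return lambda w: (w ** 2).sum() + others

# Global importance per layer: sum of |w * estimated gradient|.
importance = {name: np.abs(w * zeroth_order_grad(loss_fn_for(name), w)).sum()
              for name, w in layers.items()}
total = sum(importance.values())
# Illustrative allocation: less important layers receive higher sparsity,
# while the average sparsity matches the global target.
target_global_sparsity = 0.5
sparsity = {name: target_global_sparsity * (1 - imp / total) * 2
            for name, imp in importance.items()}
print({k: round(v, 2) for k, v in sparsity.items()})
```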

Model Compression

Continual Learners are Incremental Model Generalizers

no code implementations21 Jun 2023 Jaehong Yoon, Sung Ju Hwang, Yue Cao

We believe this paper breaks the barriers between pre-training and fine-tuning steps and leads to a sustainable learning framework in which the continual learner incrementally improves model generalization, yielding better transfer to unseen tasks.

Continual Learning model

Progressive Fourier Neural Representation for Sequential Video Compilation

2 code implementations20 Jun 2023 Haeyong Kang, Jaehong Yoon, Dahyun Kim, Sung Ju Hwang, Chang D Yoo

Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.

Continual Learning

Forget-free Continual Learning with Soft-Winning SubNetworks

1 code implementation27 Mar 2023 Haeyong Kang, Jaehong Yoon, Sultan Rizky Madjid, Sung Ju Hwang, Chang D. Yoo

Inspired by Regularized Lottery Ticket Hypothesis (RLTH), which states that competitive smooth (non-binary) subnetworks exist within a dense network in continual learning tasks, we investigate two proposed architecture-based continual learning methods which sequentially learn and select adaptive binary- (WSN) and non-binary Soft-Subnetworks (SoftNet) for each task.

class-incremental learning Few-Shot Class-Incremental Learning +1

On the Soft-Subnetwork for Few-shot Class Incremental Learning

2 code implementations15 Sep 2022 Haeyong Kang, Jaehong Yoon, Sultan Rizky Hikmawan Madjid, Sung Ju Hwang, Chang D. Yoo

Inspired by Regularized Lottery Ticket Hypothesis (RLTH), which hypothesizes that there exist smooth (non-binary) subnetworks within a dense network that achieve the competitive performance of the dense network, we propose a few-shot class incremental learning (FSCIL) method referred to as \emph{Soft-SubNetworks (SoftNet)}.
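
A rough sketch of a soft (non-binary) subnetwork mask, assuming top-scoring "major" weights receive a mask of 1 while the remaining "minor" weights receive smooth values in (0, 1) instead of being zeroed; the score matrix and fractions are illustrative.

```python
import numpy as np

def soft_subnetwork_mask(scores, major_fraction=0.3, seed=0):
    """Soft mask: top-scoring 'major' weights get 1, the rest get smooth
    values in (0, 1) rather than being pruned to zero."""
    rng = np.random.default_rng(seed)
    k = int(round(major_fraction * scores.size))
    thresh = np.sort(scores.ravel())[::-1][k - 1]
    major = (scores >= thresh).astype(np.float32)
    minor = rng.uniform(0.0, 1.0, size=scores.shape).astype(np.float32)
    return major + (1.0 - major) * minor

rng = np.random.default_rng(1)
mask = soft_subnetwork_mask(rng.random((32, 32)))
print(float(mask.min()), float(mask.max()), float((mask == 1.0).mean()))
```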

class-incremental learning Few-Shot Class-Incremental Learning +1

BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation

no code implementations4 Jul 2022 Geon Park, Jaehong Yoon, Haiyang Zhang, Xing Zhang, Sung Ju Hwang, Yonina C. Eldar

Neural network quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation, while preserving the performance of the original model.
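
A minimal sketch of generic symmetric uniform quantization, the basic operation the sentence above describes; BiTAT's task-dependent aggregated transformation is not reproduced here.

```python
import numpy as np

def uniform_quantize(x, n_bits=2):
    """Symmetric uniform quantization: map float values onto a small integer
    grid and back, as in standard low-precision weight quantization."""
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = np.abs(x).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    x_int = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return x_int * scale, x_int.astype(np.int8)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
w_dq, w_int = uniform_quantize(w, n_bits=2)
print(np.unique(w_int))                 # at most 4 distinct 2-bit levels
print(float(np.abs(w - w_dq).mean()))   # reconstruction error
```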

Binarization Quantization

Forget-free Continual Learning with Winning Subnetworks

1 code implementation International Conference on Machine Learning 2022 Haeyong Kang, Rusty John Lloyd Mina, Sultan Rizky Hikmawan Madjid, Jaehong Yoon, Mark Hasegawa-Johnson, Sung Ju Hwang, Chang D. Yoo

Inspired by the Lottery Ticket Hypothesis, which posits that competitive subnetworks exist within a dense network, we propose a continual learning method referred to as Winning SubNetworks (WSN), which sequentially learns and selects an optimal subnetwork for each task.

Continual Learning

Personalized Subgraph Federated Learning

1 code implementation21 Jun 2022 Jinheon Baek, Wonyong Jeong, Jiongdao Jin, Jaehong Yoon, Sung Ju Hwang

To this end, we introduce a new subgraph FL problem, personalized subgraph FL, which focuses on the joint improvement of the interrelated local GNNs rather than learning a single global model, and propose a novel framework, FEDerated Personalized sUBgraph learning (FED-PUB), to tackle it.

Federated Learning

Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization

no code implementations23 Feb 2022 Jaehong Yoon, Geon Park, Wonyong Jeong, Sung Ju Hwang

We introduce a pragmatic FL scenario with bitwidth heterogeneity across the participating devices, dubbed Bitwidth Heterogeneous Federated Learning (BHFL).

Federated Learning

Representational Continuity for Unsupervised Continual Learning

1 code implementation ICLR 2022 Divyam Madaan, Jaehong Yoon, Yuanchun Li, Yunxin Liu, Sung Ju Hwang

Continual learning (CL) aims to learn a sequence of tasks without forgetting the previously acquired knowledge.

Continual Learning

Online Coreset Selection for Rehearsal-based Continual Learning

no code implementations ICLR 2022 Jaehong Yoon, Divyam Madaan, Eunho Yang, Sung Ju Hwang

We validate the effectiveness of our coreset selection mechanism over various standard, imbalanced, and noisy datasets against strong continual learning baselines, demonstrating that it improves task adaptation and prevents catastrophic forgetting in a sample-efficient manner.

Continual Learning

Rapid Neural Pruning for Novel Datasets with Set-based Task-Adaptive Meta-Pruning

no code implementations1 Jan 2021 Minyoung Song, Jaehong Yoon, Eunho Yang, Sung Ju Hwang

As deep neural networks are growing in size and being increasingly deployed to more resource-limited devices, there has been a recent surge of interest in network pruning methods, which aim to remove less important weights or activations of a given network.

Cloud Computing Network Pruning

Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning

1 code implementation ICLR 2021 Wonyong Jeong, Jaehong Yoon, Eunho Yang, Sung Ju Hwang

Through extensive experimental validation of our method in the two different scenarios, we show that our method outperforms both local semi-supervised learning and baselines which naively combine federated learning with semi-supervised learning.

Federated Learning

Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning

no code implementations22 Jun 2020 Minyoung Song, Jaehong Yoon, Eunho Yang, Sung Ju Hwang

As deep neural networks are growing in size and being increasingly deployed to more resource-limited devices, there has been a recent surge of interest in network pruning methods, which aim to remove less important weights or activations of a given network.

Cloud Computing Network Pruning

Federated Continual Learning with Weighted Inter-client Transfer

1 code implementation6 Mar 2020 Jaehong Yoon, Wonyong Jeong, Giwoong Lee, Eunho Yang, Sung Ju Hwang

There has been a surge of interest in continual learning and federated learning, both of which are important for deploying deep neural networks in real-world scenarios.

Continual Learning Federated Learning +1

Scalable and Order-robust Continual Learning with Additive Parameter Decomposition

1 code implementation ICLR 2020 Jaehong Yoon, Saehoon Kim, Eunho Yang, Sung Ju Hwang

First, a continual learning model should effectively handle catastrophic forgetting and be efficient to train even with a large number of tasks.

Continual Learning Fairness +1

Adaptive Network Sparsification via Dependent Variational Beta-Bernoulli Dropout

no code implementations27 Sep 2018 Juho Lee, Saehoon Kim, Jaehong Yoon, Hae Beom Lee, Eunho Yang, Sung Ju Hwang

With such input-independent dropout, each neuron evolves to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss.

Adaptive Network Sparsification with Dependent Variational Beta-Bernoulli Dropout

1 code implementation28 May 2018 Juho Lee, Saehoon Kim, Jaehong Yoon, Hae Beom Lee, Eunho Yang, Sung Ju Hwang

With such input-independent dropout, each neuron evolves to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss.
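
For contrast with input-independent dropout, here is a simplified sketch of dropout whose keep-probabilities depend on the input through a tiny gating function; the sigmoid gate and its parameters are illustrative and stand in for the paper's dependent beta-Bernoulli formulation.

```python
import numpy as np

def input_dependent_dropout(h, gate_W, gate_b, rng, training=True):
    """Dropout whose keep-probability depends on the input activations.

    h: (batch, d) activations; gate_W, gate_b: parameters of a tiny gating
    function producing per-neuron keep probabilities for each input.
    """
    keep_prob = 1.0 / (1.0 + np.exp(-(h @ gate_W + gate_b)))  # sigmoid gate
    if not training:
        return h * keep_prob                      # use the expected mask at test time
    mask = rng.binomial(1, keep_prob).astype(h.dtype)
    return h * mask

rng = np.random.default_rng(0)
h = rng.standard_normal((5, 16))
gate_W = 0.1 * rng.standard_normal((16, 16))
gate_b = np.zeros(16)
out = input_dependent_dropout(h, gate_W, gate_b, rng)
print(out.shape, float((out == 0).mean()))  # some neurons dropped, per input
```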

Lifelong Learning with Dynamically Expandable Networks

3 code implementations ICLR 2018 Jaehong Yoon, Eunho Yang, Jeongtae Lee, Sung Ju Hwang

We propose a novel deep network architecture for lifelong learning which we refer to as Dynamically Expandable Network (DEN), that can dynamically decide its network capacity as it trains on a sequence of tasks, to learn a compact overlapping knowledge sharing structure among tasks.
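
A toy sketch of the capacity-expansion idea: when a new task cannot be fit well with the current units, the layer grows by appending freshly initialized ones; the loss threshold and expansion size are illustrative, and DEN's selective retraining and unit splitting/duplication are omitted.

```python
import numpy as np

def maybe_expand_layer(W, task_loss, loss_threshold=0.5, n_new_units=4, seed=0):
    """If the current capacity cannot fit the new task well (loss above a
    threshold), grow the layer by appending freshly initialized units."""
    if task_loss <= loss_threshold:
        return W
    rng = np.random.default_rng(seed)
    new_units = 0.01 * rng.standard_normal((n_new_units, W.shape[1]))
    return np.vstack([W, new_units])   # existing rows keep their learned values

W = np.random.default_rng(1).standard_normal((16, 32))   # 16 units, 32 inputs
for t, loss in enumerate([0.3, 0.8, 0.4]):                # toy losses on tasks 1-3
    W = maybe_expand_layer(W, loss, seed=t)
    print(f"task {t}: layer has {W.shape[0]} units")
```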

Lifelong learning

Combined Group and Exclusive Sparsity for Deep Neural Networks

1 code implementation ICML 2017 Jaehong Yoon, Sung Ju Hwang

The number of parameters in a deep neural network is usually very large, which helps with its learning capacity but also hinders its scalability and practicality due to memory/time inefficiency and overfitting.
