no code implementations • 2 May 2025 • Yeonsang Shin, JiHwan Kim, Yumin Song, Kyungseung Lee, Hyunhee Chung, Taeyoung Na
Despite the remarkable progress in text-to-video models, achieving precise control over text elements and animated graphics remains a significant challenge, especially in applications such as video advertisements.
no code implementations • 29 Apr 2025 • Alex Mathai, Chenxi Huang, Suwei Ma, JiHwan Kim, Hailie Mitchell, Aleksandr Nogikh, Petros Maniatis, Franjo Ivančić, Junfeng Yang, Baishakhi Ray
In this work, we build upon kGym, which provides a benchmark for system-level Linux kernel bugs and a platform for running experiments on the Linux kernel.
1 code implementation • 3 Nov 2024 • Miso Lee, JiHwan Kim, Jae-Pil Heo
Based on the statistical analysis, we reveal that queries and keys are mapped into completely different spaces, with only a few keys blended into the query region.
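One way to probe the query/key geometry described in this abstract is to measure cross-similarity between the projected query and key vectors. The sketch below is a minimal illustration of that kind of diagnostic, not the paper's actual analysis; the tensors, the 0.5 threshold, and the notion of "blended" keys are all illustrative assumptions.

```python
# Minimal sketch of a query/key overlap statistic, assuming generic
# attention projections. All names and thresholds are illustrative.
import torch
import torch.nn.functional as F

def query_key_overlap(queries: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between each query and every key.

    queries: (N_q, D), keys: (N_k, D) -- projected attention inputs.
    Uniformly low similarities would indicate that queries and keys
    occupy largely disjoint regions of the embedding space.
    """
    q = F.normalize(queries, dim=-1)
    k = F.normalize(keys, dim=-1)
    return q @ k.t()  # (N_q, N_k)

sim = query_key_overlap(torch.randn(16, 64), torch.randn(100, 64))
# Fraction of keys that land "inside" the query region under a loose threshold.
blended = (sim.max(dim=0).values > 0.5).float().mean().item()
print(f"keys blended into query region: {blended:.2%}")
```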
no code implementations • 30 Sep 2024 • JiHwan Kim, Youngdo Kim, Hyo Seung Lee, Eunseok Seo, Sang Joon Lee
However, existing deep learning-based phase retrieval methods are limited in generalization performance and in reconstructing three-dimensional (3D) morphology from a single-shot hologram of biological cells.
no code implementations • 29 Aug 2024 • JiHwan Kim, Miso Lee, Cheol-Ho Cho, Jihyun Lee, Jae-Pil Heo
Temporal Action Detection (TAD) is fundamental yet challenging for real-world video applications.
no code implementations • 23 Aug 2024 • JiHwan Kim, Miso Lee, Jae-Pil Heo
Temporal action detection (TAD) is challenging, yet fundamental for real-world video applications.
1 code implementation • 21 Aug 2024 • Hyeongmin Lee, Jin-Young Kim, Kyungjune Baek, JiHwan Kim, Hyojun Go, Seongsu Ha, Seokjin Han, Jiho Jang, Raehyuk Jung, Daewoo Kim, GeunOh Kim, Jongmok Kim, Jongseok Kim, Junwan Kim, Soonwoo Kwon, JangWon Lee, Seungjoon Park, Minjoon Seo, Jay Suh, Jaehyuk Yi, Aiden Lee
In this work, we discuss evaluating video foundation models in a fair and robust manner.
no code implementations • 19 Aug 2024 • Yerim Jeon, SuBeen Lee, JiHwan Kim, Jae-Pil Heo
Few-shot object counting has garnered significant attention for its practicality: it aims to count target objects in a query image based on a few given exemplars, without the need for additional training.
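A common design pattern in few-shot counting, which this abstract's setup builds on, is to correlate pooled exemplar features with the query image's feature map to produce a density-like map. The sketch below illustrates that generic pattern only; it is not the architecture proposed in the paper, and the feature shapes and readout are assumptions.

```python
# Hedged sketch of a generic few-shot counting pattern: correlate exemplar
# features with query-image features to obtain a density-like map.
import torch
import torch.nn.functional as F

def correlation_density(query_feat: torch.Tensor,
                        exemplar_feats: torch.Tensor) -> torch.Tensor:
    """query_feat: (C, H, W) backbone features of the query image;
    exemplar_feats: (K, C) pooled feature vectors of K exemplars.

    Returns an (H, W) map whose integral would approximate the count
    after a learned readout head (omitted in this sketch).
    """
    maps = torch.einsum('kc,chw->khw',
                        F.normalize(exemplar_feats, dim=-1),
                        F.normalize(query_feat, dim=0))
    return maps.mean(dim=0).relu()

density = correlation_density(torch.randn(256, 32, 32), torch.randn(3, 256))
print(density.shape, density.sum())
```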
no code implementations • 18 Aug 2024 • JiHwan Kim, Jaehyun Choi, Yerim Jeon, Jae-Pil Heo
To this end, we propose the Boundary-Recovering Network (BRN) to address the vanishing boundary problem.
1 code implementation • 19 May 2024 • JiHwan Kim, Junoh Kang, Jinyoung Choi, Bohyung Han
We propose a novel inference technique based on a pretrained diffusion model for text-conditional video generation.
Ranked #1 on Video Generation on UCF-101 (FVD128 metric)
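For context on the inference setting this entry describes, the sketch below shows a generic denoising loop for a pretrained text-conditional diffusion model. It is not the paper's specific inference technique; the linear beta schedule, the DDIM-style (eta = 0) update, and the `denoiser` callable are all placeholder assumptions.

```python
# Generic sketch of inference-time sampling with a pretrained
# text-conditional diffusion model. `denoiser` and the schedule are
# placeholders, not the method proposed in the paper above.
import torch

@torch.no_grad()
def sample(denoiser, text_emb: torch.Tensor, shape, steps: int = 50):
    """Start from Gaussian noise and iteratively denoise, conditioned on text."""
    x = torch.randn(shape)                      # (frames, C, H, W) latent video
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)
    for t in reversed(range(steps)):
        eps = denoiser(x, torch.tensor([t]), text_emb)      # predicted noise
        a_bar = alphas_bar[t]
        x0 = (x - (1 - a_bar).sqrt() * eps) / a_bar.sqrt()  # clean-latent estimate
        if t > 0:
            a_prev = alphas_bar[t - 1]
            x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps  # DDIM step (eta=0)
        else:
            x = x0
    return x

# Stand-in denoiser and text embedding, purely to make the sketch executable.
frames = sample(lambda x, t, c: torch.zeros_like(x),
                torch.zeros(1, 77, 768), (16, 4, 32, 32))
print(frames.shape)
```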
no code implementations • ICCV 2023 • JiHwan Kim, Miso Lee, Jae-Pil Heo
In this paper, we point out a problem in the self-attention of DETR for TAD: the attention modules focus on only a few key elements, which we call the temporal collapse problem.
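One simple way to quantify the kind of collapse this abstract describes is the entropy of each query's attention distribution: near-zero entropy means a query attends to only a few key elements. The snippet below is a hedged diagnostic sketch under that assumption, not the paper's own metric.

```python
# Hedged sketch: entropy of attention rows as a collapse indicator.
import torch

def attention_entropy(attn: torch.Tensor) -> torch.Tensor:
    """attn: (N_q, N_k), rows summing to 1. Low entropy relative to
    log(N_k) (the uniform case) indicates collapsed attention."""
    return -(attn * (attn + 1e-12).log()).sum(dim=-1)

attn = torch.softmax(torch.randn(8, 100), dim=-1)
print(attention_entropy(attn).mean())  # compare against log(100) for uniform
```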
no code implementations • 1 Apr 2023 • Won Ik Cho, Yoon Kyung Lee, Seoyeon Bae, JiHwan Kim, Sangah Park, Moosung Kim, Sowon Hahn, Nam Soo Kim
Building a natural language dataset requires caution, since word semantics are vulnerable to subtle changes in the text or in the definition of the annotated concept.
no code implementations • 16 Dec 2021 • Sung Hwan Mun, Min Hyun Han, Dongjune Lee, JiHwan Kim, Nam Soo Kim
In this paper, we propose self-supervised speaker representation learning strategies, which comprise bootstrap equilibrium speaker representation learning in the front-end and uncertainty-aware probabilistic speaker embedding training in the back-end.
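The "bootstrap" front-end this abstract refers to follows the general family of BYOL-style training, where an online encoder learns to predict the output of a slowly moving (EMA) target encoder. The sketch below illustrates only that general strategy; the toy encoder, the missing predictor head, and the momentum value are assumptions, not the paper's front-end.

```python
# Minimal sketch of bootstrap-style self-supervised training: the online
# encoder predicts the output of an EMA ("equilibrium") target encoder.
import copy
import torch
import torch.nn.functional as F

online = torch.nn.Sequential(torch.nn.Linear(80, 128), torch.nn.ReLU(),
                             torch.nn.Linear(128, 64))
target = copy.deepcopy(online)
for p in target.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(online.parameters(), lr=1e-3)

def step(view_a: torch.Tensor, view_b: torch.Tensor, tau: float = 0.99) -> float:
    loss = -F.cosine_similarity(online(view_a), target(view_b), dim=-1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():  # EMA update of the target network
        for pt, po in zip(target.parameters(), online.parameters()):
            pt.mul_(tau).add_(po, alpha=1 - tau)
    return loss.item()

x = torch.randn(32, 80)  # e.g., features from one utterance
print(step(x + 0.1 * torch.randn_like(x), x))  # two noisy "views" of the same input
```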
no code implementations • CVPR 2021 • Sangeek Hyun, JiHwan Kim, Jae-Pil Heo
The proposed tasks enable the discriminators to learn representations of appearance and temporal context, and force the generator to synthesize videos with consistent appearance and a natural flow of motion.
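The two discriminator roles this abstract mentions, per-frame appearance and cross-frame temporal context, are often realized with separate 2D and 3D convolutional branches. The sketch below shows that generic split only; it is an illustrative assumption, not the paper's architecture or its self-supervised tasks.

```python
# Hedged sketch of dual discriminator branches for video GAN training:
# a 2D branch scores per-frame appearance, a 3D branch scores temporal context.
import torch
import torch.nn as nn

class VideoDiscriminators(nn.Module):
    def __init__(self, ch: int = 3):
        super().__init__()
        self.appearance = nn.Sequential(
            nn.Conv2d(ch, 32, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))
        self.temporal = nn.Sequential(
            nn.Conv3d(ch, 32, (3, 4, 4), (1, 2, 2), (1, 1, 1)), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, video: torch.Tensor):
        # video: (B, C, T, H, W); appearance branch sees frames independently.
        b, c, t, h, w = video.shape
        frames = video.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w)
        return self.appearance(frames), self.temporal(video)

d = VideoDiscriminators()
app_score, temp_score = d(torch.randn(2, 3, 8, 64, 64))
print(app_score.shape, temp_score.shape)  # (16, 1) per-frame, (2, 1) per-clip
```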