1 code implementation • 23 Jan 2025 • Taegyeong Lee, Jinsik Bang, Soyeong Kwon, Taehwan Kim
These results demonstrate that our method can transfer knowledge about various aspects to the model and the aspect knowledge can enhance model performance in computer vision tasks.
no code implementations • 14 Jan 2025 • Hansoo Park, Chanwoo Kim, Jihyeon Kim, Hoseong Cho, Nhat Nguyen Bao Truong, Taehwan Kim, Seungryul Baek
RGB-based 3D pose estimation methods have been successful with the development of deep learning and the emergence of high-quality 3D pose datasets.
no code implementations • 17 Jul 2024 • Soyeong Kwon, Taegyeong Lee, Taehwan Kim
Furthermore, our model demonstrates the capability of text-guided arbitrary-sized image generation in a zero-shot manner with LLM guidance.
no code implementations • CVPR 2024 • Taegyeong Lee, Soyeong Kwon, Taehwan Kim
To tackle these challenges, we propose a simple but effective grid diffusion method for text-to-video generation that requires neither a temporal dimension in the architecture nor a large text-video paired dataset.
Ranked #24 on Video Generation on UCF-101
1 code implementation • 6 Oct 2023 • Jongeun Kim, MinChung Kim, Taehwan Kim
Slogans play a crucial role in building a firm's brand identity.
no code implementations • ICCV 2023 • Taegyeong Lee, Jeonghun Kang, Hyeonyu Kim, Taehwan Kim
Representing wild sounds as images is an important but challenging task due to the lack of paired datasets between sound and images and the significant differences in the characteristics of these two modalities.
1 code implementation • 29 Jun 2022 • Hyeonyu Kim, Jongeun Kim, Jeonghun Kang, Sanguk Park, Dongchan Park, Taehwan Kim
This technical report presents the second-place model for AQTC, a task newly introduced in the CVPR 2022 LOng-form VidEo Understanding (LOVEU) challenges.
no code implementations • 30 Jan 2019 • Xudong Liu, Tao Li, Hao Peng, Iris Chuoying Ouyang, Taehwan Kim, Ruizhe Wang
The concept of beauty has been debated by philosophers and psychologists for centuries, but most definitions are subjective and metaphysical, and lack accuracy, generality, and scalability.
no code implementations • 16 Jun 2018 • Chao Yang, Taehwan Kim, Ruizhe Wang, Hao Peng, C.-C. Jay Kuo
It has been applied to numerous domains, such as data augmentation, domain adaptation, and unsupervised training.
no code implementations • 26 Sep 2016 • Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu
Recognizing fingerspelling is challenging for a number of reasons: It involves quick, small motions that are often highly coarticulated; it exhibits significant variation between signers; and there has been a dearth of continuous fingerspelling data collected.
no code implementations • 30 Aug 2016 • Taehwan Kim
In this thesis, we study the problem of recognizing video sequences of fingerspelled letters in American Sign Language (ASL).
no code implementations • 13 Feb 2016 • Taehwan Kim, Weiran Wang, Hao Tang, Karen Livescu
Previous work has shown that it is possible to achieve accuracies of almost 90% on fingerspelling recognition in a signer-dependent setting.
Automatic Speech Recognition (ASR)
no code implementations • NeurIPS 2010 • Taehwan Kim, Gregory Shakhnarovich, Raquel Urtasun
Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images.