Search Results for author: Taehwan Kim

Found 13 papers, 3 papers with code

Multi-aspect Knowledge Distillation with Large Language Model

1 code implementation • 23 Jan 2025 • Taegyeong Lee, Jinsik Bang, Soyeong Kwon, Taehwan Kim

These results demonstrate that our method can transfer knowledge about various aspects to the model, and that this aspect knowledge can enhance model performance in computer vision tasks.

Image Classification, Knowledge Distillation +3

Leveraging 2D Masked Reconstruction for Domain Adaptation of 3D Pose Estimation

no code implementations • 14 Jan 2025 • Hansoo Park, Chanwoo Kim, Jihyeon Kim, Hoseong Cho, Nhat Nguyen Bao Truong, Taehwan Kim, Seungryul Baek

RGB-based 3D pose estimation methods have been successful with the development of deep learning and the emergence of high-quality 3D pose datasets.

3D Pose Estimation, Hand Pose Estimation +1

Zero-shot Text-guided Infinite Image Synthesis with LLM guidance

no code implementations • 17 Jul 2024 • Soyeong Kwon, Taegyeong Lee, Taehwan Kim

Furthermore, our model demonstrates the capability of text-guided arbitrary-sized image generation in a zero-shot manner with LLM guidance.

Image Generation, text-guided-image-editing

Grid Diffusion Models for Text-to-Video Generation

no code implementations • CVPR 2024 • Taegyeong Lee, Soyeong Kwon, Taehwan Kim

To tackle these challenges, we propose a simple but effective grid diffusion method for text-to-video generation that requires neither a temporal dimension in the architecture nor a large text-video paired dataset.

Image Manipulation, Text-to-Image Generation +2

Effective Slogan Generation with Noise Perturbation

1 code implementation • 6 Oct 2023 • Jongeun Kim, MinChung Kim, Taehwan Kim

Slogans play a crucial role in building a firm's brand identity.

Generating Realistic Images from In-the-wild Sounds

no code implementations • ICCV 2023 • Taegyeong Lee, Jeonghun Kang, Hyeonyu Kim, Taehwan Kim

Representing wild sounds as images is an important but challenging task due to the lack of paired datasets between sound and images and the significant differences in the characteristics of these two modalities.

Audio captioning, Sentence

Technical Report for CVPR 2022 LOVEU AQTC Challenge

1 code implementation • 29 Jun 2022 • Hyeonyu Kim, Jongeun Kim, Jeonghun Kang, Sanguk Park, Dongchan Park, Taehwan Kim

This technical report presents the second-place model for AQTC, a task newly introduced in the CVPR 2022 LOng-form VidEo Understanding (LOVEU) challenges.

Video Understanding

Understanding Beauty via Deep Facial Features

no code implementations • 30 Jan 2019 • Xudong Liu, Tao Li, Hao Peng, Iris Chuoying Ouyang, Taehwan Kim, Ruizhe Wang

The concept of beauty has been debated by philosophers and psychologists for centuries, but most definitions are subjective and metaphysical, and deficient in accuracy, generality, and scalability.

Generative Adversarial Network

Lexicon-Free Fingerspelling Recognition from Video: Data, Models, and Signer Adaptation

no code implementations • 26 Sep 2016 • Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu

Recognizing fingerspelling is challenging for a number of reasons: It involves quick, small motions that are often highly coarticulated; it exhibits significant variation between signers; and there has been a dearth of continuous fingerspelling data collected.

American Sign Language fingerspelling recognition from video: Methods for unrestricted recognition and signer-independence

no code implementations • 30 Aug 2016 • Taehwan Kim

In this thesis, we study the problem of recognizing video sequences of fingerspelled letters in American Sign Language (ASL).

Signer-independent Fingerspelling Recognition with Deep Neural Network Adaptation

no code implementations • 13 Feb 2016 • Taehwan Kim, Weiran Wang, Hao Tang, Karen Livescu

Previous work has shown that it is possible to achieve almost 90% accuracy on fingerspelling recognition in a signer-dependent setting.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +1

Sparse Coding for Learning Interpretable Spatio-Temporal Primitives

no code implementations • NeurIPS 2010 • Taehwan Kim, Gregory Shakhnarovich, Raquel Urtasun

Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images.
