no code implementations • 21 Mar 2024 • Junhyeong Cho, Kim Youwang, Hunmin Yang, Tae-Hyun Oh
One of the biggest challenges in single-view 3D shape reconstruction in the wild is the scarcity of <3D shape, 2D image>-paired data from real-world environments.
no code implementations • ICCV 2023 • Junhyeong Cho, Gilhyun Nam, Sungyeon Kim, Hunmin Yang, Suha Kwak
In a joint vision-language space, a text feature (e.g., from "a photo of a dog") could effectively represent its relevant image features (e.g., from dog photos).
Ranked #1 on Domain Generalization on DomainNet
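The idea of a text feature standing in for relevant image features can be sketched as cosine-similarity scoring in a shared embedding space. This is a minimal illustrative sketch, assuming CLIP-style L2-normalized embeddings; the random arrays below are placeholders, not the paper's actual features or method.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale each vector to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Hypothetical pre-computed embeddings in a joint vision-language space
# (e.g., from a CLIP-like encoder); values here are illustrative only.
rng = np.random.default_rng(0)
text_feat = l2_normalize(rng.normal(size=(1, 512)))    # e.g., "a photo of a dog"
image_feats = l2_normalize(rng.normal(size=(8, 512)))  # 8 candidate image features

# Cosine similarity between the text feature and each image feature scores
# how well the text represents that image.
scores = (image_feats @ text_feat.T).squeeze(-1)
best = int(np.argmax(scores))  # index of the image the text best represents
```

In this view, a single text embedding acts as a classifier weight vector over images, which is what lets text stand in for image features when paired image data is scarce.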
1 code implementation • 27 Jul 2022 • Junhyeong Cho, Kim Youwang, Tae-Hyun Oh
Transformer encoder architectures have recently achieved state-of-the-art results on monocular 3D human mesh reconstruction, but they require a substantial number of parameters and expensive computations.
Ranked #3 on 3D Human Pose Estimation on EMDB
3 code implementations • CVPR 2022 • Junhyeong Cho, Youngseok Yoon, Suha Kwak
To implement this idea, we propose the Collaborative Glance-Gaze TransFormer (CoFormer), which consists of two modules: a Glance transformer for activity classification and a Gaze transformer for entity estimation.
Ranked #2 on Situation Recognition on imSitu
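The Glance/Gaze split can be sketched as two stages over shared image tokens: a global pooling-and-classify step for the activity, and a per-role attention step for the entities. This is a toy numpy sketch under assumed dimensions; the real CoFormer uses full transformer blocks, and all weights and role queries below are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
img_tokens = rng.normal(size=(49, 64))  # flattened image patch features (toy)
W_verb = rng.normal(size=(64, 10))      # classifier over 10 candidate verbs
W_noun = rng.normal(size=(64, 20))      # classifier over 20 candidate nouns

# "Glance": pool over all tokens, then classify the activity (verb).
glance_feat = img_tokens.mean(axis=0)
verb_probs = softmax(glance_feat @ W_verb)
verb = int(np.argmax(verb_probs))

# "Gaze": attend to tokens relevant to each semantic role, then estimate
# the entity (noun) filling that role; 3 hypothetical roles here.
role_queries = rng.normal(size=(3, 64))
attn = softmax(role_queries @ img_tokens.T, axis=-1)  # (3, 49) attention maps
role_feats = attn @ img_tokens                        # (3, 64) role features
noun_probs = softmax(role_feats @ W_noun, axis=-1)
nouns = noun_probs.argmax(axis=-1)                    # one noun per role
```

The design choice sketched here is the collaboration: a whole-image glance fixes the activity, while role-conditioned gazing localizes the entities that fill its semantic roles.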
1 code implementation • 19 Nov 2021 • Junhyeong Cho, Youngseok Yoon, Hyeonjun Lee, Suha Kwak
Grounded Situation Recognition (GSR) is the task of not only classifying a salient action (verb), but also predicting the entities (nouns) associated with its semantic roles and their locations in the given image.
Ranked #5 on Situation Recognition on imSitu
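The GSR output structure described above can be made concrete with a small example. This is a hypothetical prediction for illustration only; the verb, role names, nouns, and boxes are invented, not drawn from the paper or any dataset annotation.

```python
# A hypothetical GSR prediction for one image: one salient verb, plus a
# noun and a bounding box (or None when the entity is not localizable)
# for each of the verb's semantic roles.
prediction = {
    "verb": "riding",
    "roles": {
        "agent":   {"noun": "person",  "bbox": (120, 40, 260, 300)},
        "vehicle": {"noun": "bicycle", "bbox": (90, 150, 280, 340)},
        "place":   {"noun": "road",    "bbox": None},  # grounded but not boxed
    },
}
```

This shows why GSR is "grounded": beyond the verb-and-nouns output of plain situation recognition, each predicted entity also carries its location in the image.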