no code implementations • 17 Mar 2023 • Jihyun Lee, Seungyeon Seo, Yunsu Kim, Gary Geunbae Lee
We present our work on Track 2 in the Dialog System Technology Challenges 11 (DSTC11).
1 code implementation • 28 Feb 2023 • Jihyun Lee, Minhyuk Sung, Honggyu Choi, Tae-Kyun Kim
To handle the shape complexity and interaction context between two hands, Im2Hands models the occupancy volume of two hands - conditioned on an RGB image and coarse 3D keypoints - by two novel attention-based modules responsible for (1) initial occupancy estimation and (2) context-aware occupancy refinement, respectively.
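The two-stage design described above (coarse occupancy estimation followed by context-aware refinement) can be illustrated with a toy sketch. This is a hypothetical simplification, not the Im2Hands implementation: the real model uses learned attention modules conditioned on an RGB image and coarse 3D keypoints, whereas here stage 1 is a distance-to-keypoint heuristic and stage 2 a softmax-weighted smoothing over query points.

```python
import numpy as np

def initial_occupancy(query_points, keypoints, radius=0.5):
    """Stage 1 (toy): coarse occupancy from distance to the nearest keypoint."""
    d = np.linalg.norm(query_points[:, None, :] - keypoints[None, :, :], axis=-1)
    return (d.min(axis=1) < radius).astype(float)

def refine_occupancy(occ, query_points, temperature=1.0):
    """Stage 2 (toy): context-aware refinement -- each query point attends
    over all others, smoothing occupancy with softmax-like distance weights."""
    d = np.linalg.norm(query_points[:, None, :] - query_points[None, :, :], axis=-1)
    w = np.exp(-d / temperature)
    w /= w.sum(axis=1, keepdims=True)  # rows sum to 1, so output stays in [0, 1]
    return w @ occ

rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(64, 3))   # query points in 3D space
kps = rng.uniform(-1, 1, size=(21, 3))   # 21 keypoints per hand is a common convention
occ0 = initial_occupancy(pts, kps)
occ1 = refine_occupancy(occ0, pts)
print(occ1.shape)  # (64,)
```

The point of the sketch is the data flow, not the math: a per-point coarse estimate is revised using context from all other points, mirroring the paper's (1) initial occupancy estimation and (2) context-aware refinement split.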
no code implementations • 17 Nov 2022 • Jihyun Lee, Chaebin Lee, Yunsu Kim, Gary Geunbae Lee
In dialogue state tracking (DST), labeling the dataset involves considerable human labor.
no code implementations • 16 Sep 2022 • Jihyun Lee, Gary Geunbae Lee
A few-shot dialogue state tracking (DST) model tracks user requests in a dialogue with reliable accuracy even from a small amount of data.
no code implementations • CVPR 2022 • Jihyun Lee, Minhyuk Sung, HyunJin Kim, Tae-Kyun Kim
We propose a framework that can deform an object in a 2D image as it exists in 3D space.
no code implementations • 26 Jul 2021 • Se-Yun Um, Jihyun Kim, Jihyun Lee, Hong-Goo Kang
In this paper, we propose a multi-speaker face-to-speech waveform generation model that also works for unseen speaker conditions.
Automatic Speech Recognition
no code implementations • 1 Apr 2021 • Minsu Kang, Jihyun Lee, Simin Kim, Injung Kim
We propose an end-to-end speech synthesizer, Fast DCTTS, that synthesizes speech in real time on a single CPU thread.
no code implementations • 1 Jan 2021 • Tae Gyoon Kang, Ho-Gyeong Kim, Min-Joong Lee, Jihyun Lee, Seongmin Ok, Hoshik Lee, Young Sang Choi
Transformers with soft attention have been widely adopted for various sequence-to-sequence tasks.