no code implementations • 7 Oct 2024 • Seongjun Jeong, Gi-Cheon Kang, Joochan Kim, Byoung-Tak Zhang
VLN-CM is composed of four modules and predicts the direction and distance of the next movement at each step.
no code implementations • 22 Mar 2024 • Seongjun Jeong, Gi-Cheon Kang, SeongHo Choi, Joochan Kim, Byoung-Tak Zhang
In developing Vision-and-Language Navigation (VLN) agents that navigate to a destination using natural language instructions and visual cues, current studies largely assume a train-once-deploy-once strategy.
1 code implementation • 5 Jun 2023 • Minjoon Jung, Youwon Jang, SeongHo Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang
Video moment retrieval (VMR) identifies a specific moment in an untrimmed video for a given natural language query.
Ranked #13 on Moment Retrieval on Charades-STA
1 code implementation • 23 Oct 2022 • Minjoon Jung, SeongHo Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang
Video corpus moment retrieval (VCMR) is the task of retrieving the most relevant video moment from a large video corpus using a natural language query.
Ranked #2 on Video Corpus Moment Retrieval on TVR