Search Results for author: Wooyoung Kang

Found 6 papers, 3 papers with code

Honeybee: Locality-enhanced Projector for Multimodal LLM

1 code implementation • 11 Dec 2023 • Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh

In Multimodal Large Language Models (MLLMs), a visual projector plays a crucial role in bridging pre-trained vision encoders with LLMs, enabling profound visual understanding while harnessing the LLMs' robust capabilities.

Ranked #1 on Science Question Answering on ScienceQA (using extra training data)

Science Question Answering

Large Language Models are Temporal and Causal Reasoners for Video Question Answering

1 code implementation • 24 Oct 2023 • Dohwan Ko, Ji Soo Lee, Wooyoung Kang, Byungseok Roh, Hyunwoo J. Kim

We observe that the LLMs provide effective priors in exploiting $\textit{linguistic shortcuts}$ for temporal and causal reasoning in Video Question Answering (VideoQA).

Natural Language Understanding, Question Answering, +2

Open-Vocabulary Object Detection using Pseudo Caption Labels

no code implementations • 23 Mar 2023 • Han-Cheol Cho, Won Young Jhoo, Wooyoung Kang, Byungseok Roh

Recent open-vocabulary detection methods aim to detect novel objects by distilling knowledge from vision-language models (VLMs) trained on a vast amount of image-text pairs.

Image Captioning, Knowledge Distillation, +3

Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning

1 code implementation • ICCV 2023 • Wooyoung Kang, Jonghwan Mun, Sungjun Lee, Byungseok Roh

Image captioning is one of the straightforward tasks that can take advantage of large-scale web-crawled data which provides rich knowledge about the visual world for a captioning model.

Image Captioning, Image Retrieval, +1

Dense but Efficient VideoQA for Intricate Compositional Reasoning

no code implementations • 19 Oct 2022 • Jihyeon Lee, Wooyoung Kang, Eun-Sol Kim

It is well known that most of the conventional video question answering (VideoQA) datasets consist of easy questions requiring simple reasoning processes.

Question Answering, Video Question Answering
