no code implementations • 3 Oct 2024 • Jaewoo Lee, Joonho Ko, Jinheon Baek, Soyeong Jeong, Sung Ju Hwang
Moreover, to mitigate the information loss from segmenting documents into passages, instead of representing and retrieving passages individually, we further merge the representations of segmented passages into one single document representation, while we additionally introduce a reranking strategy to decouple and identify the relevant passage within the document if necessary.
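A minimal sketch of the merge-then-rerank idea described above, under loud assumptions: the excerpt does not say how passage representations are merged or how reranking is scored, so mean pooling and dot-product scoring are stand-ins chosen for illustration, and every name here (`mean_pool`, `retrieve_then_rerank`) is hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors):
    """Merge passage vectors into one document vector (assumed operator: mean)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def retrieve_then_rerank(query, docs):
    """docs maps a document id to its list of passage vectors.

    Step 1: retrieve the document whose merged representation best matches the query.
    Step 2: rerank that document's passages to locate the most relevant one.
    """
    doc_vecs = {doc_id: mean_pool(passages) for doc_id, passages in docs.items()}
    best_doc = max(doc_vecs, key=lambda d: dot(query, doc_vecs[d]))
    passages = docs[best_doc]
    best_passage = max(range(len(passages)), key=lambda i: dot(query, passages[i]))
    return best_doc, best_passage
```

Retrieval here operates on one vector per document, and passage-level scoring is deferred to the second step, which is only needed when a specific passage must be identified.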
no code implementations • 23 Jun 2024 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park
Information retrieval models, which aim to search for the documents relevant to a given query, have shown many successes and have been applied to diverse tasks.
no code implementations • 10 Jun 2024 • David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja, Bontu Fufa Balcha, Chenxi Whitehouse, Christian Salamea, Dan John Velasco, David Ifeoluwa Adelani, David Le Meur, Emilio Villa-Cueva, Fajri Koto, Fauzan Farooqui, Frederico Belcavello, Ganzorig Batnasan, Gisela Vallejo, Grainne Caulfield, Guido Ivetta, Haiyue Song, Henok Biadglign Ademtew, Hernán Maina, Holy Lovenia, Israel Abebe Azime, Jan Christian Blaise Cruz, Jay Gala, Jiahui Geng, Jesus-German Ortiz-Barajas, Jinheon Baek, Jocelyn Dunstan, Laura Alonso Alemany, Kumaranage Ravindu Yasas Nagasinghe, Luciana Benotti, Luis Fernando D'Haro, Marcelo Viridiano, Marcos Estecha-Garitagoitia, Maria Camila Buitrago Cabrera, Mario Rodríguez-Cantelar, Mélanie Jouitteau, Mihail Mihaylov, Mohamed Fazli Mohamed Imam, Muhammad Farid Adilazuarda, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Naome Etori, Olivier Niyomugisha, Paula Mónica Silva, Pranjal Chitale, Raj Dabre, Rendi Chevi, Ruochen Zhang, Ryandito Diandaru, Samuel Cahyawijaya, Santiago Góngora, Soyeong Jeong, Sukannya Purkayastha, Tatsuki Kuribayashi, Thanmay Jayakumar, Tiago Timponi Torrent, Toqeer Ehsan, Vladimir Araujo, Yova Kementchedjhieva, Zara Burzo, Zheng Wei Lim, Zheng Xin Yong, Oana Ignat, Joan Nwatu, Rada Mihalcea, Thamar Solorio, Alham Fikri Aji
Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason on knowledge present in both visual and textual data.
1 code implementation • 9 Jun 2024 • Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, Seonghyeon Ye, Bill Yuchen Lin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo
To overcome these limitations, we introduce the BiGGen Bench, a principled generation benchmark designed to thoroughly evaluate nine distinct capabilities of LMs across 77 diverse tasks.
no code implementations • 11 Apr 2024 • Jinheon Baek, Sujay Kumar Jauhar, Silviu Cucerzan, Sung Ju Hwang
Scientific Research, vital for improving human life, is hindered by its inherent complexity, slow pace, and the need for specialized experts.
2 code implementations • 21 Mar 2024 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park
Retrieval-Augmented Large Language Models (LLMs), which incorporate the non-parametric knowledge from external knowledge bases into LLMs, have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA).
no code implementations • 21 Feb 2024 • Minju Seo, Jinheon Baek, James Thorne, Sung Ju Hwang
Many existing works tackle this problem by generating synthetic data from the training data and then training models on them, recently using Large Language Models (LLMs).
no code implementations • 10 Nov 2023 • Jinheon Baek, Nirupama Chandrasekaran, Silviu Cucerzan, Allen Herring, Sujay Kumar Jauhar
Specifically, we construct an entity-centric knowledge store for each user based on their search and browsing activities on the web, which is then leveraged to provide contextually relevant LLM prompt augmentations.
1 code implementation • 20 Oct 2023 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park
Moreover, further finetuning LMs with labeled datasets is often infeasible because such datasets are absent, and it is also questionable whether smaller LMs with limited knowledge can be transferred using only unlabeled test data.
1 code implementation • 19 Oct 2023 • Jinheon Baek, Soyeong Jeong, Minki Kang, Jong C. Park, Sung Ju Hwang
Recent Language Models (LMs) have shown impressive capabilities in generating texts with the knowledge internalized in parameters.
1 code implementation • 7 Jun 2023 • Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, Jong C. Park
To address this problem, we further introduce a novel contrastive learning strategy that reflects previous turns when retrieving the phrase for the current context, by maximizing the representational similarities of consecutive turns in a conversation while minimizing those of irrelevant conversational contexts.
no code implementations • 7 Jun 2023 • Jinheon Baek, Alham Fikri Aji, Amir Saffari
We validate the performance of our KAPING framework on the knowledge graph question answering task, which aims to answer the user's question based on facts over a knowledge graph, where it outperforms relevant zero-shot baselines by up to 48% on average across multiple LLMs of various sizes.
no code implementations • 30 May 2023 • Minki Kang, Jin Myung Kwak, Jinheon Baek, Sung Ju Hwang
To overcome this limitation, we propose SUbgraph Retrieval-augmented GEneration (SURGE), a framework for generating context-relevant and knowledge-grounded dialogues with the KG.
1 code implementation • NeurIPS 2023 • Minki Kang, Seanie Lee, Jinheon Baek, Kenji Kawaguchi, Sung Ju Hwang
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge.
no code implementations • 21 May 2023 • Jinheon Baek, Alham Fikri Aji, Jens Lehmann, Sung Ju Hwang
There has been a surge of interest in utilizing Knowledge Graphs (KGs) for various natural language processing/understanding tasks.
1 code implementation • 10 Feb 2023 • Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, Jong C. Park
Conversational Question Answering (ConvQA) models aim to answer a question using its relevant paragraph and the question-answer pairs from previous turns of a multi-turn conversation.
no code implementations • 23 Aug 2022 • Jongha Kim, Jinheon Baek, Sung Ju Hwang
To achieve this, we first detect objects and then measure their semantic and spatial distances to construct an object graph, which is then represented by a graph neural network (GNN) for refining visual CNN features for objects.
1 code implementation • 21 Jun 2022 • Jinheon Baek, Wonyong Jeong, Jiongdao Jin, Jaehong Yoon, Sung Ju Hwang
To this end, we introduce a new subgraph FL problem, personalized subgraph FL, which focuses on the joint improvement of the interrelated local GNNs rather than learning a single global model, and propose a novel framework, FEDerated Personalized sUBgraph learning (FED-PUB), to tackle it.
1 code implementation • NAACL 2022 • Minki Kang, Jinheon Baek, Sung Ju Hwang
Pre-trained language models (PLMs) have achieved remarkable success on various natural language understanding tasks.
1 code implementation • ACL 2022 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park
Dense retrieval models, which aim at retrieving the most relevant document for an input query on a dense representation space, have gained considerable attention for their remarkable success.
Ranked #1000000000 on Passage Retrieval on Natural Questions
1 code implementation • 7 Feb 2022 • DongKi Kim, Jinheon Baek, Sung Ju Hwang
While contrastive learning can learn global graph-level similarities, its objective of maximizing the similarity between two differently perturbed graphs may result in representations that cannot discriminate between two similar graphs with different properties.
2 code implementations • NeurIPS 2021 • Jaehyeong Jo, Jinheon Baek, Seul Lee, DongKi Kim, Minki Kang, Sung Ju Hwang
This dual hypergraph construction allows us to apply message-passing techniques for node representations to edges.
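As a rough illustration of a dual hypergraph construction, the sketch below takes the standard hypergraph dual, obtained by transposing the node-edge incidence matrix so that each original edge becomes a node and each original node becomes a hyperedge over its incident edges. The function names are hypothetical and this is not the authors' code; the excerpt does not give implementation details.

```python
def incidence_matrix(num_nodes, edges):
    """Rows are nodes, columns are edges; an entry is 1 if the node touches the edge."""
    m = [[0] * len(edges) for _ in range(num_nodes)]
    for j, (u, v) in enumerate(edges):
        m[u][j] = 1
        m[v][j] = 1
    return m


def dual_hypergraph(incidence):
    """Transpose the incidence matrix: edges become nodes, nodes become hyperedges."""
    return [list(col) for col in zip(*incidence)]


def hyperedges(incidence):
    """List each hyperedge (column) as the row indices it contains."""
    if not incidence:
        return []
    return [
        [i for i, row in enumerate(incidence) if row[j]]
        for j in range(len(incidence[0]))
    ]
```

For a graph with edges `[(0, 1), (1, 2), (0, 2), (2, 3)]`, the hyperedge of the dual that corresponds to original node 2 contains dual nodes 1, 2, and 3, i.e. the three edges incident to node 2. Any node-level message-passing scheme run on the dual therefore updates representations of the original edges.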
1 code implementation • NAACL (sdp) 2021 • Soyeong Jeong, Jinheon Baek, ChaeHun Park, Jong C. Park
In this paper, we propose an Unsupervised Document Expansion with Generation (UDEG) framework with a pre-trained language model, which generates diverse supplementary sentences for the original document without using labels on query-document pairs for training.
1 code implementation • NeurIPS 2021 • Wonyong Jeong, Hayeon Lee, Gun Park, Eunyoung Hyung, Jinheon Baek, Sung Ju Hwang
To address such limitations, we introduce a novel problem of Neural Network Search (NNS), whose goal is to search for the optimal pretrained network for a novel dataset and constraints (e.g., number of parameters) from a model zoo.
1 code implementation • ICLR 2021 • Jinheon Baek, Minki Kang, Sung Ju Hwang
Graph neural networks have been widely used for modeling graph data, achieving impressive results on node classification and link prediction tasks.
Ranked #1 on Graph Classification on ToxCast
1 code implementation • NeurIPS 2020 • Jinheon Baek, Dong Bok Lee, Sung Ju Hwang
For transductive link prediction, we further propose a stochastic embedding layer to model uncertainty in the link prediction between unseen entities.
no code implementations • 11 Apr 2020 • Hyunjae Kim, Yookyung Koh, Jinheon Baek, Jaewoo Kang
Also, we analyze how neural models solve spatial reasoning tests with visual aids.