Search Results for author: Jong C. Park

Found 28 papers, 13 papers with code

GeezSwitch: Language Identification in Typologically Related Low-resourced East African Languages

1 code implementation • LREC 2022 • Fitsum Gaim, Wonsuk Yang, Jong C. Park

Language identification is one of the fundamental tasks in natural language processing and a prerequisite for data processing and numerous downstream applications.

Language Identification • Machine Translation

DSLR: Document Refinement with Sentence-Level Re-ranking and Reconstruction to Enhance Retrieval-Augmented Generation

no code implementations • 4 Jul 2024 • Taeho Hwang, Soyeong Jeong, Sukmin Cho, SeungYoon Han, Jong C. Park

Recent advancements in Large Language Models (LLMs) have significantly improved their performance across various Natural Language Processing (NLP) tasks.

RAG • Re-Ranking • +2

Universal Gloss-level Representation for Gloss-free Sign Language Translation and Production

no code implementations • 3 Jul 2024 • Eui Jun Hwang, Sukmin Cho, Huije Lee, Youngwoo Yoon, Jong C. Park

Gloss-free methods have emerged to address these limitations, but they often depend on external sign language data or dictionaries, failing to completely eliminate the need for gloss annotations.

Gloss-free Sign Language Translation • Self-Supervised Learning • +4

Database-Augmented Query Representation for Information Retrieval

no code implementations • 23 Jun 2024 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

Information retrieval models, which aim to search for documents relevant to a given query, have shown considerable success and have been applied to diverse tasks.

Information Retrieval • Retrieval

Self-Knowledge Distillation for Learning Ambiguity

no code implementations • 14 Jun 2024 • Hancheol Park, Soyeong Jeong, Sukmin Cho, Jong C. Park

To address this issue, we propose a novel self-knowledge distillation method that enables models to learn label distributions more accurately by leveraging knowledge distilled from their lower layers.

Natural Language Understanding • Self-Knowledge Distillation
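
The excerpt above describes distilling the label distribution predicted by a lower layer into the final classifier. Below is a minimal PyTorch sketch of that general idea, not the paper's exact formulation: an auxiliary head on an intermediate layer supplies soft targets, and the final head is trained on the gold label plus a KL term toward them. The layer split, temperature, and weight `alpha` are illustrative assumptions.

```python
# Sketch of self-knowledge distillation from a lower layer (assumptions noted above).
import torch
import torch.nn.functional as F
from torch import nn

class SelfDistillClassifier(nn.Module):
    def __init__(self, dim=128, num_classes=3):
        super().__init__()
        self.lower = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.upper = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.lower_head = nn.Linear(dim, num_classes)   # auxiliary head on the lower layer
        self.final_head = nn.Linear(dim, num_classes)   # main classifier

    def forward(self, x):
        h_low = self.lower(x)
        h_high = self.upper(h_low)
        return self.final_head(h_high), self.lower_head(h_low)

def self_distillation_loss(final_logits, lower_logits, labels, alpha=0.5, T=2.0):
    # Both heads are supervised on the gold label; the final head additionally
    # matches the softened label distribution predicted by the lower layer.
    ce = F.cross_entropy(final_logits, labels) + F.cross_entropy(lower_logits, labels)
    kd = F.kl_div(
        F.log_softmax(final_logits / T, dim=-1),
        F.softmax(lower_logits.detach() / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return ce + alpha * kd

model = SelfDistillClassifier()
x, y = torch.randn(8, 128), torch.randint(0, 3, (8,))
final_logits, lower_logits = model(x)
loss = self_distillation_loss(final_logits, lower_logits, y)
loss.backward()
```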

Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models

no code implementations • 6 Jun 2024 • Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, Jong C. Park

To this end, we propose a novel strategy to intuitively quantify these social perceptions and suggest metrics that can evaluate the social biases within LLMs by aggregating diverse social perceptions.

Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations

no code implementations • 22 Apr 2024 • Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, Taeho Hwang, Jong C. Park

The robustness of recent Large Language Models (LLMs) has become increasingly crucial as their applicability expands across various domains and real-world applications.

RAG

Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity

2 code implementations • 21 Mar 2024 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

Retrieval-Augmented Large Language Models (LLMs), which incorporate the non-parametric knowledge from external knowledge bases into LLMs, have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA).

Question Answering • RAG • +1
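
As a rough illustration of the adaptive idea named in the title, the sketch below routes each question through no retrieval, single-step retrieval, or iterative multi-step retrieval based on a predicted complexity label. The complexity classifier, retriever, and LLM calls are dummy stand-ins, and the routing labels and hop budget are assumptions rather than the paper's actual components.

```python
# Toy routing sketch in the spirit of complexity-adaptive retrieval augmentation.
def classify_complexity(question):
    # Stand-in for a small trained complexity classifier over {simple, single, multi}.
    if " and " in question:
        return "multi"
    return "single" if len(question.split()) > 6 else "simple"

def retrieve(query, k=3):
    return [f"[doc about: {query}]"] * k            # dummy retriever

def llm_answer(question, context):
    ctx = 0 if context is None else len(context)    # dummy LLM call
    return f"answer to '{question}' using {ctx} passages"

def adaptive_answer(question, max_hops=3):
    label = classify_complexity(question)
    if label == "simple":                            # no retrieval: parametric knowledge only
        return llm_answer(question, None)
    if label == "single":                            # one retrieval step
        return llm_answer(question, retrieve(question))
    context, answer = [], question                   # multi-step: iterative retrieve-and-read
    for _ in range(max_hops):
        context += retrieve(answer)
        answer = llm_answer(question, context)
    return answer

print(adaptive_answer("Who directed Inception and when was it released?"))
```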

Improving Zero-shot Reader by Reducing Distractions from Irrelevant Documents in Open-Domain Question Answering

no code implementations • 26 Oct 2023 • Sukmin Cho, Jeongyeon Seo, Soyeong Jeong, Jong C. Park

Large language models (LLMs) enable zero-shot approaches in open-domain question answering (ODQA), yet advances on the reader side have been limited compared with those on the retriever side.

Answer Selection • Negation • +1

Test-Time Self-Adaptive Small Language Models for Question Answering

1 code implementation • 20 Oct 2023 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

Moreover, further fine-tuning LMs on labeled datasets is often infeasible because such datasets are unavailable, and it is also questionable whether smaller LMs with limited knowledge can be adapted using only unlabeled test data.

General Knowledge • Question Answering

Knowledge-Augmented Language Model Verification

1 code implementation • 19 Oct 2023 • Jinheon Baek, Soyeong Jeong, Minki Kang, Jong C. Park, Sung Ju Hwang

Recent Language Models (LMs) have shown impressive capabilities in generating texts with the knowledge internalized in parameters.

Language Modelling • Question Answering • +1

Autoregressive Sign Language Production: A Gloss-Free Approach with Discrete Representations

no code implementations • 21 Sep 2023 • Eui Jun Hwang, Huije Lee, Jong C. Park

Gloss-free Sign Language Production (SLP) offers a direct translation of spoken language sentences into sign language, bypassing the need for gloss intermediaries.

Quantization • Sign Language Production • +1

Deep Model Compression Also Helps Models Capture Ambiguity

1 code implementation • 12 Jun 2023 • Hancheol Park, Jong C. Park

Natural language understanding (NLU) tasks face a non-trivial number of ambiguous samples, for which the veracity of the labels is debatable among annotators.

Model Compression • Natural Language Understanding

Phrase Retrieval for Open-Domain Conversational Question Answering with Conversational Dependency Modeling via Contrastive Learning

1 code implementation • 7 Jun 2023 • Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, Jong C. Park

To address this problem, we further introduce a novel contrastive learning strategy that reflects previous turns when retrieving the phrase for the current context, maximizing the representational similarity of consecutive turns in a conversation while minimizing that of irrelevant conversational contexts.

Contrastive Learning • Conversational Question Answering • +1
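
The contrastive objective described in the excerpt can be sketched as follows under simplifying assumptions: embeddings of consecutive turns from the same conversation are treated as positives, the other contexts in the batch serve as negatives, and an InfoNCE-style loss pulls positives together. The temperature and batch construction are illustrative, not the paper's exact setup.

```python
# InfoNCE-style sketch of the consecutive-turn contrastive objective.
import torch
import torch.nn.functional as F

def consecutive_turn_contrastive_loss(curr, prev, temperature=0.05):
    """curr, prev: [batch, dim] embeddings of the current and previous turns."""
    curr = F.normalize(curr, dim=-1)
    prev = F.normalize(prev, dim=-1)
    logits = curr @ prev.T / temperature      # similarity of every turn pair in the batch
    targets = torch.arange(curr.size(0))      # positive = the matching previous turn
    return F.cross_entropy(logits, targets)   # other contexts in the batch act as negatives

loss = consecutive_turn_contrastive_loss(torch.randn(16, 256), torch.randn(16, 256))
```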

A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires

1 code implementation • 5 Jun 2023 • Hoyun Song, Jisu Shin, Huije Lee, Jong C. Park

Our detailed analysis shows that the proposed model is effective at leveraging domain knowledge, transferable to other mental disorders, and providing interpretable detection results.

Discrete Prompt Optimization via Constrained Generation for Zero-shot Re-ranker

1 code implementation • 23 May 2023 • Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, Jong C. Park

Along with highlighting the impact of optimization on the zero-shot re-ranker, we propose a novel discrete prompt optimization method, Constrained Prompt generation (Co-Prompt), with a metric that estimates the optimal prompt for re-ranking.

Information Retrieval • Language Modelling • +2
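
As a generic sketch of what optimizing a discrete prompt against a re-ranking objective can look like, the code below scores candidate prompts by a re-ranking metric (MRR on a small validation set) and keeps the best one. It illustrates the objective only; it is not Co-Prompt's constrained-generation procedure, and the lexical re-ranker and validation set are hypothetical.

```python
# Discrete prompt selection scored by a re-ranking metric (illustrative only).
def mrr(ranked_docs, gold_doc):
    return 1.0 / (ranked_docs.index(gold_doc) + 1)

def score_prompt(prompt, rerank, validation):
    # `rerank(prompt, query, docs)` returns the documents ordered by the re-ranker.
    return sum(mrr(rerank(prompt, q, docs), gold) for q, docs, gold in validation) / len(validation)

def select_prompt(candidates, rerank, validation):
    return max(candidates, key=lambda p: score_prompt(p, rerank, validation))

# Toy usage with a dummy lexical re-ranker (hypothetical, for illustration only).
def dummy_rerank(prompt, query, docs):
    cue = set((prompt + " " + query).lower().split())
    return sorted(docs, key=lambda d: -len(cue & set(d.lower().split())))

validation = [
    ("what is bm25",
     ["cats are animals", "bm25 is a ranking function used in retrieval"],
     "bm25 is a ranking function used in retrieval"),
]
best = select_prompt(["Rank the documents by relevance to the query:", "Documents:"],
                     dummy_rerank, validation)
print(best)
```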

Realistic Conversational Question Answering with Answer Selection based on Calibrated Confidence and Uncertainty Measurement

1 code implementation • 10 Feb 2023 • Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, Jong C. Park

Conversational Question Answering (ConvQA) models aim at answering a question using its relevant paragraph and the question-answer pairs that occurred over the previous turns of the conversation.

Answer Selection • Conversational Question Answering

ELF22: A Context-based Counter Trolling Dataset to Combat Internet Trolls

1 code implementation • LREC 2022 • Huije Lee, Young Ju NA, Hoyun Song, Jisu Shin, Jong C. Park

In particular, we constructed a pair-wise dataset of troll comments and counter responses with labeled response strategies, which enables models fine-tuned on our dataset to vary their counter responses according to a specified strategy.

Response Generation • Sentence

Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation

1 code implementation • ACL 2022 • Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

Dense retrieval models, which aim at retrieving the most relevant document for an input query on a dense representation space, have gained considerable attention for their remarkable success.

Data Augmentation • Passage Retrieval • +1
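
The title names two augmentations of document representations, interpolation and perturbation. The sketch below shows one plausible reading of each on document embeddings: mixup-style interpolation between two documents and small Gaussian noise on a single document. The mixing distribution and noise scale are illustrative assumptions; the paper's actual augmentation and training recipe are not reproduced here.

```python
# Plausible reading of representation-level interpolation and perturbation.
import torch

def interpolate(doc_a, doc_b, alpha=0.5):
    """Convex (mixup-style) combination of two document representations."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * doc_a + (1 - lam) * doc_b

def perturb(doc, sigma=0.01):
    """Add small Gaussian noise to a document representation."""
    return doc + sigma * torch.randn_like(doc)

doc_a, doc_b = torch.randn(768), torch.randn(768)
augmented = [interpolate(doc_a, doc_b), perturb(doc_a)]  # extra positives for retriever training
```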

Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation

1 code implementation • NAACL (sdp) 2021 • Soyeong Jeong, Jinheon Baek, ChaeHun Park, Jong C. Park

In this paper, we propose an Unsupervised Document Expansion with Generation (UDEG) framework with a pre-trained language model, which generates diverse supplementary sentences for the original document without using labels on query-document pairs for training.

Information Retrieval • Language Modelling • +2
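
The excerpt describes expanding a document with sentences stochastically sampled from a pre-trained language model, without any query-document labels. A rough sketch of that idea using the Hugging Face `transformers` text-generation pipeline follows; `gpt2`, the prompt format, and the sampling settings are stand-ins rather than the paper's configuration.

```python
# Rough sketch of unsupervised document expansion via stochastic generation.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

def expand_document(doc, n=3):
    outputs = generator(
        doc,
        max_new_tokens=40,
        do_sample=True,           # stochastic sampling yields diverse expansions
        top_p=0.95,
        num_return_sequences=n,
    )
    expansions = [o["generated_text"][len(doc):].strip() for o in outputs]
    return doc + " " + " ".join(expansions)   # index the expanded document

print(expand_document("Dense retrieval maps queries and documents into a shared embedding space."))
```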

Extraction of Gene-Environment Interaction from the Biomedical Literature

no code implementations • IJCNLP 2017 • Jinseon You, Jin-Woo Chung, Wonsuk Yang, Jong C. Park

Genetic information in the literature has been examined extensively to discover the etiology of diseases.

Decoder • Named Entity Recognition (NER)
