Search Results for author: Jinhyuk Lee

Found 22 papers, 19 papers with code

Rethinking the Role of Token Retrieval in Multi-Vector Retrieval

no code implementations 4 Apr 2023 Jinhyuk Lee, Zhuyun Dai, Sai Meher Karthik Duddu, Tao Lei, Iftekhar Naim, Ming-Wei Chang, Vincent Y. Zhao

Multi-vector retrieval models such as ColBERT [Khattab and Zaharia, 2020] allow token-level interactions between queries and documents, and hence achieve state of the art on many information retrieval benchmarks.

Information Retrieval Retrieval

Multi-Vector Retrieval as Sparse Alignment

no code implementations 2 Nov 2022 Yujie Qian, Jinhyuk Lee, Sai Meher Karthik Duddu, Zhuyun Dai, Siddhartha Brahma, Iftekhar Naim, Tao Lei, Vincent Y. Zhao

With sparsified unary saliences, we are able to prune a large number of query and document token vectors and improve the efficiency of multi-vector retrieval.

Argument Retrieval Information Retrieval +1

Bridging the Training-Inference Gap for Dense Phrase Retrieval

no code implementations 25 Oct 2022 Gyuwan Kim, Jinhyuk Lee, Barlas Oguz, Wenhan Xiong, Yizhe Zhang, Yashar Mehdad, William Yang Wang

Building dense retrievers requires a series of standard procedures, including training and validating neural models and creating indexes for efficient search.

Open-Domain Question Answering Passage Retrieval +1

Optimizing Test-Time Query Representations for Dense Retrieval

1 code implementation 25 May 2022 Mujeen Sung, Jungsoo Park, Jaewoo Kang, Danqi Chen, Jinhyuk Lee

In this paper, we introduce TOUR (Test-Time Optimization of Query Representations), which further optimizes instance-level query representations guided by signals from test-time retrieval results.

Contrastive Learning Open-Domain Question Answering +3

BERN2: an advanced neural biomedical named entity recognition and normalization tool

1 code implementation 6 Jan 2022 Mujeen Sung, Minbyul Jeong, Yonghwa Choi, Donghyeon Kim, Jinhyuk Lee, Jaewoo Kang

In biomedical natural language processing, named entity recognition (NER) and named entity normalization (NEN) are key tasks that enable the automatic extraction of biomedical entities (e.g., diseases and drugs) from the ever-growing biomedical literature.

graph construction named-entity-recognition +2

Simple Questions Generate Named Entity Recognition Datasets

1 code implementation 16 Dec 2021 Hyunjae Kim, Jaehyo Yoo, Seunghyun Yoon, Jinhyuk Lee, Jaewoo Kang

Recent named entity recognition (NER) models often rely on human-annotated datasets, requiring the significant engagement of professional knowledge on the target domain and entities.

Few-shot NER Named Entity Recognition +1

Simple Entity-Centric Questions Challenge Dense Retrievers

1 code implementation EMNLP 2021 Christopher Sciavolino, Zexuan Zhong, Jinhyuk Lee, Danqi Chen

Open-domain question answering has exploded in popularity recently due to the success of dense retrieval models, which have surpassed sparse models using only a few supervised training examples.

Data Augmentation Open-Domain Question Answering +2

Can Language Models be Biomedical Knowledge Bases?

1 code implementation EMNLP 2021 Mujeen Sung, Jinhyuk Lee, Sean Yi, Minji Jeon, Sungdong Kim, Jaewoo Kang

To this end, we create the BioLAMA benchmark, which comprises 49K biomedical factual knowledge triples for probing biomedical LMs.

Learning Dense Representations of Phrases at Scale

4 code implementations ACL 2021 Jinhyuk Lee, Mujeen Sung, Jaewoo Kang, Danqi Chen

Open-domain question answering can be reformulated as a phrase retrieval problem, without the need for processing documents on-demand during inference (Seo et al., 2019).

Open-Domain Question Answering Question Generation +4

Biomedical Entity Representations with Synonym Marginalization

3 code implementations ACL 2020 Mujeen Sung, Hwisang Jeon, Jinhyuk Lee, Jaewoo Kang

In this way, we avoid the explicit pre-selection of negative samples from more than 400K candidates.

Look at the First Sentence: Position Bias in Question Answering

1 code implementation EMNLP 2020 Miyoung Ko, Jinhyuk Lee, Hyunjae Kim, Gangwoo Kim, Jaewoo Kang

In this study, we hypothesize that when the distribution of the answer positions is highly skewed in the training set (e.g., answers lie only in the k-th sentence of each passage), QA models predicting answers as positions can learn spurious positional cues and fail to give answers in different positions.

Extractive Question-Answering Question Answering

Adversarial Subword Regularization for Robust Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Jungsoo Park, Mujeen Sung, Jinhyuk Lee, Jaewoo Kang

Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation as NMT models can experience various subword candidates.

Machine Translation NMT +1

Contextualized Sparse Representations for Real-Time Open-Domain Question Answering

3 code implementations ACL 2020 Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang

Open-domain question answering can be formulated as a phrase retrieval problem, in which we can expect huge scalability and speed benefit but often suffer from low accuracy due to the limitation of existing phrase representation models.

Information Retrieval Open-Domain Question Answering +1

Pre-trained Language Model for Biomedical Question Answering

3 code implementations 18 Sep 2019 Wonjin Yoon, Jinhyuk Lee, Donghyeon Kim, Minbyul Jeong, Jaewoo Kang

The recent success of question answering systems is largely attributed to pre-trained language models.

Language Modelling Question Answering

Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index

1 code implementation ACL 2019 Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi

Existing open-domain question answering (QA) models are not suitable for real-time usage because they need to process several long documents on-demand for every input query.

Open-Domain Question Answering

Typeface Completion with Generative Adversarial Networks

2 code implementations 9 Nov 2018 Yonggyu Park, Junhyun Lee, Yookyung Koh, Inyeop Lee, Jinhyuk Lee, Jaewoo Kang

However, in designing a typeface, it is difficult to keep the style of various characters consistent, especially for languages with lots of morphological variations such as Chinese.

Font Style Transfer Image-to-Image Translation +2

CollaboNet: collaboration of deep neural networks for biomedical named entity recognition

2 code implementations 21 Sep 2018 Wonjin Yoon, Chan Ho So, Jinhyuk Lee, Jaewoo Kang

Our model has successfully reduced the number of misclassified entities and improved the performance by leveraging multiple datasets annotated for different entity types.

named-entity-recognition Named Entity Recognition +2

Learning User Preferences and Understanding Calendar Contexts for Event Scheduling

1 code implementation 5 Sep 2018 Donghyeon Kim, Jinhyuk Lee, Donghee Choi, Jaehoon Choi, Jaewoo Kang

With online calendar services gaining popularity worldwide, calendar data has become one of the richest context sources for understanding human behavior.

