Search Results for author: Sheng-Chieh Lin

Found 17 papers, 8 papers with code

In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval

no code implementations • ACL (RepL4NLP) 2021 • Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin

We present an efficient training approach to text retrieval with dense representations that applies knowledge distillation using the ColBERT late-interaction ranking model.

Document Ranking • Knowledge Distillation +2
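The abstract above describes distillation over in-batch negatives: the student's score distribution over all passages in the batch is pushed toward the teacher's (ColBERT's). A minimal NumPy sketch of one plausible form of that objective, a mean KL divergence between teacher and student softmax distributions (the paper's exact loss and score functions may differ):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def in_batch_kd_loss(teacher_scores, student_scores):
    """Mean KL(teacher || student) over queries, where each row holds a
    query's scores against every passage in the batch (in-batch negatives)."""
    p_t = softmax(teacher_scores)  # soft labels from the teacher (e.g., ColBERT)
    p_s = softmax(student_scores)  # student bi-encoder scores
    return float(np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)))

# toy batch: 2 queries x 3 in-batch passages
teacher = np.array([[5.0, 1.0, 0.5], [0.2, 4.0, 0.1]])
student = np.array([[4.0, 2.0, 1.0], [0.5, 3.5, 0.2]])
loss = in_batch_kd_loss(teacher, student)
```

The loss is zero exactly when the student reproduces the teacher's in-batch score distribution, which is the sense in which teacher and student are "tightly coupled" during training.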

CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval

1 code implementation • 18 Nov 2022 • Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen

In this paper, we unify different multi-vector retrieval models from a token routing viewpoint and propose conditional token interaction via dynamic lexical routing, namely CITADEL, for efficient and effective multi-vector retrieval.


Strong Gravitational Lensing Parameter Estimation with Vision Transformer

1 code implementation • 9 Oct 2022 • Kuan-Wei Huang, Geoff Chih-Fan Chen, Po-Wen Chang, Sheng-Chieh Lin, Chia-Jung Hsu, Vishal Thengane, Joshua Yao-Yu Lin

Quantifying the parameters and corresponding uncertainties of hundreds of strongly lensed quasar systems holds the key to resolving one of the most important scientific questions: the Hubble constant ($H_{0}$) tension.

Aggretriever: A Simple Approach to Aggregate Textual Representation for Robust Dense Passage Retrieval

1 code implementation • 31 Jul 2022 • Sheng-Chieh Lin, Minghan Li, Jimmy Lin

Our work demonstrates that MLM pre-trained transformers can be used to effectively encode text information into a single-vector for dense retrieval.

Knowledge Distillation • Language Modelling +2

A Dense Representation Framework for Lexical and Semantic Matching

1 code implementation • 20 Jun 2022 • Sheng-Chieh Lin, Jimmy Lin

In contrast, our work integrates lexical representations with dense semantic representations by densifying high-dimensional lexical representations into what we call low-dimensional dense lexical representations (DLRs).

Retrieval • Semantic Text Matching +2
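"Densifying" a vocabulary-sized lexical vector can be pictured as slicing it into contiguous segments and pooling each segment down to one value. A toy NumPy sketch of that idea (the slice-and-max-pool scheme here is an illustrative assumption; see the paper for the actual DLR construction):

```python
import numpy as np

def densify(lexical_vec, n_slices):
    """Slice a high-dimensional lexical vector into contiguous segments and
    max-pool each one, yielding a low-dimensional dense lexical representation."""
    slices = np.array_split(lexical_vec, n_slices)
    values = np.array([s.max() for s in slices])            # pooled weight per slice
    indices = np.array([int(s.argmax()) for s in slices])   # winning position within each slice
    return values, indices

# a sparse lexical vector over a toy 12-term vocabulary
vocab_weights = np.zeros(12)
vocab_weights[[1, 5, 9]] = [0.7, 1.2, 0.3]
vals, idx = densify(vocab_weights, n_slices=3)  # 12 dims -> 3 dims
```

Keeping the within-slice argmax alongside the pooled value is what lets a scheme like this remain "lexical": two vectors can be compared slice-by-slice only where the same term won.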

Densifying Sparse Representations for Passage Retrieval by Representational Slicing

1 code implementation • 9 Dec 2021 • Sheng-Chieh Lin, Jimmy Lin

Learned sparse and dense representations capture different successful approaches to text retrieval and the fusion of their results has proven to be more effective and robust.

Passage Retrieval • Retrieval +1
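The abstract above motivates the work by noting that fusing sparse and dense result lists is effective and robust. One standard way to fuse two runs, not necessarily the method used in the paper, is reciprocal rank fusion (RRF):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked doc-id lists (e.g., one sparse run, one dense run) by
    summing 1/(k + rank) for each document across the lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

sparse_run = ["d1", "d2", "d3"]   # e.g., BM25 or a learned sparse model
dense_run  = ["d3", "d1", "d4"]   # e.g., a dense bi-encoder
fused = reciprocal_rank_fusion([sparse_run, dense_run])  # -> d1 first
```

RRF only needs ranks, not comparable scores, which is why it is a common baseline for combining retrievers whose score scales differ.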

Contextualized Query Embeddings for Conversational Search

no code implementations • EMNLP 2021 • Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin

This paper describes a compact and effective model for low-latency passage retrieval in conversational search based on learned dense representations.

Conversational Search • Open-Domain Question Answering +2

Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling

3 code implementations • 14 Apr 2021 • Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin, Allan Hanbury

A vital step towards the widespread adoption of neural retrieval models is their resource efficiency throughout the training, indexing and query workflows.

Re-Ranking • Retrieval +2

Pyserini: An Easy-to-Use Python Toolkit to Support Replicable IR Research with Sparse and Dense Representations

1 code implementation • 19 Feb 2021 • Jimmy Lin, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, Rodrigo Nogueira

Pyserini is an easy-to-use Python toolkit that supports replicable IR research by providing effective first-stage retrieval in a multi-stage ranking architecture.

Information Retrieval • Retrieval

Angular clustering and host halo properties of [OII] emitters at $z >1$ in the Subaru HSC survey

no code implementations • 22 Dec 2020 • Teppei Okumura, Masao Hayashi, I-Non Chiu, Yen-Ting Lin, Ken Osato, Bau-Ching Hsieh, Sheng-Chieh Lin

From the constrained HOD model, the average mass of halos hosting the [OII] emitters is derived to be $\log{M_{eff}/(h^{-1}M_\odot)}=12.70^{+0.09}_{-0.07}$ and $12.61^{+0.09}_{-0.05}$ at $z=1.19$ and $1.47$, respectively, which will become halos with the present-day mass $M\sim 1.5 \times 10^{13}h^{-1}M_\odot$.

Astrophysics of Galaxies • Cosmology and Nongalactic Astrophysics

Optical Wavelength Guided Self-Supervised Feature Learning For Galaxy Cluster Richness Estimate

no code implementations • 4 Dec 2020 • Gongbo Liang, Yuanyuan Su, Sheng-Chieh Lin, Yu Zhang, Yuanyuan Zhang, Nathan Jacobs

We believe the proposed method will benefit astronomy and cosmology, where a large number of unlabeled multi-band images are available, but acquiring image labels is costly.


Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models

no code implementations • COLING 2020 • Jheng-Hong Yang, Sheng-Chieh Lin, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin

While internalized "implicit knowledge" in pretrained transformers has led to fruitful progress in many natural language understanding tasks, how to most effectively elicit such knowledge remains an open question.

Multiple-choice • Natural Language Understanding +1

Distilling Dense Representations for Ranking using Tightly-Coupled Teachers

1 code implementation • 22 Oct 2020 • Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin

We present an approach to ranking with dense representations that applies knowledge distillation to improve the recently proposed late-interaction ColBERT model.

Knowledge Distillation

Personalized TV Recommendation: Fusing User Behavior and Preferences

no code implementations • 30 Aug 2020 • Sheng-Chieh Lin, Ting-Wei Lin, Jing-Kai Lou, Ming-Feng Tsai, Chuan-Ju Wang

In this paper, we propose a two-stage ranking approach for recommending linear TV programs.

Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models

no code implementations • 4 Apr 2020 • Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin

This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs).

Pretrained Language Models • Task-Oriented Dialogue Systems

TTTTTackling WinoGrande Schemas

no code implementations • 18 Mar 2020 • Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin

We applied the T5 sequence-to-sequence model to tackle the AI2 WinoGrande Challenge by decomposing each example into two input text strings, each containing a hypothesis, and using the probabilities assigned to the "entailment" token as a score of the hypothesis.
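The selection step this abstract describes, scoring each decomposed hypothesis and choosing the higher-scoring one, can be sketched independently of T5. Here `entailment_prob` is a hypothetical stand-in for the model's probability of the "entailment" token:

```python
def pick_answer(hypotheses, entailment_prob):
    """Score each decomposed hypothesis string and return the index of the
    one whose "entailment" probability is highest."""
    scores = [entailment_prob(h) for h in hypotheses]
    return max(range(len(scores)), key=scores.__getitem__)

# stub standing in for a sequence-to-sequence model's entailment-token probability
fake_probs = {"the trophy is too big": 0.91, "the suitcase is too big": 0.12}
best = pick_answer(list(fake_probs), fake_probs.get)
```

In the actual system the stub would be replaced by a forward pass of the fine-tuned model over each candidate input string.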

