Search Results for author: Md Arafat Sultan

Found 22 papers, 6 papers with code

Structured Chain-of-Thought Prompting for Few-Shot Generation of Content-Grounded QA Conversations

no code implementations • 19 Feb 2024 • Md Arafat Sultan, Jatin Ganhotra, Ramón Fernandez Astudillo

We introduce a structured chain-of-thought (SCoT) prompting approach to generating content-grounded multi-turn question-answer conversations using a pre-trained large language model (LLM).

Hallucination • Language Modelling +1
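
A minimal sketch of what structured chain-of-thought prompting for grounded conversation generation can look like in practice. The step names, prompt wording, and the llm_generate helper are illustrative assumptions, not the paper's actual prompt or code.

```python
# Illustrative SCoT-style prompt loop for generating a content-grounded,
# multi-turn QA conversation with a pre-trained LLM.

def llm_generate(prompt: str) -> str:
    """Placeholder for a call to any text-generation LLM API (assumption)."""
    raise NotImplementedError

SCOT_TEMPLATE = """You are writing turn {turn} of a conversation grounded in the document below.

Document:
{document}

Conversation so far:
{history}

Think in labelled steps:
Step 1 (Focus): choose a part of the document not yet discussed.
Step 2 (Question): write a natural follow-up question about it.
Step 3 (Answer): answer using only information from the document.

Then output the turn as:
Q: <question>
A: <answer>"""

def generate_conversation(document: str, num_turns: int = 3) -> list[tuple[str, str]]:
    history, turns = "(empty)", []
    for t in range(1, num_turns + 1):
        output = llm_generate(SCOT_TEMPLATE.format(turn=t, document=document, history=history))
        # Keep only the final Q/A pair; the labelled steps serve as the model's
        # structured chain of thought and are dropped from the generated data.
        question = output.split("Q:", 1)[-1].split("A:", 1)[0].strip()
        answer = output.split("A:", 1)[-1].strip()
        turns.append((question, answer))
        history = "\n".join(f"Q: {q}\nA: {a}" for q, a in turns)
    return turns
```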

An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation

no code implementations • 12 Jan 2024 • Md Arafat Sultan, Aashka Trivedi, Parul Awasthy, Avirup Sil

We present a large-scale empirical study of how choices of configuration parameters affect performance in knowledge distillation (KD).

Knowledge Distillation
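
For context on what "configuration parameters" typically means in KD, here is a generic distillation loss in PyTorch exposing two of the most commonly tuned knobs, the temperature and the soft/hard loss weight. The default values are illustrative, not settings recommended by the paper.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Generic KD objective; `temperature` and `alpha` are examples of the kind
    of configuration parameters such a study varies (illustrative defaults)."""
    # Soft term: KL divergence between temperature-scaled teacher and student
    # distributions, rescaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard term: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```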

Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation

no code implementations • 15 Nov 2023 • Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum

In this paper, we present the discovery that a student model distilled from a few-shot prompted LLM can commonly generalize better than its teacher to unseen examples on such tasks.

Constituency Parsing • Knowledge Distillation +3

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

1 code implementation • 21 Oct 2023 • Young-suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo

Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision.

In-Context Learning
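
The ICL data-generation recipe referenced above boils down to prompting a model with a handful of seed examples and asking it to continue with new instruction-output pairs. The sketch below is a hedged, generic version of that loop; the prompt format and lm_generate helper are assumptions, and Ensemble-Instruct itself additionally samples from a heterogeneous mixture of smaller LMs and filters their outputs.

```python
import random

def lm_generate(prompt: str) -> str:
    """Placeholder for a call to any instruction-capable LM (assumption)."""
    raise NotImplementedError

def generate_instruction_data(seed_instructions, n_samples=100, k_demos=3):
    """Self-Instruct-style generation loop (generic sketch, not the paper's code)."""
    data = []
    for _ in range(n_samples):
        # Build an in-context prompt from a few randomly chosen seed instructions
        # and let the model continue with a brand-new instruction.
        demos = random.sample(seed_instructions, min(k_demos, len(seed_instructions)))
        prompt = "\n".join(f"Instruction: {d}" for d in demos) + "\nInstruction:"
        new_instruction = lm_generate(prompt).strip()
        # Ask (possibly a different) model for the corresponding output.
        output = lm_generate(f"Instruction: {new_instruction}\nResponse:").strip()
        data.append({"instruction": new_instruction, "output": output})
    return data
```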

Inference-time Re-ranker Relevance Feedback for Neural Information Retrieval

no code implementations • 19 May 2023 • Revanth Gangi Reddy, Pradeep Dasigi, Md Arafat Sultan, Arman Cohan, Avirup Sil, Heng Ji, Hannaneh Hajishirzi

Neural information retrieval often adopts a retrieve-and-rerank framework: a bi-encoder network first retrieves K (e.g., 100) candidates that are then re-ranked using a more powerful cross-encoder model to rank the better candidates higher.

Information Retrieval • Retrieval
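
A minimal sketch of the retrieve-and-rerank framework described above: a bi-encoder scores the corpus by dot product to fetch the top-K candidates, and a cross-encoder then re-scores them. The encoder interfaces and K are illustrative placeholders; the paper's actual contribution, feeding re-ranker scores back to the retriever at inference time, is not shown here.

```python
import numpy as np

def retrieve_and_rerank(query, corpus, bi_encode, cross_score, k=100):
    """Generic retrieve-and-rerank pipeline (illustrative interfaces).

    bi_encode(texts) -> np.ndarray of dense vectors   (bi-encoder / retriever)
    cross_score(q, passage) -> float relevance score  (cross-encoder / re-ranker)
    """
    # Stage 1: bi-encoder retrieval of the top-K candidates by dot product.
    query_vec = bi_encode([query])[0]
    passage_vecs = bi_encode(corpus)
    top_k = np.argsort(-(passage_vecs @ query_vec))[:k]

    # Stage 2: cross-encoder re-ranking of those K candidates.
    reranked = sorted(top_k, key=lambda i: cross_score(query, corpus[i]), reverse=True)
    return [corpus[i] for i in reranked]
```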

Knowledge Distillation $\approx$ Label Smoothing: Fact or Fallacy?

no code implementations • 30 Jan 2023 • Md Arafat Sultan

Knowledge distillation (KD) was originally proposed as a method for transferring knowledge from one model to another, but some recent studies have suggested that it is in fact a form of regularization.

Knowledge Distillation • text-classification +2
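
The question in the title comes down to how the two training targets compare. For K classes with gold class c, smoothing weight ε, and teacher logits z at temperature τ, the standard formulations are (notation is mine, not taken from the paper):

```latex
y_i^{\mathrm{LS}} = (1-\epsilon)\,\mathbf{1}[i=c] + \frac{\epsilon}{K},
\qquad
y_i^{\mathrm{KD}} = \frac{\exp(z_i/\tau)}{\sum_{j=1}^{K} \exp(z_j/\tau)}
```

Label smoothing spreads the same uniform mass over the non-gold classes for every example, while the KD targets vary with the teacher's logits on each input, so whether KD really reduces to label smoothing is an empirical question rather than a definitional one.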

PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development

1 code implementation • 23 Jan 2023 • Avirup Sil, Jaydeep Sen, Bhavani Iyer, Martin Franz, Kshitij Fadnis, Mihaela Bornea, Sara Rosenthal, Scott McCarley, Rong Zhang, Vishwajeet Kumar, Yulong Li, Md Arafat Sultan, Riyaz Bhat, Radu Florian, Salim Roukos

The field of Question Answering (QA) has made remarkable progress in recent years, thanks to the advent of large pre-trained language models, newer realistic benchmark datasets with leaderboards, and novel algorithms for key components such as retrievers and readers.

Question Answering • Reading Comprehension +1

Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking

no code implementations • 2 Dec 2022 • Keshav Santhanam, Jon Saad-Falcon, Martin Franz, Omar Khattab, Avirup Sil, Radu Florian, Md Arafat Sultan, Salim Roukos, Matei Zaharia, Christopher Potts

Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks.

Benchmarking • Information Retrieval +1

SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

1 code implementation • 29 Nov 2022 • Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil

Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger.

GAAMA 2.0: An Integrated System that Answers Boolean and Extractive Questions

no code implementations • 16 Jun 2022 • Scott McCarley, Mihaela Bornea, Sara Rosenthal, Anthony Ferritto, Md Arafat Sultan, Avirup Sil, Radu Florian

Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types.

Machine Reading Comprehension

Not to Overfit or Underfit the Source Domains? An Empirical Study of Domain Generalization in Question Answering

no code implementations • 15 May 2022 • Md Arafat Sultan, Avirup Sil, Radu Florian

Machine learning models are prone to overfitting their training (source) domains, which is commonly believed to be the reason why they falter in novel target domains.

Domain Generalization • Knowledge Distillation +2

Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval

1 code implementation • 24 Apr 2022 • Revanth Gangi Reddy, Md Arafat Sultan, Martin Franz, Avirup Sil, Heng Ji

On two public IR benchmarks, we empirically show that the proposed method helps improve both the model's attention patterns and retrieval performance, including in zero-shot settings.

Information Retrieval • Question Generation +3

Learning Cross-Lingual IR from an English Retriever

1 code implementation • NAACL 2022 • Yulong Li, Martin Franz, Md Arafat Sultan, Bhavani Iyer, Young-suk Lee, Avirup Sil

We present DR. DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD).

Cross-Lingual Information Retrieval • Knowledge Distillation +3

Towards Robust Neural Retrieval Models with Synthetic Pre-Training

no code implementations • 15 Apr 2021 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil

Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems.

Information Retrieval • Machine Reading Comprehension +1
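
The snippet above rests on a simple data transformation: each MRC example already pairs a question with a passage known to contain its answer, so those pairs can serve as positive retrieval examples, with other passages as negatives. A hedged sketch of that conversion follows; the SQuAD-style field names and the random negative sampling are illustrative simplifications, not the paper's procedure.

```python
import random

def mrc_to_retrieval_examples(mrc_examples, num_negatives=1):
    """Convert MRC (question, passage) pairs into retrieval training triples
    (query, positive passage, negative passage).

    mrc_examples: list of dicts with "question" and "context" keys
    (SQuAD-style field names, assumed here for illustration).
    """
    all_passages = [ex["context"] for ex in mrc_examples]
    triples = []
    for ex in mrc_examples:
        for _ in range(num_negatives):
            # Illustrative negative mining: random passages from the collection;
            # real systems typically use harder (e.g., BM25-mined) negatives.
            neg = random.choice(all_passages)
            while neg == ex["context"]:
                neg = random.choice(all_passages)
            triples.append((ex["question"], ex["context"], neg))
    return triples
```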

End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training

no code implementations • 2 Dec 2020 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avi Sil, Vittorio Castelli, Radu Florian, Salim Roukos

End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages.

Domain Adaptation • Information Retrieval +3
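
The two-stage pipeline described above (retrieve, then read) can be sketched generically as below. The component interfaces are assumptions, and the paper's actual contribution, domain adaptation to COVID-19 literature via synthetic training data, is not shown.

```python
def end_to_end_qa(question, corpus, retrieve, read, k=10):
    """Generic retrieve-then-read QA pipeline (illustrative interfaces).

    retrieve(question, corpus, k) -> list of the k most relevant passages (IR stage)
    read(question, passage)       -> (answer_span, score) from an MRC model
    """
    passages = retrieve(question, corpus, k)
    # Run the reader over each retrieved passage and keep the highest-scoring span.
    candidates = [read(question, p) for p in passages]
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer
```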

Improved Synthetic Training for Reading Comprehension

no code implementations • 24 Oct 2020 • Yanda Chen, Md Arafat Sultan, Vittorio Castelli

Automatically generated synthetic training examples have been shown to improve performance in machine reading comprehension (MRC).

Knowledge Distillation • Machine Reading Comprehension
