Search Results for author: Md Arafat Sultan

Found 22 papers, 6 papers with code

Structured Chain-of-Thought Prompting for Few-Shot Generation of Content-Grounded QA Conversations

no code implementations • 19 Feb 2024 • Md Arafat Sultan, Jatin Ganhotra, Ramón Fernandez Astudillo

We introduce a structured chain-of-thought (SCoT) prompting approach to generating content-grounded multi-turn question-answer conversations using a pre-trained large language model (LLM).

Hallucination • Language Modelling +1
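
A minimal sketch of what structured chain-of-thought prompting for grounded conversation generation can look like in practice. The step names, prompt wording, and the llm_generate helper are illustrative assumptions, not the paper's actual prompt or code.

```python
# Illustrative SCoT-style prompt loop for generating a content-grounded,
# multi-turn QA conversation with a pre-trained LLM.

def llm_generate(prompt: str) -> str:
    """Placeholder for a call to any text-generation LLM API (assumption)."""
    raise NotImplementedError

SCOT_TEMPLATE = """You are writing turn {turn} of a conversation grounded in the document below.

Document:
{document}

Conversation so far:
{history}

Think in labelled steps:
Step 1 (Focus): choose a part of the document not yet discussed.
Step 2 (Question): write a natural follow-up question about it.
Step 3 (Answer): answer using only information from the document.

Then output the turn as:
Q: <question>
A: <answer>"""

def generate_conversation(document: str, num_turns: int = 3) -> list[tuple[str, str]]:
    history, turns = "(empty)", []
    for t in range(1, num_turns + 1):
        output = llm_generate(SCOT_TEMPLATE.format(turn=t, document=document, history=history))
        # Keep only the final Q/A pair; the labelled steps serve as the model's
        # structured chain of thought and are dropped from the generated data.
        question = output.split("Q:", 1)[-1].split("A:", 1)[0].strip()
        answer = output.split("A:", 1)[-1].strip()
        turns.append((question, answer))
        history = "\n".join(f"Q: {q}\nA: {a}" for q, a in turns)
    return turns
```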

An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation

no code implementations • 12 Jan 2024 • Md Arafat Sultan, Aashka Trivedi, Parul Awasthy, Avirup Sil

We present a large-scale empirical study of how choices of configuration parameters affect performance in knowledge distillation (KD).

Knowledge Distillation
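
For context on what "configuration parameters" typically means in KD, here is a generic distillation loss in PyTorch exposing two of the most commonly tuned knobs, the temperature and the soft/hard loss weight. The default values are illustrative, not settings recommended by the paper.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Generic KD objective; `temperature` and `alpha` are examples of the kind
    of configuration parameters such a study varies (illustrative defaults)."""
    # Soft term: KL divergence between temperature-scaled teacher and student
    # distributions, rescaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard term: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```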

Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation

no code implementations • 15 Nov 2023 • Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum

In this paper, we present the discovery that a student model distilled from a few-shot prompted LLM can commonly generalize better than its teacher to unseen examples on such tasks.

Constituency Parsing • Knowledge Distillation +3

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

1 code implementation • 21 Oct 2023 • Young-suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo

Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision.

In-Context Learning
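
The ICL data-generation recipe referenced above boils down to prompting a model with a handful of seed examples and asking it to continue with new instruction-output pairs. The sketch below is a hedged, generic version of that loop; the prompt format and lm_generate helper are assumptions, and Ensemble-Instruct itself additionally samples from a heterogeneous mixture of smaller LMs and filters their outputs.

```python
import random

def lm_generate(prompt: str) -> str:
    """Placeholder for a call to any instruction-capable LM (assumption)."""
    raise NotImplementedError

def generate_instruction_data(seed_instructions, n_samples=100, k_demos=3):
    """Self-Instruct-style generation loop (generic sketch, not the paper's code)."""
    data = []
    for _ in range(n_samples):
        # Build an in-context prompt from a few randomly chosen seed instructions
        # and let the model continue with a brand-new instruction.
        demos = random.sample(seed_instructions, min(k_demos, len(seed_instructions)))
        prompt = "\n".join(f"Instruction: {d}" for d in demos) + "\nInstruction:"
        new_instruction = lm_generate(prompt).strip()
        # Ask (possibly a different) model for the corresponding output.
        output = lm_generate(f"Instruction: {new_instruction}\nResponse:").strip()
        data.append({"instruction": new_instruction, "output": output})
    return data
```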

Inference-time Re-ranker Relevance Feedback for Neural Information Retrieval

no code implementations • 19 May 2023 • Revanth Gangi Reddy, Pradeep Dasigi, Md Arafat Sultan, Arman Cohan, Avirup Sil, Heng Ji, Hannaneh Hajishirzi

Neural information retrieval often adopts a retrieve-and-rerank framework: a bi-encoder network first retrieves K (e.g., 100) candidates that are then re-ranked using a more powerful cross-encoder model to rank the better candidates higher.

Information Retrieval • Retrieval
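
A minimal sketch of the retrieve-and-rerank framework described above: a bi-encoder scores the corpus by dot product to fetch the top-K candidates, and a cross-encoder then re-scores them. The encoder interfaces and K are illustrative placeholders; the paper's actual contribution, feeding re-ranker scores back to the retriever at inference time, is not shown here.

```python
import numpy as np

def retrieve_and_rerank(query, corpus, bi_encode, cross_score, k=100):
    """Generic retrieve-and-rerank pipeline (illustrative interfaces).

    bi_encode(texts) -> np.ndarray of dense vectors   (bi-encoder / retriever)
    cross_score(q, passage) -> float relevance score  (cross-encoder / re-ranker)
    """
    # Stage 1: bi-encoder retrieval of the top-K candidates by dot product.
    query_vec = bi_encode([query])[0]
    passage_vecs = bi_encode(corpus)
    top_k = np.argsort(-(passage_vecs @ query_vec))[:k]

    # Stage 2: cross-encoder re-ranking of those K candidates.
    reranked = sorted(top_k, key=lambda i: cross_score(query, corpus[i]), reverse=True)
    return [corpus[i] for i in reranked]
```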

Knowledge Distillation $\approx$ Label Smoothing: Fact or Fallacy?

no code implementations • 30 Jan 2023 • Md Arafat Sultan

Knowledge distillation (KD) was originally proposed as a method for transferring knowledge from one model to another, but some recent studies have suggested that it is in fact a form of regularization.

Knowledge Distillation • text-classification +2
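
The question in the title comes down to how the two training targets compare. For K classes with gold class c, smoothing weight ε, and teacher logits z at temperature τ, the standard formulations are (notation is mine, not taken from the paper):

```latex
y_i^{\mathrm{LS}} = (1-\epsilon)\,\mathbf{1}[i=c] + \frac{\epsilon}{K},
\qquad
y_i^{\mathrm{KD}} = \frac{\exp(z_i/\tau)}{\sum_{j=1}^{K} \exp(z_j/\tau)}
```

Label smoothing spreads the same uniform mass over the non-gold classes for every example, while the KD targets vary with the teacher's logits on each input, so whether KD really reduces to label smoothing is an empirical question rather than a definitional one.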

PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development

1 code implementation • 23 Jan 2023 • Avirup Sil, Jaydeep Sen, Bhavani Iyer, Martin Franz, Kshitij Fadnis, Mihaela Bornea, Sara Rosenthal, Scott McCarley, Rong Zhang, Vishwajeet Kumar, Yulong Li, Md Arafat Sultan, Riyaz Bhat, Radu Florian, Salim Roukos

The field of Question Answering (QA) has made remarkable progress in recent years, thanks to the advent of large pre-trained language models, newer realistic benchmark datasets with leaderboards, and novel algorithms for key components such as retrievers and readers.

Question Answering • Reading Comprehension +1

Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking

no code implementations • 2 Dec 2022 • Keshav Santhanam, Jon Saad-Falcon, Martin Franz, Omar Khattab, Avirup Sil, Radu Florian, Md Arafat Sultan, Salim Roukos, Matei Zaharia, Christopher Potts

Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks.

Benchmarking • Information Retrieval +1

SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

1 code implementation • 29 Nov 2022 • Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil

Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger.

GAAMA 2.0: An Integrated System that Answers Boolean and Extractive Questions

no code implementations • 16 Jun 2022 • Scott McCarley, Mihaela Bornea, Sara Rosenthal, Anthony Ferritto, Md Arafat Sultan, Avirup Sil, Radu Florian

Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types.

Machine Reading Comprehension

Not to Overfit or Underfit the Source Domains? An Empirical Study of Domain Generalization in Question Answering

no code implementations • 15 May 2022 • Md Arafat Sultan, Avirup Sil, Radu Florian

Machine learning models are prone to overfitting their training (source) domains, which is commonly believed to be the reason why they falter in novel target domains.

Domain Generalization • Knowledge Distillation +2

Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval

1 code implementation • 24 Apr 2022 • Revanth Gangi Reddy, Md Arafat Sultan, Martin Franz, Avirup Sil, Heng Ji

On two public IR benchmarks, we empirically show that the proposed method helps improve both the model's attention patterns and retrieval performance, including in zero-shot settings.

Information Retrieval • Question Generation +3

Learning Cross-Lingual IR from an English Retriever

1 code implementation • NAACL 2022 • Yulong Li, Martin Franz, Md Arafat Sultan, Bhavani Iyer, Young-suk Lee, Avirup Sil

We present DR. DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD).

Cross-Lingual Information Retrieval • Knowledge Distillation +3

Towards Robust Neural Retrieval Models with Synthetic Pre-Training

no code implementations • 15 Apr 2021 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil

Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems.

Information Retrieval • Machine Reading Comprehension +1
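
The snippet above rests on a simple data transformation: each MRC example already pairs a question with a passage known to contain its answer, so those pairs can serve as positive retrieval examples, with other passages as negatives. A hedged sketch of that conversion follows; the SQuAD-style field names and the random negative sampling are illustrative simplifications, not the paper's procedure.

```python
import random

def mrc_to_retrieval_examples(mrc_examples, num_negatives=1):
    """Convert MRC (question, passage) pairs into retrieval training triples
    (query, positive passage, negative passage).

    mrc_examples: list of dicts with "question" and "context" keys
    (SQuAD-style field names, assumed here for illustration).
    """
    all_passages = [ex["context"] for ex in mrc_examples]
    triples = []
    for ex in mrc_examples:
        for _ in range(num_negatives):
            # Illustrative negative mining: random passages from the collection;
            # real systems typically use harder (e.g., BM25-mined) negatives.
            neg = random.choice(all_passages)
            while neg == ex["context"]:
                neg = random.choice(all_passages)
            triples.append((ex["question"], ex["context"], neg))
    return triples
```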

End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training

no code implementations • 2 Dec 2020 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avi Sil, Vittorio Castelli, Radu Florian, Salim Roukos

End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages.

Domain Adaptation • Information Retrieval +3
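
The two-stage pipeline described above (retrieve, then read) can be sketched generically as below. The component interfaces are assumptions, and the paper's actual contribution, domain adaptation to COVID-19 literature via synthetic training data, is not shown.

```python
def end_to_end_qa(question, corpus, retrieve, read, k=10):
    """Generic retrieve-then-read QA pipeline (illustrative interfaces).

    retrieve(question, corpus, k) -> list of the k most relevant passages (IR stage)
    read(question, passage)       -> (answer_span, score) from an MRC model
    """
    passages = retrieve(question, corpus, k)
    # Run the reader over each retrieved passage and keep the highest-scoring span.
    candidates = [read(question, p) for p in passages]
    best_answer, _ = max(candidates, key=lambda pair: pair[1])
    return best_answer
```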

Improved Synthetic Training for Reading Comprehension

no code implementations • 24 Oct 2020 • Yanda Chen, Md Arafat Sultan, Vittorio Castelli

Automatically generated synthetic training examples have been shown to improve performance in machine reading comprehension (MRC).

Knowledge Distillation • Machine Reading Comprehension
