Search Results for author: Avirup Sil

Found 43 papers, 11 papers with code

An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation

no code implementations 12 Jan 2024 Md Arafat Sultan, Aashka Trivedi, Parul Awasthy, Avirup Sil

We present a large-scale empirical study of how choices of configuration parameters affect performance in knowledge distillation (KD).

Knowledge Distillation
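
Typical KD configuration parameters include the distillation temperature and the weight mixing teacher and gold-label supervision; whether these are the exact knobs the paper sweeps is not stated in the snippet above. A generic PyTorch sketch of the standard KD objective where such knobs appear:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # Hard-label term: ordinary cross-entropy against the gold labels.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: temperature-scaled KL against the teacher's distribution;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # `temperature` and `alpha` are exactly the kind of configuration
    # parameters such a study can sweep.
    return alpha * kl + (1.0 - alpha) * ce
```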

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

2 code implementations 17 Oct 2023 Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi

Our framework trains a single arbitrary LM that adaptively retrieves passages on-demand, and generates and reflects on retrieved passages and its own generations using special tokens, called reflection tokens.

Fact Verification Response Generation +1
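
As a rough illustration of the mechanism described above, inference might interleave generation with self-triggered retrieval and self-critique. Everything in this sketch (the token strings, the critique scale, the `lm.generate` and `retriever.search` interfaces) is an assumption for illustration, not the authors' actual token vocabulary or code:

```python
RETRIEVE = "[Retrieve]"  # hypothetical reflection token requesting retrieval
SUPPORT = {"[Fully-Supported]": 2, "[Partially-Supported]": 1, "[Not-Supported]": 0}

def critique_score(text: str) -> int:
    # Map whichever (hypothetical) critique token the LM emitted to a rank.
    return max((s for tok, s in SUPPORT.items() if tok in text), default=0)

def self_rag_generate(lm, retriever, prompt: str, k: int = 5) -> str:
    draft = lm.generate(prompt)
    if RETRIEVE not in draft:
        return draft  # the model itself judged retrieval unnecessary
    # Retrieve on demand, generate once per passage, and let the model's own
    # critique tokens select the best-supported continuation.
    scored = []
    for passage in retriever.search(prompt, k=k):
        continuation = lm.generate(f"{prompt}\nContext: {passage}")
        scored.append((critique_score(continuation), continuation))
    return max(scored)[1]
```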

Inference-time Re-ranker Relevance Feedback for Neural Information Retrieval

no code implementations 19 May 2023 Revanth Gangi Reddy, Pradeep Dasigi, Md Arafat Sultan, Arman Cohan, Avirup Sil, Heng Ji, Hannaneh Hajishirzi

Neural information retrieval often adopts a retrieve-and-rerank framework: a bi-encoder network first retrieves the top K (e.g., 100) candidates, which a more powerful cross-encoder model then re-ranks to place better candidates higher.

Information Retrieval Retrieval
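
The underlying retrieve-and-rerank pattern is easy to picture in code. This is a generic sketch of that baseline pipeline (the `bi_encoder` and `cross_encoder` objects are placeholders), not the paper's relevance-feedback method:

```python
import numpy as np

def retrieve_and_rerank(query, corpus, bi_encoder, cross_encoder, k=100):
    # Stage 1 (fast): the bi-encoder embeds query and documents independently,
    # so document vectors could be precomputed offline; keep the top K by dot product.
    q = bi_encoder.encode(query)                                  # shape (d,)
    doc_vecs = np.stack([bi_encoder.encode(d) for d in corpus])   # shape (N, d)
    top_k = np.argsort(doc_vecs @ q)[::-1][:k]
    # Stage 2 (slow but accurate): the cross-encoder reads each (query, doc)
    # pair jointly and re-scores only the K survivors.
    reranked = sorted(
        ((cross_encoder.score(query, corpus[i]), corpus[i]) for i in top_k),
        reverse=True,
    )
    return [doc for _, doc in reranked]
```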

PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development

1 code implementation 23 Jan 2023 Avirup Sil, Jaydeep Sen, Bhavani Iyer, Martin Franz, Kshitij Fadnis, Mihaela Bornea, Sara Rosenthal, Scott McCarley, Rong Zhang, Vishwajeet Kumar, Yulong Li, Md Arafat Sultan, Riyaz Bhat, Radu Florian, Salim Roukos

The field of Question Answering (QA) has made remarkable progress in recent years, thanks to the advent of large pre-trained language models, newer realistic benchmark datasets with leaderboards, and novel algorithms for key components such as retrievers and readers.

Question Answering Reading Comprehension +1

Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking

no code implementations 2 Dec 2022 Keshav Santhanam, Jon Saad-Falcon, Martin Franz, Omar Khattab, Avirup Sil, Radu Florian, Md Arafat Sultan, Salim Roukos, Matei Zaharia, Christopher Potts

Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks.

Benchmarking Information Retrieval +1

SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

1 code implementation 29 Nov 2022 Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil

Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently continued to grow.

Zero-Shot Dynamic Quantization for Transformer Inference

4 code implementations 17 Nov 2022 Yousef El-Kurdi, Jerry Quinn, Avirup Sil

We introduce a novel run-time method for significantly reducing the accuracy loss associated with quantizing BERT-like models to 8-bit integers.

Quantization
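
For context, the stock baseline that run-time quantization work like this improves on is a one-liner in PyTorch: post-training dynamic quantization converts Linear-layer weights to int8 and quantizes activations on the fly. This is only the standard baseline, not the paper's method:

```python
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

# int8 weights for every nn.Linear; activations are quantized
# dynamically per batch at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```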

GAAMA 2.0: An Integrated System that Answers Boolean and Extractive Questions

no code implementations 16 Jun 2022 Scott McCarley, Mihaela Bornea, Sara Rosenthal, Anthony Ferritto, Md Arafat Sultan, Avirup Sil, Radu Florian

Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types.

Machine Reading Comprehension

Not to Overfit or Underfit the Source Domains? An Empirical Study of Domain Generalization in Question Answering

no code implementations 15 May 2022 Md Arafat Sultan, Avirup Sil, Radu Florian

Machine learning models are prone to overfitting their training (source) domains, which is commonly believed to be the reason why they falter in novel target domains.

Domain Generalization Knowledge Distillation +2

Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval

1 code implementation 24 Apr 2022 Revanth Gangi Reddy, Md Arafat Sultan, Martin Franz, Avirup Sil, Heng Ji

On two public IR benchmarks, we empirically show that the proposed method helps improve both the model's attention patterns and retrieval performance, including in zero-shot settings.

Information Retrieval Question Generation +3

MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding

2 code implementations 20 Dec 2021 Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander Schwing, Heng Ji

Specifically, the task involves multi-hop questions that require reasoning over image-caption pairs to identify the grounded visual object being referred to and then predicting a span from the news body text to answer the question.

Answer Generation Data Augmentation +2

Learning Cross-Lingual IR from an English Retriever

1 code implementation NAACL 2022 Yulong Li, Martin Franz, Md Arafat Sultan, Bhavani Iyer, Young-suk Lee, Avirup Sil

We present DR.DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD).

Cross-Lingual Information Retrieval Knowledge Distillation +3
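
One stage of such cross-lingual score distillation can be pictured as follows. This is a hedged sketch under assumed interfaces (`teacher.score`, `student.score`), not DR.DECR's actual multi-stage recipe: a frozen English teacher scores English queries against candidate passages, and the multilingual student is trained so its scores for the parallel non-English queries match.

```python
import torch
import torch.nn.functional as F

def clir_kd_loss(student, teacher, queries_xx, queries_en, passages):
    # Teacher: a frozen English retriever scoring English (parallel) queries.
    with torch.no_grad():
        t_scores = teacher.score(queries_en, passages)   # (B, P) logits
    # Student: a multilingual retriever scoring the parallel non-English queries.
    s_scores = student.score(queries_xx, passages)       # (B, P) logits
    # Match the two score distributions over the candidate passages.
    return F.kl_div(
        F.log_softmax(s_scores, dim=-1),
        F.softmax(t_scores, dim=-1),
        reduction="batchmean",
    )
```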

Do Answers to Boolean Questions Need Explanations? Yes

no code implementations 14 Dec 2021 Sara Rosenthal, Mihaela Bornea, Avirup Sil, Radu Florian, Scott McCarley

Existing datasets that contain boolean questions, such as BoolQ and TyDi QA, provide the user with a YES/NO response to the question.

Improved Text Classification via Contrastive Adversarial Training

no code implementations 21 Jul 2021 Lin Pan, Chung-Wei Hang, Avirup Sil, Saloni Potdar

We propose a simple and general method to regularize the fine-tuning of Transformer-based encoders for text classification tasks.

Contrastive Learning intent-classification +4
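
One common shape for contrastive adversarial fine-tuning, sketched under assumed interfaces (this is a simplification, not the authors' exact objective): perturb the input embeddings with an FGSM-style gradient step, then add an InfoNCE term that pulls each clean representation toward its own perturbed view.

```python
import torch
import torch.nn.functional as F

def contrastive_adversarial_loss(model, embeds, labels, epsilon=1e-2, tau=0.1):
    # `model(embeds)` is assumed to return (pooled_repr, logits) from input embeddings.
    embeds = embeds.detach().clone().requires_grad_(True)
    pooled, logits = model(embeds)
    ce = F.cross_entropy(logits, labels)
    # FGSM-style step: perturb embeddings in the loss-increasing direction.
    (grad,) = torch.autograd.grad(ce, embeds, retain_graph=True)
    adv_pooled, _ = model((embeds + epsilon * grad.sign()).detach())
    # InfoNCE: row i of the similarity matrix should peak at its own adversarial view.
    z1 = F.normalize(pooled, dim=-1)
    z2 = F.normalize(adv_pooled, dim=-1)
    sim = (z1 @ z2.t()) / tau
    contrastive = F.cross_entropy(sim, torch.arange(len(z1), device=z1.device))
    return ce + contrastive
```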

VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension

no code implementations ACL 2021 Haoyang Wen, Anthony Ferritto, Heng Ji, Radu Florian, Avirup Sil

Existing models on Machine Reading Comprehension (MRC) require complex model architecture for effectively modeling long texts with paragraph representation and classification, thereby making inference computationally inefficient for production use.

Machine Reading Comprehension Natural Questions

Towards Robust Neural Retrieval Models with Synthetic Pre-Training

no code implementations 15 Apr 2021 Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil

Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems.

Information Retrieval Machine Reading Comprehension +1

Are Multilingual BERT models robust? A Case Study on Adversarial Attacks for Multilingual Question Answering

no code implementations 15 Apr 2021 Sara Rosenthal, Mihaela Bornea, Avirup Sil

Recent approaches have exploited weaknesses in monolingual question answering (QA) models by adding adversarial statements to the passage.

Question Answering

Towards Confident Machine Reading Comprehension

no code implementations 20 Jan 2021 Rishav Chakravarti, Avirup Sil

Performance prediction is particularly important in cases of domain shift (as measured by training RC models on SQuAD 2.0 and evaluating on NQ), where Mr. C not only improves AUC, but also traditional answerability prediction (as measured by a 5-point improvement in F1).

Extractive Question-Answering Machine Reading Comprehension +1

Multilingual Transfer Learning for QA Using Translation as Data Augmentation

no code implementations 10 Dec 2020 Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil

Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages.

Cross-Lingual Transfer Data Augmentation +4

Cross-lingual Structure Transfer for Relation and Event Extraction

no code implementations IJCNLP 2019 Ananya Subburathinam, Di Lu, Heng Ji, Jonathan May, Shih-Fu Chang, Avirup Sil, Clare Voss

The identification of complex semantic structures such as events and entity relations, already a challenging Information Extraction task, is doubly difficult from sources written in under-resourced and under-annotated languages.

Event Extraction Relation +1

Ensembling Strategies for Answering Natural Questions

no code implementations 30 Oct 2019 Anthony Ferritto, Lin Pan, Rishav Chakravarti, Salim Roukos, Radu Florian, J. William Murdock, Avirup Sil

Many of the top question answering systems today utilize ensembling to improve their performance on tasks such as the Stanford Question Answering Dataset (SQuAD) and Natural Questions (NQ) challenges.

Natural Questions Question Answering

Structured Pruning of a BERT-based Question Answering Model

no code implementations 14 Oct 2019 J. S. McCarley, Rishav Chakravarti, Avirup Sil

The recent trend in industry-setting Natural Language Processing (NLP) research has been to operate large-scale pretrained language models like BERT under strict computational limits.

Model Compression Natural Questions +1

Frustratingly Easy Natural Question Answering

no code implementations 11 Sep 2019 Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil

Existing literature on Question Answering (QA) mostly focuses on algorithmic novelty, data augmentation, or increasingly large pre-trained language models like XLNet and RoBERTa.

Data Augmentation Natural Questions +2

Span Selection Pre-training for Question Answering

1 code implementation ACL 2020 Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avirup Sil

BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA).

Language Modelling Memorization +4

CFO: A Framework for Building Production NLP Systems

no code implementations IJCNLP 2019 Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos, Avirup Sil

This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments.

Information Retrieval Machine Reading Comprehension +2

Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

no code implementations ACL 2018 Gourab Kundu, Avirup Sil, Radu Florian, Wael Hamza

We propose an entity-centric neural cross-lingual coreference model that builds on multi-lingual embeddings and language-independent features.

coreference-resolution Entity Linking

One for All: Towards Language Independent Named Entity Linking

no code implementations ACL 2016 Avirup Sil, Radu Florian

Entity linking (EL) is the task of disambiguating mentions in text by associating them with entries in a predefined database of entities (persons, organizations, etc.).

Entity Linking

Neural Cross-Lingual Entity Linking

no code implementations 5 Dec 2017 Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza

A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts.

Cross-Lingual Entity Linking Entity Disambiguation +3

Improving Slot Filling Performance with Attentive Neural Networks on Dependency Structures

no code implementations EMNLP 2017 Lifu Huang, Avirup Sil, Heng Ji, Radu Florian

Slot Filling (SF) aims to extract the values of certain types of attributes (or slots, such as person:cities_of_residence) for a given entity from a large collection of source documents.

Relation Extraction Sentence +2

Toward Mention Detection Robustness with Recurrent Neural Networks

no code implementations 24 Feb 2016 Thien Huu Nguyen, Avirup Sil, Georgiana Dinu, Radu Florian

One of the key challenges in natural language processing (NLP) is to yield good performance across application domains and languages.

named-entity-recognition Named Entity Recognition +2
