1 code implementation • ACL 2022 • Mounica Maddela, Mayank Kulkarni, Daniel Preotiuc-Pietro
Controllable summarization aims to provide summaries that take into account user-specified aspects and preferences to better assist them with their information need, as opposed to the standard summarization setup, which builds a single generic summary of a document. We introduce a human-annotated data set, EntSUM, for controllable summarization with a focus on named entities as the aspects to control. We conduct an extensive quantitative analysis to motivate the task of entity-centric summarization and show that existing methods for controllable summarization fail to generate entity-centric summaries.
no code implementations • 19 May 2024 • Sanchit Sinha, Yuguang Yue, Victor Soto, Mayank Kulkarni, Jianhua Lu, Aidong Zhang
In this paper, we propose MAML-en-LLM, a novel method for meta-training LLMs, which can learn truly generalizable parameters that not only perform well on disjointed tasks but also adapt to unseen tasks.
no code implementations • 25 May 2023 • Genta Indra Winata, Lingjue Xie, Karthik Radhakrishnan, Shijie Wu, Xisen Jin, Pengxiang Cheng, Mayank Kulkarni, Daniel Preotiuc-Pietro
Real-life multilingual systems should be able to efficiently incorporate new languages as data distributions fed to the system evolve and shift over time.
1 code implementation • 5 Apr 2022 • Mounica Maddela, Mayank Kulkarni, Daniel Preotiuc-Pietro
Our analysis and results show the challenging nature of this task and of the proposed data set.
1 code implementation • Findings (NAACL) 2022 • Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
In the discriminative setting, we introduce a new pre-training objective - Keyphrase Boundary Infilling with Replacement (KBIR), showing large gains in performance (up to 8.16 points in F1) over SOTA when the LM pre-trained using KBIR is fine-tuned for the task of keyphrase extraction.
no code implementations • ACL 2020 • Jing Wang, Mayank Kulkarni, Daniel Preotiuc-Pietro
Named entity recognition is a key component of many text processing pipelines and it is thus essential for this component to be robust to different types of input.
no code implementations • 19 Oct 2019 • Dhruva Sahrawat, Debanjan Mahata, Mayank Kulkarni, Haimin Zhang, Rakesh Gosangi, Amanda Stent, Agniv Sharma, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann
In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings.
no code implementations • WS 2018 • Mayank Kulkarni, Kristy Boyer
This paper reports on the creation of a dataset that could support building such a tutorial question answering system and discusses the methodology used to create the 106,386-question dataset.