Search Results for author: Mrinmaya Sachan

Found 59 papers, 25 papers with code

“Let Your Characters Tell Their Story”: A Dataset for Character-Centric Narrative Understanding

no code implementations Findings (EMNLP) 2021 Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters’ roles, personalities, relationships, intents, actions, etc.

Strategize Before Teaching: A Conversational Tutoring System with Pedagogy Self-Distillation

no code implementations 27 Feb 2023 Lingzhi Wang, Mrinmaya Sachan, Xingshan Zeng, Kam-Fai Wong

Conversational tutoring systems (CTSs) aim to help students master educational material with natural language interaction in the form of a dialog.

Response Generation

Opportunities and Challenges in Neural Dialog Tutoring

1 code implementation 24 Jan 2023 Jakub Macina, Nico Daheim, Lingzhi Wang, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

Designing dialog tutors has been challenging as it involves modeling the diverse and complex pedagogical strategies employed by human tutors.

Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

no code implementations 20 Dec 2022 Justus Mattern, Zhijing Jin, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

Generated texts from large pretrained language models have been shown to exhibit a variety of harmful, human-like biases about various demographics.

Benchmarking

Distilling Multi-Step Reasoning Capabilities of Large Language Models into Smaller Models via Semantic Decompositions

no code implementations 1 Dec 2022 Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan

In this paper, we propose a knowledge distillation approach that leverages the step-by-step chain-of-thought (CoT) reasoning capabilities of larger models and distils these reasoning abilities into smaller models.

GSM8K Knowledge Distillation +1

Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

1 code implementation 23 Nov 2022 Kumar Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur, Mrinmaya Sachan

On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver.

Math Word Problem Solving Question Generation +1

Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations

1 code implementation 29 Oct 2022 Yu Fei, Ping Nie, Zhao Meng, Roger Wattenhofer, Mrinmaya Sachan

We further explore the applicability of our clustering approach by evaluating it on 14 datasets with more diverse topics, text lengths, and numbers of classes.

Sentence Embeddings Sentiment Analysis +5

Autoregressive Structured Prediction with Language Models

1 code implementation 26 Oct 2022 Tianyu Liu, Yuchen Jiang, Nicholas Monath, Ryan Cotterell, Mrinmaya Sachan

Recent years have seen a paradigm shift in NLP towards using pretrained language models (PLMs) for a wide range of tasks.

Ranked #1 on Relation Extraction on CoNLL04 (RE+ Micro F1 metric)

Coreference Resolution Named Entity Recognition +3

Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems

no code implementations 26 Oct 2022 Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT provides only small gains to modern neural coreference resolvers that make use of pretrained representations.

Coreference Resolution

A Bilingual Parallel Corpus with Discourse Annotations

no code implementations 26 Oct 2022 Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell

The BWB corpus consists of Chinese novels translated by experts into English, and the annotated test set is designed to probe the ability of machine translation systems to model various discourse phenomena.

Document Level Machine Translation Machine Translation +1

Differentially Private Language Models for Secure Data Sharing

no code implementations 25 Oct 2022 Justus Mattern, Zhijing Jin, Benjamin Weggenmann, Bernhard Schoelkopf, Mrinmaya Sachan

To protect the privacy of individuals whose data is being shared, it is important to develop methods that allow researchers and companies to release textual data while providing formal privacy guarantees to its originators.

Language Modelling

Adapters for Enhanced Modeling of Multilingual Knowledge and Text

1 code implementation 24 Oct 2022 Yifan Hou, Wenxiang Jiao, Meizhen Liu, Carl Allen, Zhaopeng Tu, Mrinmaya Sachan

Specifically, we introduce a lightweight adapter set to enhance MLLMs with cross-lingual entity alignment and facts from MLKGs for many languages.

Entity Alignment

A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models

1 code implementation 21 Oct 2022 Alessandro Stolfo, Zhijing Jin, Kumar Shridhar, Bernhard Schölkopf, Mrinmaya Sachan

By grounding the behavioral analysis in a causal graph describing an intuitive reasoning process, we study the behavior of language models in terms of robustness and sensitivity to direct interventions in the input space.

Mathematical Reasoning

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

1 code implementation 4 Oct 2022 Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Josh Tenenbaum, Bernhard Schölkopf

Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MORALCOT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments.

Language Modelling Question Answering

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

no code implementations 26 Sep 2022 Đorđe Miladinović, Kumar Shridhar, Kushal Jain, Max B. Paulus, Joachim M. Buhmann, Mrinmaya Sachan, Carl Allen

In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning.

Representation Learning

Probing via Prompting

1 code implementation NAACL 2022 Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan

We then examine the usefulness of a specific linguistic property for pre-training by removing the heads that are essential to that property and evaluating the resulting model's performance on language modeling.

Language Modelling

A Structured Span Selector

1 code implementation NAACL 2022 Tianyu Liu, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

Many natural language processing tasks, e.g., coreference resolution and semantic role labeling, require selecting text spans and making decisions about them.

Coreference Resolution +2

Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance

1 code implementation NAACL 2022 Jingwei Ni, Zhijing Jin, Markus Freitag, Mrinmaya Sachan, Bernhard Schölkopf

We show that these two factors have a large causal effect on the MT performance, in addition to the test-model direction mismatch highlighted by existing work on the impact of translationese.

Machine Translation Translation

Calibration of Machine Reading Systems at Scale

no code implementations Findings (ACL) 2022 Shehzaad Dhuliawala, Leonard Adolphs, Rajarshi Das, Mrinmaya Sachan

We show that calibrating such complex systems, which contain discrete retrieval and deep reading components, is challenging, and that current calibration techniques fail to scale to these settings.

Claim Verification Open-Domain Question Answering +2

Logical Fallacy Detection

1 code implementation 28 Feb 2022 Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng Lyu, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

In this paper, we propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text, together with an additional challenge set for detecting logical fallacies in climate change claims (LogicClimate).

Language Modelling Logical Fallacies +2

What Has Been Enhanced in my Knowledge-Enhanced Language Model?

1 code implementation 2 Feb 2022 Yifan Hou, Guoji Fu, Mrinmaya Sachan

We conduct experiments to verify that our GCS can indeed be used to correctly interpret the KI process, and we use it to analyze two well-known knowledge-enhanced LMs, ERNIE and K-Adapter, finding that only a small amount of factual knowledge is integrated into them.

Graph Attention Language Modelling

Case-based Reasoning for Better Generalization in Textual Reinforcement Learning

no code implementations ICLR 2022 Mattia Atzeni, Shehzaad Dhuliawala, Keerthiram Murugesan, Mrinmaya Sachan

Text-based games (TBG) have emerged as promising environments for driving research in grounded language understanding and studying problems like generalization and sample efficiency.

Out-of-Distribution Generalization reinforcement-learning +2

On Learning the Transformer Kernel

1 code implementation 15 Oct 2021 Sankalan Pal Chowdhury, Adamos Solomou, Avinava Dubey, Mrinmaya Sachan

In this work, we introduce KERNELIZED TRANSFORMER, a generic, scalable, data-driven framework for learning the kernel function in Transformers.

Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP

1 code implementation EMNLP 2021 Zhijing Jin, Julius von Kügelgen, Jingwei Ni, Tejas Vaidhya, Ayush Kaushal, Mrinmaya Sachan, Bernhard Schölkopf

The principle of independent causal mechanisms (ICM) states that generative processes of real world data consist of independent modules which do not influence or inform each other.

Causal Inference Domain Adaptation

"Let Your Characters Tell Their Story": A Dataset for Character-Centric Narrative Understanding

no code implementations 12 Sep 2021 Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters' roles, personalities, relationships, intents, actions, etc.

Differentiable Subset Pruning of Transformer Heads

2 code implementations 10 Aug 2021 Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan

Multi-head attention, a collection of several attention mechanisms that independently attend to different parts of the input, is the key ingredient in the Transformer.

Machine Translation Natural Language Inference +1

Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks

1 code implementation Findings (NAACL) 2022 Zhao Meng, Yihan Dong, Mrinmaya Sachan, Roger Wattenhofer

In this paper, we present an approach to improve the robustness of BERT language models against word substitution-based adversarial attacks by leveraging adversarial perturbations for self-supervised contrastive learning.

Adversarial Attack Contrastive Learning +1

How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact

2 code implementations Findings (ACL) 2021 Zhijing Jin, Geeticka Chauhan, Brian Tse, Mrinmaya Sachan, Rada Mihalcea

We lay the foundations via the moral philosophy definition of social good, propose a framework to evaluate the direct and indirect real-world impact of NLP tasks, and adopt the methodology of global priorities research to identify priority causes for NLP research.

Philosophy

Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach

1 code implementation ACL 2021 Yifan Hou, Mrinmaya Sachan

However, due to the inter-dependence of various phenomena and randomness of training probe models, detecting how these representations encode the rich information in these linguistic graphs remains a challenging problem.

Deep Clustering of Text Representations for Supervision-free Probing of Syntax

no code implementations 24 Oct 2020 Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

We explore deep clustering of text representations for unsupervised model interpretation and induction of syntax.

Deep Clustering TAG

Stronger Transformers for Neural Multi-Hop Question Generation

no code implementations 22 Oct 2020 Devendra Singh Sachan, Lingfei Wu, Mrinmaya Sachan, William Hamilton

In this work, we introduce a series of strong transformer models for multi-hop question generation, including a graph-augmented transformer that leverages relations between entities in the text.

Question Generation

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Approaches

no code implementations 12 Jul 2020 Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

We introduce a number of RL agents that combine the sequential context with a dynamic graph representation of their beliefs of the world and commonsense knowledge from ConceptNet in different ways.

Decision Making Reinforcement Learning (RL) +1

Knowledge Graph Embedding Compression

no code implementations ACL 2020 Mrinmaya Sachan

Knowledge graph (KG) representation learning techniques that learn continuous embeddings of entities and relations in the KG have become popular in many AI applications.

Knowledge Graph Embedding

Enhancing Text-based Reinforcement Learning Agents with Commonsense Knowledge

no code implementations 2 May 2020 Keerthiram Murugesan, Mattia Atzeni, Pushkar Shukla, Mrinmaya Sachan, Pavan Kapanipathi, Kartik Talamadupula

In this paper, we consider the recent trend of evaluating progress on reinforcement learning technology by using text-based environments and games as evaluation environments.

Reinforcement Learning (RL)

Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks

no code implementations CL 2019 Mrinmaya Sachan, Avinava Dubey, Eduard H. Hovy, Tom M. Mitchell, Dan Roth, Eric P. Xing

At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information.

Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems

no code implementations NeurIPS 2018 Mrinmaya Sachan, Kumar Avinava Dubey, Tom M. Mitchell, Dan Roth, Eric P. Xing

Finally, we also show how Nuts&Bolts can be used to achieve improvements on a relation extraction task and on the end task of answering Newtonian physics problems.

BIG-bench Machine Learning Relation Extraction

Discourse in Multimedia: A Case Study in Information Extraction

no code implementations 13 Nov 2018 Mrinmaya Sachan, Kumar Avinava Dubey, Eduard H. Hovy, Tom M. Mitchell, Dan Roth, Eric P. Xing

At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information.

Contextual Parameter Generation for Universal Neural Machine Translation

1 code implementation EMNLP 2018 Emmanouil Antonios Platanios, Mrinmaya Sachan, Graham Neubig, Tom Mitchell

We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language specific parameterization, and that can also be used for domain adaptation.

Domain Adaptation Machine Translation +2

Self-Training for Jointly Learning to Ask and Answer Questions

no code implementations NAACL 2018 Mrinmaya Sachan, Eric Xing

The two tasks of question answering and question generation are usually tackled separately in the NLP literature.

Data Augmentation Question Answering +2

Effective Use of Bidirectional Language Modeling for Transfer Learning in Biomedical Named Entity Recognition

2 code implementations 21 Nov 2017 Devendra Singh Sachan, Pengtao Xie, Mrinmaya Sachan, Eric P. Xing

We also show that BiLM weight transfer leads to faster model training, and that the pretrained model requires fewer training examples to achieve a particular F1 score.

Language Modelling named-entity-recognition +3

Learning to Solve Geometry Problems from Natural Language Demonstrations in Textbooks

no code implementations SEMEVAL 2017 Mrinmaya Sachan, Eric Xing

As a case study, we explore the task of learning to solve geometry problems using demonstrative solutions available in textbooks.

Question Answering
