Search Results for author: Mrinmaya Sachan

Found 107 papers, 65 papers with code

“Let Your Characters Tell Their Story”: A Dataset for Character-Centric Narrative Understanding

no code implementations Findings (EMNLP) 2021 Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters’ roles, personalities, relationships, intents, actions, etc.

AI-Assisted Human Evaluation of Machine Translation

no code implementations 18 Jun 2024 Vilém Zouhar, Tom Kocmi, Mrinmaya Sachan

Annually, research teams spend large amounts of money to evaluate the quality of machine translation systems (WMT, inter alia).

What Do Language Models Learn in Context? The Structured Task Hypothesis

1 code implementation 6 Jun 2024 Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell

Large language models (LLMs) exhibit an intriguing ability to learn a novel task from in-context examples presented in a demonstration, termed in-context learning (ICL).

In-Context Learning Meta-Learning +2

On Affine Homotopy between Language Encoders

no code implementations 4 Jun 2024 Robin SM Chan, Reda Boumasmoud, Anej Svete, Yuxin Ren, Qipeng Guo, Zhijing Jin, Shauli Ravfogel, Mrinmaya Sachan, Bernhard Schölkopf, Mennatallah El-Assady, Ryan Cotterell

In this spirit, we study the properties of \emph{affine} alignment of language encoders and its implications on extrinsic similarity.

Implicit Personalization in Language Models: A Systematic Study

1 code implementation 23 May 2024 Zhijing Jin, Nils Heil, Jiarui Liu, Shehzaad Dhuliawala, Yahang Qi, Bernhard Schölkopf, Rada Mihalcea, Mrinmaya Sachan

This work systematically studies IP through a rigorous mathematical formulation, a multi-perspective moral reasoning framework, and a set of case studies.


Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents

1 code implementation 25 Apr 2024 Giorgio Piatti, Zhijing Jin, Max Kleiman-Weiner, Bernhard Schölkopf, Mrinmaya Sachan, Rada Mihalcea

This paper introduces the Governance of the Commons Simulation (GovSim), a generative simulation platform designed to study strategic interactions and cooperative decision-making in LLMs.

Decision Making Specificity

Calibrating Large Language Models with Sample Consistency

no code implementations 21 Feb 2024 Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch

Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application.
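
The sample-consistency idea named in the title can be sketched in a few lines. This is an illustrative stand-in, not the paper's actual calibration method; `consistency_confidence` and the hard-coded answer samples are hypothetical:

```python
from collections import Counter

def consistency_confidence(answers):
    """Agreement-based confidence: return the majority-vote answer and
    the fraction of sampled answers that agree with it."""
    counts = Counter(answers)
    majority, freq = counts.most_common(1)[0]
    return majority, freq / len(answers)

# Five hypothetical sampled generations ending in these final answers.
answer, confidence = consistency_confidence(["42", "42", "41", "42", "42"])
```

Higher agreement among independently sampled generations is taken as a proxy for higher model confidence.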

Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals

2 code implementations 18 Feb 2024 Francesco Ortu, Zhijing Jin, Diego Doimo, Mrinmaya Sachan, Alberto Cazzaniga, Bernhard Schölkopf

Interpretability research aims to bridge the gap between empirical success and our scientific understanding of the inner workings of large language models (LLMs).

AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators

1 code implementation 16 Feb 2024 Jingwei Ni, Minjing Shi, Dominik Stammbach, Mrinmaya Sachan, Elliott Ash, Markus Leippold

With the rise of generative AI, automated fact-checking methods to combat misinformation are becoming more and more important.

Fact Checking Misinformation

AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails

1 code implementation 14 Feb 2024 Sankalan Pal Chowdhury, Vilém Zouhar, Mrinmaya Sachan

Large Language Models (LLMs) have found several use cases in education, ranging from automatic question generation to essay evaluation.

Language Modelling Math +2

CLadder: Assessing Causal Reasoning in Language Models

1 code implementation NeurIPS 2023 Zhijing Jin, Yuen Chen, Felix Leeb, Luigi Gresele, Ojasv Kamal, Zhiheng Lyu, Kevin Blin, Fernando Gonzalez Adauto, Max Kleiman-Weiner, Mrinmaya Sachan, Bernhard Schölkopf

Much of the existing work in natural language processing (NLP) focuses on evaluating commonsense causal reasoning in LLMs, thus failing to assess whether a model can perform causal inference in accordance with a set of well-defined formal rules.

Causal Inference Commonsense Causal Reasoning +1

RELIC: Investigating Large Language Model Responses using Self-Consistency

no code implementations 28 Nov 2023 Furui Cheng, Vilém Zouhar, Simran Arora, Mrinmaya Sachan, Hendrik Strobelt, Mennatallah El-Assady

To address this challenge, we propose an interactive system that helps users gain insight into the reliability of the generated text.

Language Modelling Large Language Model

Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis

1 code implementation 15 Nov 2023 David F. Jenny, Yann Billeter, Mrinmaya Sachan, Bernhard Schölkopf, Zhijing Jin

To operationalize this framework, we propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the LLM decision process.

Decision Making Fairness

The ART of LLM Refinement: Ask, Refine, and Trust

no code implementations 14 Nov 2023 Kumar Shridhar, Koustuv Sinha, Andrew Cohen, Tianlu Wang, Ping Yu, Ram Pasunuru, Mrinmaya Sachan, Jason Weston, Asli Celikyilmaz

In recent years, Large Language Models (LLMs) have demonstrated remarkable generative abilities, but can they judge the quality of their own generations?

Arithmetic Reasoning GSM8K +2

CausalCite: A Causal Formulation of Paper Citations

1 code implementation 5 Nov 2023 Ishan Kumar, Zhijing Jin, Ehsan Mokhtarian, Siyuan Guo, Yuen Chen, Mrinmaya Sachan, Bernhard Schölkopf

Thus, we propose CausalCite, a new way to measure the significance of a paper by assessing the causal impact of the paper on its follow-up papers.

Causal Inference counterfactual

Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models

2 code implementations 23 Oct 2023 Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan

We show that MechanisticProbe can recover the reasoning tree from the model's attention patterns for most examples, suggesting that in many cases the LM indeed performs a process of multi-step reasoning within its architecture.

A Diachronic Perspective on User Trust in AI under Uncertainty

1 code implementation 20 Oct 2023 Shehzaad Dhuliawala, Vilém Zouhar, Mennatallah El-Assady, Mrinmaya Sachan

In a human-AI collaboration, users build a mental model of the AI system based on its reliability and how it presents its decision, e.g., its presentation of system confidence and an explanation of the output.

Agents: An Open-source Framework for Autonomous Language Agents

1 code implementation 14 Sep 2023 Wangchunshu Zhou, Yuchen Eleanor Jiang, Long Li, Jialong Wu, Tiannan Wang, Shi Qiu, Jintian Zhang, Jing Chen, Ruipu Wu, Shuai Wang, Shiding Zhu, Jiyu Chen, Wentao Zhang, Xiangru Tang, Ningyu Zhang, Huajun Chen, Peng Cui, Mrinmaya Sachan

Recent advances on large language models (LLMs) enable researchers and developers to build autonomous language agents that can automatically solve various tasks and interact with environments, humans, and other agents using natural language interfaces.

A Formal Perspective on Byte-Pair Encoding

1 code implementation 29 Jun 2023 Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell

Via submodular functions, we prove that the iterative greedy version is a $\frac{1}{{\sigma(\boldsymbol{\mu}^\star)}}(1-e^{-{\sigma(\boldsymbol{\mu}^\star)}})$-approximation of an optimal merge sequence, where ${\sigma(\boldsymbol{\mu}^\star)}$ is the total backward curvature with respect to the optimal merge sequence $\boldsymbol{\mu}^\star$.

Combinatorial Optimization
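
For readers unfamiliar with the object being analysed, the iterative greedy BPE procedure referenced in the abstract can be sketched as follows. This is a minimal stdlib illustration of greedy merging (the paper's contribution is the approximation guarantee for such merge sequences, not this code):

```python
from collections import Counter

def greedy_bpe(text, num_merges):
    """Iterative greedy BPE: repeatedly merge the most frequent
    adjacent pair of symbols into a single new symbol."""
    seq = list(text)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merges.append(a + b)
        merged, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == (a, b):
                merged.append(a + b)
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
    return seq, merges

tokens, merges = greedy_bpe("ababab", 2)
```

Each iteration makes the locally best merge; the paper bounds how far this greedy sequence can fall short of an optimal one.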

Can Large Language Models Infer Causation from Correlation?

1 code implementation 9 Jun 2023 Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona Diab, Bernhard Schölkopf

In this work, we propose the first benchmark dataset to test the pure causal inference skills of large language models (LLMs).

Causal Inference

World Models for Math Story Problems

1 code implementation 7 Jun 2023 Andreas Opedal, Niklas Stoehr, Abulhair Saparov, Mrinmaya Sachan

In this paper, we consolidate previous work on categorizing and representing math story problems and develop MathWorld, which is a graph-based semantic formalism specific for the domain of math story problems.


Infusing Lattice Symmetry Priors in Attention Mechanisms for Sample-Efficient Abstract Geometric Reasoning

no code implementations 5 Jun 2023 Mattia Atzeni, Mrinmaya Sachan, Andreas Loukas

As a step towards this goal, we focus on geometry priors and introduce LatFormer, a model that incorporates lattice symmetry priors in attention masks.

Adaptive and Personalized Exercise Generation for Online Language Learning

1 code implementation 4 Jun 2023 Peng Cui, Mrinmaya Sachan

We train and evaluate our model on real-world learner interaction data from Duolingo and demonstrate that LMs guided by student states can generate superior exercises.

Knowledge Tracing Text Generation

Membership Inference Attacks against Language Models via Neighbourhood Comparison

1 code implementation 29 May 2023 Justus Mattern, FatemehSadat Mireshghallah, Zhijing Jin, Bernhard Schölkopf, Mrinmaya Sachan, Taylor Berg-Kirkpatrick

To investigate whether this fragility provides a layer of safety, we propose and evaluate neighbourhood attacks, which compare model scores for a given sample to scores of synthetically generated neighbour texts and therefore eliminate the need for access to the training data distribution.
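
The comparison described above can be sketched as a score function. A minimal sketch, assuming a callable per-sample loss; `model_loss`, `toy_loss`, and `neighbourhood_attack_score` are hypothetical stand-ins, not the paper's implementation:

```python
def neighbourhood_attack_score(model_loss, sample, neighbours):
    """Membership signal: the model's loss on a sample minus its mean
    loss on close synthetic neighbours of that sample.  A strongly
    negative score suggests the sample was seen during training."""
    neighbour_mean = sum(model_loss(n) for n in neighbours) / len(neighbours)
    return model_loss(sample) - neighbour_mean

# Toy stand-in loss (string length); a real attack would use an LM's loss.
toy_loss = lambda text: float(len(text))
score = neighbourhood_attack_score(toy_loss, "abc", ["abcd", "abcde"])
```

Because the score is relative to the sample's own neighbourhood, no reference model trained on a similar data distribution is needed.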

Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective

no code implementations 24 May 2023 Tianyu Liu, Afra Amini, Mrinmaya Sachan, Ryan Cotterell

We show that these exhaustive comparisons can be avoided, and, moreover, the complexity of such tasks can be reduced to linear by casting the relation between tokens as a partial order over the string.

coreference-resolution Dependency Parsing +1

A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis

1 code implementation 24 May 2023 Alessandro Stolfo, Yonatan Belinkov, Mrinmaya Sachan

Mathematical reasoning in large language models (LMs) has garnered significant attention in recent work, but there is a limited understanding of how these models process and store information related to arithmetic tasks within their architecture.

Arithmetic Reasoning Mathematical Reasoning +2

MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems

1 code implementation 23 May 2023 Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

While automatic dialogue tutors hold great potential in making education personalized and more accessible, research on such systems has been hampered by a lack of sufficiently large and high-quality datasets.

Language Modelling Large Language Model +1

All Roads Lead to Rome? Exploring the Invariance of Transformers' Representations

1 code implementation 23 May 2023 Yuxin Ren, Qipeng Guo, Zhijing Jin, Shauli Ravfogel, Mrinmaya Sachan, Bernhard Schölkopf, Ryan Cotterell

Transformer models have brought rapid advances on various NLP tasks, prompting a large body of interpretability research on the models' learned representations.

When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP

2 code implementations 23 May 2023 Jingwei Ni, Zhijing Jin, Qian Wang, Mrinmaya Sachan, Markus Leippold

Due to the task difficulty and data scarcity in the Financial NLP domain, we explore when aggregating such diverse skills from multiple datasets with MTL can work.

Multi-Task Learning Open-Ended Question Answering +1

RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

2 code implementations 22 May 2023 Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan

In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as an interactive fiction that directly interacts with consumers.

Language Modelling Large Language Model

Efficient Prompting via Dynamic In-Context Learning

no code implementations 18 May 2023 Wangchunshu Zhou, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

To achieve this, we train a meta controller that predicts the number of in-context examples suitable for the generalist model to make a good prediction based on the performance-efficiency trade-off for a specific input.

In-Context Learning

Variational Classification

1 code implementation 17 May 2023 Shehzaad Dhuliawala, Mrinmaya Sachan, Carl Allen

We present a latent variable model for classification that provides a novel probabilistic interpretation of neural network softmax classifiers.

Adversarial Robustness text-classification +1

Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good

1 code implementation 9 May 2023 Fernando Gonzalez, Zhijing Jin, Bernhard Schölkopf, Tom Hope, Mrinmaya Sachan, Rada Mihalcea

Using state-of-the-art NLP models, we address each of these tasks and use them on the entire ACL Anthology, resulting in a visualization workspace that gives researchers a comprehensive overview of the field of NLP4SG.

Psychologically-Inspired Causal Prompts

1 code implementation 2 May 2023 Zhiheng Lyu, Zhijing Jin, Justus Mattern, Rada Mihalcea, Mrinmaya Sachan, Bernhard Schoelkopf

In this work, we take sentiment classification as an example and look into the causal relations between the review (X) and sentiment (Y).

Sentiment Analysis Sentiment Classification

Controlled Text Generation with Natural Language Instructions

1 code implementation 27 Apr 2023 Wangchunshu Zhou, Yuchen Eleanor Jiang, Ethan Wilcox, Ryan Cotterell, Mrinmaya Sachan

Large language models generate fluent texts and can follow natural language instructions to solve a wide range of tasks without task-specific training.

In-Context Learning Language Modelling +1

Enhancing Textbooks with Visuals from the Web for Improved Learning

1 code implementation 18 Apr 2023 Janvijay Singh, Vilém Zouhar, Mrinmaya Sachan

We release the dataset of textbooks with an associated image bank to inspire further research in this intersectional area of computer vision and NLP for education.


Elastic Weight Removal for Faithful and Abstractive Dialogue Generation

1 code implementation 30 Mar 2023 Nico Daheim, Nouha Dziri, Mrinmaya Sachan, Iryna Gurevych, Edoardo M. Ponti

We evaluate our method -- using different variants of Flan-T5 as a backbone language model -- on multiple datasets for information-seeking dialogue generation and compare our method with state-of-the-art techniques for faithfulness, such as CTRL, Quark, DExperts, and Noisy Channel reranking.

Dialogue Generation Language Modelling

Strategize Before Teaching: A Conversational Tutoring System with Pedagogy Self-Distillation

no code implementations 27 Feb 2023 Lingzhi Wang, Mrinmaya Sachan, Xingshan Zeng, Kam-Fai Wong

Conversational tutoring systems (CTSs) aim to help students master educational material with natural language interaction in the form of a dialog.

Response Generation

Opportunities and Challenges in Neural Dialog Tutoring

1 code implementation 24 Jan 2023 Jakub Macina, Nico Daheim, Lingzhi Wang, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

Designing dialog tutors has been challenging as it involves modeling the diverse and complex pedagogical strategies employed by human tutors.

Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

no code implementations 20 Dec 2022 Justus Mattern, Zhijing Jin, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

Generated texts from large pretrained language models have been shown to exhibit a variety of harmful, human-like biases about various demographics.


Distilling Reasoning Capabilities into Smaller Language Models

1 code implementation 1 Dec 2022 Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan

In this work, we propose an alternative reasoning scheme, Socratic CoT, that learns a decomposition of the original problem into a sequence of subproblems and uses it to guide the intermediate reasoning steps.

GSM8K Knowledge Distillation +2

Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

1 code implementation 23 Nov 2022 Kumar Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur, Mrinmaya Sachan

On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver.

Math Math Word Problem Solving +2

Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations

1 code implementation 29 Oct 2022 Yu Fei, Ping Nie, Zhao Meng, Roger Wattenhofer, Mrinmaya Sachan

We further explore the applicability of our clustering approach by evaluating it on 14 datasets with more diverse topics, text lengths, and numbers of classes.

Clustering Sentence +7

A Bilingual Parallel Corpus with Discourse Annotations

1 code implementation 26 Oct 2022 Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell

The BWB corpus consists of Chinese novels translated by experts into English, and the annotated test set is designed to probe the ability of machine translation systems to model various discourse phenomena.

Document Level Machine Translation Machine Translation +2

Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems

no code implementations 26 Oct 2022 Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT provides only small gains to modern neural coreference resolvers that make use of pretrained representations.

coreference-resolution World Knowledge

Autoregressive Structured Prediction with Language Models

1 code implementation 26 Oct 2022 Tianyu Liu, Yuchen Jiang, Nicholas Monath, Ryan Cotterell, Mrinmaya Sachan

Recent years have seen a paradigm shift in NLP towards using pretrained language models (PLMs) for a wide range of tasks.

 Ranked #1 on Relation Extraction on CoNLL04 (RE+ Micro F1 metric)

Named Entity Recognition Named Entity Recognition (NER) +2

Differentially Private Language Models for Secure Data Sharing

no code implementations 25 Oct 2022 Justus Mattern, Zhijing Jin, Benjamin Weggenmann, Bernhard Schoelkopf, Mrinmaya Sachan

To protect the privacy of individuals whose data is being shared, it is critical to develop methods that allow researchers and companies to release textual data while providing formal privacy guarantees to its originators.

Language Modelling

Adapters for Enhanced Modeling of Multilingual Knowledge and Text

1 code implementation 24 Oct 2022 Yifan Hou, Wenxiang Jiao, Meizhen Liu, Carl Allen, Zhaopeng Tu, Mrinmaya Sachan

Specifically, we introduce a lightweight adapter set to enhance MLLMs with cross-lingual entity alignment and facts from MLKGs for many languages.

Entity Alignment

A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models

1 code implementation 21 Oct 2022 Alessandro Stolfo, Zhijing Jin, Kumar Shridhar, Bernhard Schölkopf, Mrinmaya Sachan

By grounding the behavioral analysis in a causal graph describing an intuitive reasoning process, we study the behavior of language models in terms of robustness and sensitivity to direct interventions in the input space.

Math Mathematical Reasoning

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

1 code implementation 4 Oct 2022 Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Josh Tenenbaum, Bernhard Schölkopf

Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MORALCOT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments.

Language Modelling Large Language Model +1

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

no code implementations 26 Sep 2022 Đorđe Miladinović, Kumar Shridhar, Kushal Jain, Max B. Paulus, Joachim M. Buhmann, Mrinmaya Sachan, Carl Allen

In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning.

Decoder Representation Learning

Probing via Prompting

1 code implementation NAACL 2022 Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan

We then examine the usefulness of a specific linguistic property for pre-training by removing the heads that are essential to that property and evaluating the resulting model's performance on language modeling.

Language Modelling

A Structured Span Selector

1 code implementation NAACL 2022 Tianyu Liu, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

Many natural language processing tasks, e.g., coreference resolution and semantic role labeling, require selecting text spans and making decisions about them.

coreference-resolution Inductive Bias +1

Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance

1 code implementation NAACL 2022 Jingwei Ni, Zhijing Jin, Markus Freitag, Mrinmaya Sachan, Bernhard Schölkopf

We show that these two factors have a large causal effect on the MT performance, in addition to the test-model direction mismatch highlighted by existing work on the impact of translationese.

Machine Translation Translation

Calibration of Machine Reading Systems at Scale

no code implementations Findings (ACL) 2022 Shehzaad Dhuliawala, Leonard Adolphs, Rajarshi Das, Mrinmaya Sachan

We show that calibrating such complex systems which contain discrete retrieval and deep reading components is challenging and current calibration techniques fail to scale to these settings.

Claim Verification Open-Domain Question Answering +2

Logical Fallacy Detection

2 code implementations 28 Feb 2022 Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng Lyu, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

In this paper, we propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text, together with an additional challenge set for detecting logical fallacies in climate change claims (LogicClimate).

Language Modelling Logical Fallacies +2

What Has Been Enhanced in my Knowledge-Enhanced Language Model?

1 code implementation 2 Feb 2022 Yifan Hou, Guoji Fu, Mrinmaya Sachan

We conduct experiments verifying that our GCS can indeed be used to correctly interpret the KI process, then use it to analyze two well-known knowledge-enhanced LMs, ERNIE and K-Adapter, and find that only a small amount of factual knowledge is integrated into them.

Graph Attention Language Modelling

Case-based Reasoning for Better Generalization in Textual Reinforcement Learning

no code implementations ICLR 2022 Mattia Atzeni, Shehzaad Dhuliawala, Keerthiram Murugesan, Mrinmaya Sachan

Text-based games (TBG) have emerged as promising environments for driving research in grounded language understanding and studying problems like generalization and sample efficiency.

Out-of-Distribution Generalization reinforcement-learning +2

On Learning the Transformer Kernel

1 code implementation 15 Oct 2021 Sankalan Pal Chowdhury, Adamos Solomou, Avinava Dubey, Mrinmaya Sachan

In this work we introduce KERNELIZED TRANSFORMER, a generic, scalable, data driven framework for learning the kernel function in Transformers.

Computational Efficiency

Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP

1 code implementation EMNLP 2021 Zhijing Jin, Julius von Kügelgen, Jingwei Ni, Tejas Vaidhya, Ayush Kaushal, Mrinmaya Sachan, Bernhard Schölkopf

The principle of independent causal mechanisms (ICM) states that generative processes of real world data consist of independent modules which do not influence or inform each other.

Causal Inference Domain Adaptation

Differentiable Subset Pruning of Transformer Heads

2 code implementations 10 Aug 2021 Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan

Multi-head attention, a collection of several attention mechanisms that independently attend to different parts of the input, is the key ingredient in the Transformer.

Machine Translation Natural Language Inference +1

Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks

1 code implementation Findings (NAACL) 2022 Zhao Meng, Yihan Dong, Mrinmaya Sachan, Roger Wattenhofer

In this paper, we present an approach to improve the robustness of BERT language models against word substitution-based adversarial attacks by leveraging adversarial perturbations for self-supervised contrastive learning.

Adversarial Attack Contrastive Learning +1

How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact

2 code implementations Findings (ACL) 2021 Zhijing Jin, Geeticka Chauhan, Brian Tse, Mrinmaya Sachan, Rada Mihalcea

We lay the foundations via the moral philosophy definition of social good, propose a framework to evaluate the direct and indirect real-world impact of NLP tasks, and adopt the methodology of global priorities research to identify priority causes for NLP research.


Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach

1 code implementation ACL 2021 Yifan Hou, Mrinmaya Sachan

However, due to the inter-dependence of various phenomena and randomness of training probe models, detecting how these representations encode the rich information in these linguistic graphs remains a challenging problem.

Deep Clustering of Text Representations for Supervision-free Probing of Syntax

no code implementations 24 Oct 2020 Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

We explore deep clustering of text representations for unsupervised model interpretation and induction of syntax.

Clustering Deep Clustering +1

Stronger Transformers for Neural Multi-Hop Question Generation

no code implementations 22 Oct 2020 Devendra Singh Sachan, Lingfei Wu, Mrinmaya Sachan, William Hamilton

In this work, we introduce a series of strong transformer models for multi-hop question generation, including a graph-augmented transformer that leverages relations between entities in the text.

Question Generation Question-Generation

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Approaches

no code implementations 12 Jul 2020 Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

We introduce a number of RL agents that combine the sequential context with a dynamic graph representation of their beliefs of the world and commonsense knowledge from ConceptNet in different ways.

Decision Making Reinforcement Learning (RL) +1

Knowledge Graph Embedding Compression

no code implementations ACL 2020 Mrinmaya Sachan

Knowledge graph (KG) representation learning techniques that learn continuous embeddings of entities and relations in the KG have become popular in many AI applications.

Knowledge Graph Embedding Representation Learning

Enhancing Text-based Reinforcement Learning Agents with Commonsense Knowledge

no code implementations 2 May 2020 Keerthiram Murugesan, Mattia Atzeni, Pushkar Shukla, Mrinmaya Sachan, Pavan Kapanipathi, Kartik Talamadupula

In this paper, we consider the recent trend of evaluating progress on reinforcement learning technology by using text-based environments and games as evaluation environments.

reinforcement-learning Reinforcement Learning (RL)

Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks

no code implementations CL 2019 Mrinmaya Sachan, Avinava Dubey, Eduard H. Hovy, Tom M. Mitchell, Dan Roth, Eric P. Xing

At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information.

Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems

no code implementations NeurIPS 2018 Mrinmaya Sachan, Kumar Avinava Dubey, Tom M. Mitchell, Dan Roth, Eric P. Xing

Finally, we also show how Nuts&Bolts can be used to achieve improvements on a relation extraction task and on the end task of answering Newtonian physics problems.

BIG-bench Machine Learning Relation Extraction

Discourse in Multimedia: A Case Study in Information Extraction

no code implementations 13 Nov 2018 Mrinmaya Sachan, Kumar Avinava Dubey, Eduard H. Hovy, Tom M. Mitchell, Dan Roth, Eric P. Xing

At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information.

Contextual Parameter Generation for Universal Neural Machine Translation

1 code implementation EMNLP 2018 Emmanouil Antonios Platanios, Mrinmaya Sachan, Graham Neubig, Tom Mitchell

We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language specific parameterization, and that can also be used for domain adaptation.

Decoder Domain Adaptation +3

Self-Training for Jointly Learning to Ask and Answer Questions

no code implementations NAACL 2018 Mrinmaya Sachan, Eric Xing

The two tasks of question answering and question generation are usually tackled separately in the NLP literature.

Data Augmentation Question Answering +3

Effective Use of Bidirectional Language Modeling for Transfer Learning in Biomedical Named Entity Recognition

2 code implementations 21 Nov 2017 Devendra Singh Sachan, Pengtao Xie, Mrinmaya Sachan, Eric P. Xing

We also show that BiLM weight transfer leads to a faster model training and the pretrained model requires fewer training examples to achieve a particular F1 score.

Language Modelling named-entity-recognition +3

Learning to Solve Geometry Problems from Natural Language Demonstrations in Textbooks

no code implementations SEMEVAL 2017 Mrinmaya Sachan, Eric Xing

As a case study, we explore the task of learning to solve geometry problems using demonstrative solutions available in textbooks.

Question Answering
