no code implementations • LREC 2014 • Heeyoung Lee, Mihai Surdeanu, Bill MacCartney, Dan Jurafsky
We investigate the importance of text analysis for stock price prediction.
no code implementations • LREC 2014 • Kevin Reschke, Martin Jankowiak, Mihai Surdeanu, Christopher Manning, Daniel Jurafsky
We present a new publicly available dataset and event extraction task in the plane crash domain based on Wikipedia infoboxes and newswire text.
no code implementations • 8 Sep 2014 • Daniel Fried, Mihai Surdeanu, Stephen Kobourov, Melanie Hingle, Dane Bell
We investigate the predictive power behind the language of food on social media.
no code implementations • TACL 2015 • Daniel Fried, Peter Jansen, Gustave Hahn-Powell, Mihai Surdeanu, Peter Clark
We introduce a higher-order formalism that allows all these lexical semantic models to chain direct evidence to construct indirect associations between question and answer texts, by casting the task as the traversal of graphs that encode direct term associations.
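A minimal sketch of the chaining idea in this entry: treat direct term associations as graph edges and search for short paths that connect question terms to answer terms, so that a chain of direct evidence yields an indirect association. The toy graph and helper below are illustrative only, not the paper's lexical semantic models.

```python
from collections import deque

# Toy direct-association graph between terms; in the paper these edges
# would come from lexical semantic models. Illustrative only.
assoc = {
    "rain": {"cloud", "water"},
    "cloud": {"sky", "condensation"},
    "condensation": {"vapor"},
}

def association_path(graph, start, goal, max_hops=3):
    """Breadth-first search for a chain of direct associations linking
    two terms; a short chain is treated as indirect evidence."""
    queue, seen = deque([(start, [start])]), {start}
    while queue:
        term, path = queue.popleft()
        if term == goal:
            return path
        if len(path) > max_hops:
            continue
        for nbr in graph.get(term, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, path + [nbr]))
    return None

print(association_path(assoc, "rain", "vapor"))
# ['rain', 'cloud', 'condensation', 'vapor']
```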
1 code implementation • 24 Sep 2015 • Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Mihai Surdeanu
Here we include a thorough definition of the Odin rule language, together with a description of the Odin API in the Scala language, which allows one to apply these rules to arbitrary texts.
no code implementations • LREC 2016 • Dane Bell, Daniel Fried, Luwen Huangfu, Mihai Surdeanu, Stephen Kobourov
The strategy uses a game-like quiz with data and questions acquired semi-automatically from Twitter.
no code implementations • LREC 2016 • Dane Bell, Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
We describe challenges and advantages unique to coreference resolution in the biomedical domain, and a sieve-based architecture that leverages domain knowledge for both entity and event coreference resolution.
no code implementations • LREC 2016 • Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Mihai Surdeanu
Odin is an information extraction framework that applies cascades of finite state automata over both surface text and syntactic dependency graphs.
2 code implementations • WS 2016 • Gus Hahn-Powell, Dane Bell, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
Causal precedence between biochemical interactions is crucial in the biomedical domain, because it transforms collections of individual interactions, e.g., bindings and phosphorylations, into the causal mechanisms needed to inform meaningful search and inference.
no code implementations • WS 2016 • Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Dane Bell, Mihai Surdeanu
We propose an approach for biomedical information extraction that marries the advantages of machine learning models, e.g., learning directly from data, with the benefits of rule-based approaches, e.g., interpretability.
no code implementations • EMNLP 2016 • Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Peter Clark, Michael Hammond
We argue that a better approach is to look for answers that are related to the question in a relevant way, according to the information need of the question, which may be determined through task-specific embeddings.
no code implementations • COLING 2016 • Peter Jansen, Niranjan Balasubramanian, Mihai Surdeanu, Peter Clark
These explanations are used to create a fine-grained categorization of the requirements.
no code implementations • CL 2017 • Peter Jansen, Rebecca Sharp, Mihai Surdeanu, Peter Clark
Our best configuration answers 44% of the questions correctly, where the top justifications for 57% of these correct answers contain a compelling human-readable justification that explains the inference required to arrive at the correct answer.
no code implementations • CONLL 2017 • Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Marco A. Valenzuela-Escárcega, Peter Clark, Michael Hammond
We propose a neural network architecture for QA that reranks answer justifications as an intermediate (and human-interpretable) step in answer selection.
Ranked #1 on Question Answering on AI2 Kaggle Dataset
no code implementations • EMNLP 2017 • Enrique Noriega-Atala, Marco A. Valenzuela-Escárcega, Clayton T. Morrison, Mihai Surdeanu
In this work, we introduce a focused reading approach that guides the machine reading of biomedical literature towards the documents that should be read to answer a biomedical query as efficiently as possible.
1 code implementation • LREC 2018 • Angus G. Forbes, Kristine Lee, Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
Additionally, we include an approach to representing text annotations in which annotation subgraphs, or semantic summaries, are used to show relationships outside of the sequential context of the text itself.
no code implementations • WS 2019 • Marco A. Valenzuela-Escárcega, Ajay Nagesh, Mihai Surdeanu
We propose a lightly-supervised approach for information extraction, in particular named entity classification, which combines the benefits of traditional bootstrapping, i.e., use of limited annotations and interpretability of extraction patterns, with the robust learning approaches proposed in representation learning.
no code implementations • WS 2018 • Fan Luo, Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Mihai Surdeanu
We introduce a machine learning approach for the identification of "white spaces" in scientific knowledge.
no code implementations • NAACL 2018 • Ajay Nagesh, Mihai Surdeanu
We propose a novel approach to semi-supervised learning for information extraction that uses ladder networks (Rasmus et al., 2015).
no code implementations • 5 Jul 2018 • Vikas Yadav, Rebecca Sharp, Mihai Surdeanu
We also achieve 26.56% and 58.36% on the ARC Challenge and Easy datasets, respectively.
no code implementations • COLING 2018 • Ajay Nagesh, Mihai Surdeanu
Several semi-supervised representation learning methods have been proposed recently that mitigate the drawbacks of traditional bootstrapping: some reduce the amount of semantic drift introduced by iterative approaches through one-shot learning; others address the sparsity of data through the learning of custom, dense representations for the information modeled.
no code implementations • WS 2018 • Dane Bell, Egoitz Laparra, Aditya Kousik, Terron Ishihara, Mihai Surdeanu, Stephen Kobourov
This work explores the detection of individuals' risk of type 2 diabetes mellitus (T2DM) directly from their social media (Twitter) activity.
no code implementations • EMNLP 2018 • Matthew Berger, Ajay Nagesh, Joshua Levine, Mihai Surdeanu, Helen Zhang
We challenge a common assumption in active learning, that a list-based interface populated by informative samples provides for efficient and effective data annotation.
no code implementations • WS 2018 • Mithun Paul, Rebecca Sharp, Mihai Surdeanu
For example, such a system trained in the news domain may learn that a sentence like "Palestinians recognize Texas as part of Mexico" tends to be unsupported, but this fact (and its corresponding lexicalized cues) has no value in, say, a scientific domain.
1 code implementation • NAACL 2019 • Vikas Yadav, Steven Bethard, Mihai Surdeanu
We propose a simple, fast, and mostly-unsupervised approach for non-factoid question answering (QA) called Alignment over Heterogeneous Embeddings (AHE).
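A rough sketch of the alignment idea behind AHE, under the assumption that each question token aligns to its most similar answer token (cosine) and is weighted by IDF; the paper's exact scorer, and how it combines multiple embedding spaces, may differ.

```python
import numpy as np

def ahe_score(q_vecs, a_vecs, q_idf):
    """Illustrative alignment score: each question token aligns to its
    most similar answer token (cosine), weighted by the token's IDF."""
    def unit(m):
        return m / np.linalg.norm(m, axis=1, keepdims=True)
    sims = unit(q_vecs) @ unit(a_vecs).T          # (|Q|, |A|) cosine matrix
    return float(np.sum(q_idf * sims.max(axis=1)))

# Toy usage with random vectors standing in for GloVe/BERT embeddings.
rng = np.random.default_rng(0)
q, a = rng.normal(size=(4, 50)), rng.normal(size=(9, 50))
idf = np.array([1.2, 0.3, 2.0, 1.1])
print(ahe_score(q, a, idf))
```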
no code implementations • NAACL 2019 • George C. G. Barbosa, Zechy Wong, Gus Hahn-Powell, Dane Bell, Rebecca Sharp, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
Many of the most pressing current research problems (e.g., public health, food security, or climate change) require multi-disciplinary collaborations.
no code implementations • SEMEVAL 2019 • Pooja Lakshmi Narayan, Ajay Nagesh, Mihai Surdeanu
Our work aims to address this gap by exploring different noise strategies for the semi-supervised named entity classification task, including statistical methods such as adding Gaussian noise to input embeddings, and linguistically-inspired ones such as dropping words and replacing words with their synonyms.
no code implementations • SEMEVAL 2019 • Vikas Yadav, Egoitz Laparra, Ti-Tai Wang, Mihai Surdeanu, Steven Bethard
We present the Named Entity Recognition (NER) and disambiguation model used by the University of Arizona team (UArizona) for the SemEval 2019 task 12.
no code implementations • WS 2019 • Fan Luo, Ajay Nagesh, Rebecca Sharp, Mihai Surdeanu
Generating a large amount of training data for information extraction (IE) is either costly (if annotations are created manually), or runs the risk of introducing noisy instances (if distant supervision is used).
no code implementations • WS 2019 • Enrique Noriega-Atala, Zhengzhong Liang, John Bachman, Clayton Morrison, Mihai Surdeanu
An important task in the machine reading of biochemical events expressed in biomedical texts is correctly reading the polarity, i.e., attributing whether the biochemical event is a promotion or an inhibition.
1 code implementation • NAACL 2019 • Rebecca Sharp, Adarsh Pyarelal, Benjamin Gyori, Keith Alcock, Egoitz Laparra, Marco A. Valenzuela-Escárcega, Ajay Nagesh, Vikas Yadav, John Bachman, Zheng Tang, Heather Lent, Fan Luo, Mithun Paul, Steven Bethard, Kobus Barnard, Clayton Morrison, Mihai Surdeanu
Building causal models of complicated phenomena such as food insecurity is currently a slow and labor-intensive manual process.
no code implementations • IJCNLP 2019 • Sandeep Suntwal, Mithun Paul, Rebecca Sharp, Mihai Surdeanu
As expected, even though this method achieves high accuracy when evaluated in the same domain, the performance in the target domain is poor, marginally above chance. To mitigate this dependence on lexicalized information, we experiment with several strategies for masking out names by replacing them with their semantic category, coupled with a unique identifier to mark that the same or new entities are referenced between claim and evidence.
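A small sketch of the masking strategy described here: entity mentions are replaced with a semantic category plus an identifier that stays consistent across claim and evidence. The category labels and placeholder format below are assumptions, not the paper's exact scheme.

```python
def mask_entities(tokens, entities):
    """Replace named entities with CATEGORY-Ck placeholders so that the
    same entity receives the same identifier in claim and evidence.
    `entities` maps surface forms to NE categories (illustrative)."""
    ids, out = {}, []
    for tok in tokens:
        if tok in entities:
            key = (entities[tok], tok)
            if key not in ids:
                ids[key] = sum(1 for k in ids if k[0] == entities[tok]) + 1
            out.append(f"{entities[tok]}-C{ids[key]}")
        else:
            out.append(tok)
    return out

ner = {"Palestinians": "PERSON", "Texas": "LOCATION", "Mexico": "LOCATION"}
print(mask_entities("Palestinians recognize Texas as part of Mexico".split(), ner))
# ['PERSON-C1', 'recognize', 'LOCATION-C1', 'as', 'part', 'of', 'LOCATION-C2']
```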
no code implementations • WS 2019 • Hoang Van, Ahmad Musa, Hang Chen, Stephen Kobourov, Mihai Surdeanu
Second, we investigate the effect of socioeconomic factors (income, poverty, and education) on predicting state-level T2DM rates.
no code implementations • IJCNLP 2019 • Vikas Yadav, Steven Bethard, Mihai Surdeanu
We show that the sentences selected by our method improve the performance of a state-of-the-art supervised QA model on two multi-hop QA datasets: AI2's Reasoning Challenge (ARC) and Multi-Sentence Reading Comprehension (MultiRC).
no code implementations • LREC 2020 • Robert Vacareanu, George Caique Gouveia Barbosa, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
For example, for the sentence John eats cake, the tag to be predicted for the token cake is -1 because its head (eats) occurs one token to the left.
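A minimal sketch of this tagging scheme: the tag for each token is the signed offset from the token's position to its head's position. The treatment of the root token below is a guess.

```python
def relative_head_tags(heads):
    """Convert absolute head indices into relative-offset tags:
    tag = head position minus token position. The root convention
    ("ROOT" for a negative head index) is an assumption."""
    return ["ROOT" if h < 0 else h - i for i, h in enumerate(heads)]

# "John eats cake": eats (index 1) is the root; John and cake attach to it.
print(relative_head_tags([1, -1, 1]))  # [1, 'ROOT', -1]
```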
no code implementations • LREC 2020 • Mithun Paul Panenghat, Sandeep Suntwal, Faiz Rafique, Rebecca Sharp, Mihai Surdeanu
Modeling natural language inference is a challenging task.
1 code implementation • ACL 2020 • Vikas Yadav, Steven Bethard, Mihai Surdeanu
Evidence retrieval is a critical stage of question answering (QA), necessary not only to improve performance, but also to explain the decisions of the corresponding QA method.
no code implementations • ACL 2020 • Zheng Tang, Gus Hahn-Powell, Mihai Surdeanu
Our approach uses an encoder-decoder architecture, which jointly trains a classifier for event extraction, and a rule decoder that generates syntactico-semantic rules that explain the decisions of the event classifier.
no code implementations • 22 Sep 2020 • Zhengzhong Liang, Yiyun Zhao, Mihai Surdeanu
Evidence retrieval is a key component of explainable question answering (QA).
no code implementations • 15 Oct 2020 • Hoang Van, Ahmad Musa, Mihai Surdeanu, Stephen Kobourov
Specifically, we analyze over 770,000 tweets published during the lockdown and the equivalent period in the five previous years and highlight several worrying trends.
no code implementations • COLING 2020 • Robert Vacareanu, Marco A. Valenzuela-Escárcega, Rebecca Sharp, Mihai Surdeanu
This paper explores an unsupervised approach to learning a compositional representation function for multi-word expressions (MWEs), and evaluates it on the Tratz dataset, which associates two-word expressions with the semantic relation between the compound constituents (e.g., the label employer is associated with the noun compound government agency) (Tratz, 2011).
no code implementations • NAACL 2021 • Zhengzhong Liang, Steven Bethard, Mihai Surdeanu
Moreover, models trained on simpler tasks tend to fail when directly tested on more complex problems.
no code implementations • NAACL 2021 • Mitch Paul Mithun, Sandeep Suntwal, Mihai Surdeanu
While neural networks produce state-of-the-art performance in several NLP tasks, they generally depend heavily on lexicalized information, which transfers poorly between domains.
no code implementations • NAACL 2021 • Vikas Yadav, Steven Bethard, Mihai Surdeanu
We specifically emphasize the importance of retrieving evidence jointly by showing several comparative analyses to other methods that retrieve and rerank evidence sentences individually.
1 code implementation • 8 Jun 2021 • Hoang Van, Vikas Yadav, Mihai Surdeanu
We propose a simple and effective strategy for data augmentation for low-resource machine reading comprehension (MRC).
1 code implementation • Findings (EMNLP) 2021 • Hoang Van, Zheng Tang, Mihai Surdeanu
The general goal of text simplification (TS) is to reduce text complexity for human consumption.
no code implementations • 17 Dec 2021 • Enrique Noriega-Atala, Peter M. Lovett, Clayton T. Morrison, Mihai Surdeanu
We introduce a family of deep-learning architectures for inter-sentence relation extraction, i.e., relations where the participants are not necessarily in the same sentence.
2 code implementations • LREC 2022 • Roya Kabiri, Simin Karimi, Mihai Surdeanu
We then investigate the parsing of informal Persian by training two dependency parsers on existing formal treebanks and evaluating them on out-of-domain data, i.e., the development set of our informal treebank.
no code implementations • LREC 2022 • Andrew Zupon, Andrew Carnie, Michael Hammond, Mihai Surdeanu
Annotation inconsistencies between data sets can cause problems for low-resource NLP, where noisy or inconsistent data cannot be as easily replaced compared with resource-rich languages.
1 code implementation • LREC 2022 • Robert Vacareanu, Marco A. Valenzuela-Escárcega, George C. G. Barbosa, Rebecca Sharp, Mihai Surdeanu
While deep learning approaches to information extraction have had many successes, they can be difficult to augment or maintain as needs shift.
1 code implementation • 25 Apr 2022 • Zheng Tang, Mihai Surdeanu
Our approach uses a multi-task learning architecture, which jointly trains a classifier for relation extraction, and a sequence model that labels words in the context of the relation that explain the decisions of the relation classifier.
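A hedged PyTorch sketch of the two-headed idea described here: a shared encoder feeds both a relation classifier and a per-token tagger that marks the words explaining the decision. The architecture sizes and loss weighting are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class JointREExplainer(nn.Module):
    """Shared encoder with (a) a relation classifier over a pooled
    summary and (b) a per-token tagger for explanation words."""
    def __init__(self, hidden=128, n_rel=10, n_tag=2):
        super().__init__()
        self.encoder = nn.GRU(100, hidden, batch_first=True, bidirectional=True)
        self.rel_head = nn.Linear(2 * hidden, n_rel)
        self.tag_head = nn.Linear(2 * hidden, n_tag)

    def forward(self, x):
        states, _ = self.encoder(x)               # (B, T, 2H)
        rel_logits = self.rel_head(states.mean(dim=1))
        tag_logits = self.tag_head(states)        # per-token explanation labels
        return rel_logits, tag_logits

model = JointREExplainer()
x = torch.randn(2, 16, 100)                       # toy batch of token embeddings
rel_logits, tag_logits = model(x)
rel_loss = nn.functional.cross_entropy(rel_logits, torch.tensor([3, 7]))
tag_loss = nn.functional.cross_entropy(tag_logits.reshape(-1, 2),
                                       torch.randint(2, (32,)))
loss = rel_loss + 0.5 * tag_loss                  # weighted joint objective
```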
no code implementations • 7 May 2022 • Zhengzhong Liang, Tushar Khot, Steven Bethard, Mihai Surdeanu, Ashish Sabharwal
Considerable progress has been made recently in open-domain question answering (QA) problems, which require Information Retrieval (IR) and Reading Comprehension (RC).
2 code implementations • ACL ARR November 2021 • Mohaddeseh Bastan, Nishant Shankar, Mihai Surdeanu, Niranjan Balasubramanian
We leverage this structure and create a summarization task, where the input is a collection of sentences and the main entities in an abstract, and the output includes the relationship and a sentence that summarizes the mechanism.
no code implementations • NAACL (SUKI) 2022 • Enrique Noriega-Atala, Mihai Surdeanu, Clayton T. Morrison
We propose a method to teach an automated agent to learn how to search for multi-hop paths of relations between entities in an open domain.
no code implementations • 25 Aug 2022 • Shahriar Golchin, Mihai Surdeanu, Nazgol Tavabi, Ata Kiapour
We construct these compact subsets from the unstructured data using a combination of abstractive summaries and extractive keywords.
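A sketch of one way such a compact instance could be assembled, pairing an abstractive summary with top TF-IDF keywords; the summarizer, the keyword method, and the mixing recipe are assumptions rather than the paper's exact pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def compact_instance(doc, summarize, top_k=5):
    """Concatenate an abstractive summary with the document's top
    TF-IDF keywords; `summarize` is any abstractive summarizer."""
    tfidf = TfidfVectorizer(stop_words="english").fit([doc])
    scores = tfidf.transform([doc]).toarray()[0]
    vocab = tfidf.get_feature_names_out()
    keywords = [vocab[i] for i in scores.argsort()[::-1][:top_k]]
    return summarize(doc) + " " + " ".join(keywords)

# Toy usage with a stand-in summarizer (first sentence of the document).
doc = "Knee injuries are common in young athletes. Surgical repair of the ACL is often required. Recovery takes months."
print(compact_instance(doc, lambda d: d.split(". ")[0] + "."))
```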
1 code implementation • 26 Oct 2022 • Mohaddeseh Bastan, Mihai Surdeanu, Niranjan Balasubramanian
We introduce a novel semi-supervised procedure that bootstraps an NLI dataset from an existing biomedical dataset that pairs mechanisms with experimental evidence in abstracts.
Ranked #1 on Natural Language Inference on BioNLI
1 code implementation • 30 Oct 2022 • Alice Saebom Kwak, Jacob O. Israelsen, Clayton T. Morrison, Derek E. Bambauer, Mihai Surdeanu
This work introduces a natural language inference (NLI) dataset that focuses on the validity of statements in legal wills.
1 code implementation • 28 Apr 2023 • Zhengzhong Liang, Zeyu Zhang, Steven Bethard, Mihai Surdeanu
Language models have been successfully applied to a variety of reasoning tasks in NLP, yet they still struggle with compositional generalization.
no code implementations • 6 Jul 2023 • Enfa George, Mihai Surdeanu
Such a dataset is necessary to address the challenge of distinguishing between sexually suggestive content and virtual sex education videos on TikTok.
1 code implementation • 11 Jul 2023 • Sushma Anand Akoju, Robert Vacareanu, Haris Riaz, Eduardo Blanco, Mihai Surdeanu
To this end, we modify the original texts using a set of phrases: modifiers that correspond to universal quantifiers, existential quantifiers, negation, and other concept modifiers in Natural Logic (NL) (MacCartney, 2009).
no code implementations • Proceedings of the 8th Workshop on Representation Learning for NLP 2023 • Mahdi Rahimi, Mihai Surdeanu
While fully supervised relation classification (RC) models perform well on large-scale datasets, their performance drops drastically in low-resource settings.
no code implementations • 14 Jul 2023 • Shahriar Golchin, Mihai Surdeanu, Nazgol Tavabi, Ata Kiapour
We propose a novel task-agnostic in-domain pre-training method that sits between generic pre-training and fine-tuning.
1 code implementation • 16 Aug 2023 • Shahriar Golchin, Mihai Surdeanu
To estimate contamination of individual instances, we employ "guided instruction": a prompt consisting of the dataset name, partition type, and a random-length initial segment of a reference instance, asking the LLM to complete it.
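A sketch of how such a guided-instruction prompt could be assembled; the wording below is paraphrased, not the exact prompt from the paper.

```python
def guided_instruction(dataset, split, first_piece):
    """Build a guided-instruction prompt: name the dataset and partition,
    give an initial segment of an instance, and ask for its completion."""
    return (
        f"You are given the first part of an instance from the {split} "
        f"split of the {dataset} dataset. Complete it exactly as it "
        f"appears in the dataset:\n\n{first_piece}"
    )

prompt = guided_instruction("AG News", "test",
                            "Wall St. Bears Claw Back Into the")
# A near-verbatim completion suggests the instance was seen in training.
```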
no code implementations • 4 Nov 2023 • Fan Luo, Mihai Surdeanu
Building a question answering (QA) model with lower annotation costs can be achieved by utilizing an active learning (AL) training strategy.
no code implementations • 5 Nov 2023 • Fan Luo, Mihai Surdeanu
However, semantic equivalence is not the only relevance signal that needs to be considered when retrieving evidence for multi-hop questions.
1 code implementation • 10 Nov 2023 • Shahriar Golchin, Mihai Surdeanu
We propose the Data Contamination Quiz (DCQ), a simple and effective approach to detect data contamination in large language models (LLMs) and estimate the amount of it.
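A sketch of constructing one quiz item in the spirit of the DCQ: the verbatim instance is shuffled among paraphrases and the model is asked to pick the one it has seen. The option count and phrasing are illustrative.

```python
import random

def contamination_quiz(original, paraphrases):
    """Build a multiple-choice item: the verbatim instance hidden among
    paraphrases; picking it suggests the model memorized the data."""
    options = paraphrases + [original]
    random.shuffle(options)
    lines = [f"({chr(65 + i)}) {o}" for i, o in enumerate(options)]
    question = ("Which of the following did you see during training? "
                "Answer with a single letter.\n" + "\n".join(lines))
    answer = chr(65 + options.index(original))
    return question, answer

q, a = contamination_quiz(
    "The quick brown fox jumps over the lazy dog.",
    ["A fast brown fox leaps over a sleepy dog.",
     "The speedy fox hops over the idle dog."])
print(q, "\nExpected answer if contaminated:", a)
```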
1 code implementation • 4 Feb 2024 • Razvan-Gabriel Dumitru, Darius Peteleaza, Mihai Surdeanu
Further, the additional parameters necessary for the multiple temporal perspectives are fine-tuned with minimal computational overhead, avoiding the need for a full pre-training.
no code implementations • 5 Mar 2024 • Robert Vacareanu, Fahmida Alam, Md Asiful Islam, Haris Riaz, Mihai Surdeanu
Human interventions to the rules for the TACRED relation org:parents boost the performance on that relation by as much as 26% relative improvement, without negatively impacting the other relations, and without retraining the semantic matching component.
1 code implementation • 26 Mar 2024 • Haris Riaz, Razvan-Gabriel Dumitru, Mihai Surdeanu
In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data.
no code implementations • 5 Apr 2024 • Fahmida Alam, Md Asiful Islam, Robert Vacareanu, Mihai Surdeanu
We introduce a meta dataset for few-shot relation extraction, which includes two datasets derived from existing supervised relation extraction datasets NYT29 (Takanobu et al., 2019; Nayak and Ng, 2020) and WIKIDATA (Sorokin and Gurevych, 2017) as well as a few-shot form of the TACRED dataset (Sabo et al., 2021).
2 code implementations • 11 Apr 2024 • Robert Vacareanu, Vlad-Andrei Negru, Vasile Suciu, Mihai Surdeanu
We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates.
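A minimal sketch of how such in-context regression examples could be formatted; the template is a guess at the general setup, not the paper's prompt.

```python
def regression_prompt(examples, x_new):
    """Format (x, y) pairs as in-context examples for an LLM regressor,
    ending with a new input whose output the model must continue."""
    lines = [f"Input: {x:.2f}\nOutput: {y:.2f}" for x, y in examples]
    lines.append(f"Input: {x_new:.2f}\nOutput:")
    return "\n".join(lines)

train = [(1.0, 3.1), (2.0, 5.0), (3.0, 7.2)]  # roughly y = 2x + 1
print(regression_prompt(train, 4.0))
# The model is expected to continue with a number near 9.0.
```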
no code implementations • LREC 2022 • Fan Luo, Mihai Surdeanu
Through an evaluation on HotpotQA, a popular dataset for multi-hop QA, we show that our method yields: (a) improved evidence retrieval; (b) improved QA performance when using the retrieved sentences; and (c) effective and faithful explanations when answers are provided.
no code implementations • LREC 2022 • Mahdi Rahimi, Mihai Surdeanu
With their Discovery of Inference Rules from Text (DIRT) algorithm, Lin and Pantel (2001) made a seminal contribution to the field of rule acquisition from text, by adapting the distributional hypothesis of Harris (1954) to rules that model binary relations such as X treat Y. DIRT’s relevance is renewed in today’s neural era given the recent focus on interpretability in the field of natural language processing.
no code implementations • BioNLP (ACL) 2022 • Zhengzhong Liang, Enrique Noriega-Atala, Clayton Morrison, Mihai Surdeanu
Recognizing causal precedence relations among the chemical interactions in biomedical literature is crucial to understanding the underlying biological mechanisms.
no code implementations • EMNLP 2021 • Mitch Paul Mithun, Sandeep Suntwal, Mihai Surdeanu
While neural networks produce state-of-the-art performance in several NLP tasks, they generally depend heavily on lexicalized information, which transfers poorly between domains.
no code implementations • NAACL (HCINLP) 2022 • Mihai Surdeanu, John Hungerford, Yee Seng Chan, Jessica MacBride, Benjamin Gyori, Andrew Zupon, Zheng Tang, Haoling Qiu, Bonan Min, Yan Zverev, Caitlin Hilverman, Max Thomas, Walter Andrews, Keith Alcock, Zeyu Zhang, Michael Reynolds, Steven Bethard, Rebecca Sharp, Egoitz Laparra
An existing domain taxonomy for normalizing content is often assumed when discussing approaches to information extraction, yet often in real-world scenarios there is none. When one does exist, as the information needs shift, it must be continually extended.
no code implementations • insights (ACL) 2022 • Maria Alexeeva, Allegra A. Beal, Mihai Surdeanu
In this paper, we introduce and justify a new task—causal link extraction based on beliefs—and do a qualitative analysis of the ability of a large language model—InstructGPT-3—to generate implicit consequences of beliefs.
no code implementations • NAACL (TrustNLP) 2021 • Zheng Tang, Mihai Surdeanu
We introduce a method that transforms a rule-based relation extraction (RE) classifier into a neural one such that both interpretability and performance are achieved.
no code implementations • NAACL (ACL) 2022 • Robert Vacareanu, George C.G. Barbosa, Enrique Noriega-Atala, Gus Hahn-Powell, Rebecca Sharp, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
We propose a system that assists a user in constructing transparent information extraction models, consisting of patterns (or rules) written in a declarative language, through program synthesis. Users of our system can specify their requirements through the use of examples, which are collected with a search interface. The rule-synthesis system proposes rule candidates and the results of applying them on a textual corpus; the user has the option to accept the candidate, request another option, or adjust the examples provided to the system. Through an interactive evaluation, we show that our approach generates high-precision rules even in a 1-shot setting.
no code implementations • PANDL (COLING) 2022 • Robert Vacareanu, Dane Bell, Mihai Surdeanu
In this paper we revisit the direction of using lexico-syntactic patterns for relation extraction instead of today’s ubiquitous neural classifiers.
no code implementations • EMNLP (insights) 2020 • Zhengzhong Liang, Mihai Surdeanu
Large pretrained language models (LMs) have been used successfully for multi-hop question answering.
no code implementations • EMNLP (insights) 2020 • Andrew Zupon, Faiz Rafique, Mihai Surdeanu
Neural networks are a common tool in NLP, but it is not always clear which architecture to use for a given task.