Search Results for author: Steven Bethard

Found 93 papers, 15 papers with code

UA-KO at SemEval-2022 Task 11: Data Augmentation and Ensembles for Korean Named Entity Recognition

no code implementations SemEval (NAACL) 2022 Hyunju Song, Steven Bethard

This paper presents the approaches and systems of the UA-KO team for the Korean portion of SemEval-2022 Task 11 on Multilingual Complex Named Entity Recognition. We fine-tuned Korean and multilingual BERT and RoBERTa models and conducted experiments on data augmentation, ensembles, and task-adaptive pretraining.

Data Augmentation named-entity-recognition +2

Simplifying annotation of intersections in time normalization annotation: exploring syntactic and semantic validation

no code implementations EMNLP (LAW, DMR) 2021 Peiwen Su, Steven Bethard

While annotating normalized times in food security documents, we found that the semantically compositional annotation for time normalization (SCATE) scheme required several near-duplicate annotations to get the correct semantics for expressions like Nov. 7th to 11th 2021.

Translation

Triplet-Trained Vector Space and Sieve-Based Search Improve Biomedical Concept Normalization

1 code implementation NAACL (BioNLP) 2021 Dongfang Xu, Steven Bethard

We propose a vector-space model for concept normalization, where mentions and concepts are encoded via transformer networks that are trained via a triplet objective with online hard triplet mining.

EntityBERT: Entity-centric Masking Strategy for Model Pretraining for the Clinical Domain

no code implementations NAACL (BioNLP) 2021 Chen Lin, Timothy Miller, Dmitriy Dligach, Steven Bethard, Guergana Savova

We propose a methodology to produce a model focused on the clinical domain: continued pretraining of a model with a broad representation of biomedical terminology (PubMedBERT) on a clinical corpus along with a novel entity-centric masking strategy to infuse domain knowledge in the learning process.

Negation Negation Detection +2

A Comparison of Strategies for Source-Free Domain Adaptation

1 code implementation ACL 2022 Xin Su, Yiyun Zhao, Steven Bethard

Data sharing restrictions are common in NLP, especially in the clinical domain, but there is limited research on adapting models to new domains without access to the original training data, a setting known as source-free domain adaptation.

Active Learning Data Augmentation +1

Do pretrained transformers infer telicity like humans?

no code implementations CoNLL (EMNLP) 2021 Yiyun Zhao, Jian Gang Ngui, Lucy Hall Hartley, Steven Bethard

Pretrained transformer-based language models achieve state-of-the-art performance in many NLP tasks, but it is an open question whether the knowledge acquired by the models during pretraining resembles the linguistic knowledge of humans.

Open-Ended Question Answering

Exploring Text Representations for Generative Temporal Relation Extraction

no code implementations NAACL (ClinicalNLP) 2022 Dmitriy Dligach, Steven Bethard, Timothy Miller, Guergana Savova

Sequence-to-sequence models are appealing because they allow both encoder and decoder to be shared across many tasks by formulating those tasks as text-to-text problems.

Relation Temporal Relation Extraction

Ensemble-based Fine-Tuning Strategy for Temporal Relation Extraction from the Clinical Narrative

no code implementations NAACL (ClinicalNLP) 2022 Lijing Wang, Timothy Miller, Steven Bethard, Guergana Savova

In this paper, we investigate ensemble methods for fine-tuning transformer-based pretrained models for clinical natural language processing tasks, specifically temporal relation extraction from the clinical narrative.

Relation Temporal Relation Extraction

Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction

no code implementations NAACL (HCINLP) 2022 Mihai Surdeanu, John Hungerford, Yee Seng Chan, Jessica MacBride, Benjamin Gyori, Andrew Zupon, Zheng Tang, Haoling Qiu, Bonan Min, Yan Zverev, Caitlin Hilverman, Max Thomas, Walter Andrews, Keith Alcock, Zeyu Zhang, Michael Reynolds, Steven Bethard, Rebecca Sharp, Egoitz Laparra

An existing domain taxonomy for normalizing content is often assumed when discussing approaches to information extraction, yet often in real-world scenarios there is none. When one does exist, as the information needs shift, it must be continually extended.

Text Summarization

Domain adaptation in practice: Lessons from a real-world information extraction pipeline

no code implementations EACL (AdaptNLP) 2021 Timothy Miller, Egoitz Laparra, Steven Bethard

Advances in transfer learning and domain adaptation have raised hopes that once-challenging NLP tasks are ready to be put to use for sophisticated information extraction needs.

Domain Adaptation Link Prediction +5

Detection of Puffery on the English Wikipedia

1 code implementation WNUT (ACL) 2021 Amanda Bertsch, Steven Bethard

On Wikipedia, an online crowdsourced encyclopedia, volunteers enforce the encyclopedia’s editorial policies.

Bias Detection Information Retrieval +1

Semi-Structured Chain-of-Thought: Integrating Multiple Sources of Knowledge for Improved Language Model Reasoning

no code implementations 14 Nov 2023 Xin Su, Tiep Le, Steven Bethard, Phillip Howard

An important open question in the use of large language models for knowledge-intensive tasks is how to effectively integrate knowledge from three sources: the model's parametric memory, external structured knowledge, and external unstructured knowledge.

Knowledge Graphs Language Modelling +2

Fusing Temporal Graphs into Transformers for Time-Sensitive Question Answering

no code implementations 30 Oct 2023 Xin Su, Phillip Howard, Nagib Hakim, Steven Bethard

Answering time-sensitive questions from long documents requires temporal reasoning over the times in questions and documents.

Question Answering Temporal Information Extraction

Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution

1 code implementation 18 May 2023 Zeyu Zhang, Steven Bethard

Geocoding is the task of converting location mentions in text into structured data that encodes the geospatial semantics.

Information Retrieval Retrieval +1

Explainable Verbal Reasoner Plus (EVR+): A Natural Language Reasoning Framework that Supports Diverse Compositional Reasoning

1 code implementation 28 Apr 2023 Zhengzhong Liang, Zeyu Zhang, Steven Bethard, Mihai Surdeanu

Language models have been successfully applied to a variety of reasoning tasks in NLP, yet they still struggle with compositional generalization.

Language Modelling

We need to talk about random seeds

no code implementations 24 Oct 2022 Steven Bethard

This opinion piece argues that there are some safe uses for random seeds: as part of the hyperparameter search to select a good model, creating an ensemble of several models, or measuring the sensitivity of the training algorithm to the random seed hyperparameter.

Better Retrieval May Not Lead to Better Question Answering

no code implementations 7 May 2022 Zhengzhong Liang, Tushar Khot, Steven Bethard, Mihai Surdeanu, Ashish Sabharwal

Considerable progress has been made recently in open-domain question answering (QA) problems, which require Information Retrieval (IR) and Reading Comprehension (RC).

Information Retrieval Open-Domain Question Answering +3

TEAM-Atreides at SemEval-2022 Task 11: On leveraging data augmentation and ensemble to recognize complex Named Entities in Bangla

no code implementations SemEval (NAACL) 2022 Nazia Tasnim, Md. Istiak Hossain Shihab, Asif Shahriyar Sushmit, Steven Bethard, Farig Sadeque

Many areas, such as the biological and healthcare domain, artistic works, and organization names, have nested, overlapping, discontinuous entity mentions that may even be syntactically or semantically ambiguous in practice.

Data Augmentation

The University of Arizona at SemEval-2021 Task 10: Applying Self-training, Active Learning and Data Augmentation to Source-free Domain Adaptation

no code implementations SEMEVAL 2021 Xin Su, Yiyun Zhao, Steven Bethard

This paper describes our systems for negation detection and time expression recognition in SemEval 2021 Task 10, Source-Free Domain Adaptation for Semantic Processing.

Active Learning Data Augmentation +3

If You Want to Go Far Go Together: Unsupervised Joint Candidate Evidence Retrieval for Multi-hop Question Answering

no code implementations NAACL 2021 Vikas Yadav, Steven Bethard, Mihai Surdeanu

We specifically emphasize the importance of retrieving evidence jointly by showing several comparative analyses against other methods that retrieve and rerank evidence sentences individually.

Answer Selection Multi-hop Question Answering +1

A Dataset and Evaluation Framework for Complex Geographical Description Parsing

2 code implementations COLING 2020 Egoitz Laparra, Steven Bethard

But creating a dataset for this complex geoparsing task is difficult and, if done manually, would require a huge amount of effort to annotate the geographical shapes of not only the geolocation described but also the reference toponyms.

How does BERT's attention change when you fine-tune? An analysis methodology and a case study in negation scope

no code implementations ACL 2020 Yiyun Zhao, Steven Bethard

We apply this methodology to test BERT and RoBERTa on a hypothesis that some attention heads will consistently attend from a word in negation scope to the negation cue.

Negation

A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization

no code implementations ACL 2020 Dongfang Xu, Zeyu Zhang, Steven Bethard

Concept normalization, the task of linking textual mentions of concepts to concepts in an ontology, is challenging because ontologies are large.

Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering

1 code implementation ACL 2020 Vikas Yadav, Steven Bethard, Mihai Surdeanu

Evidence retrieval is a critical stage of question answering (QA), necessary not only to improve performance, but also to explain the decisions of the corresponding QA method.

Evidence Selection Multi-hop Question Answering +2

Quick and (not so) Dirty: Unsupervised Selection of Justification Sentences for Multi-hop Question Answering

no code implementations IJCNLP 2019 Vikas Yadav, Steven Bethard, Mihai Surdeanu

We show that the sentences selected by our method improve the performance of a state-of-the-art supervised QA model on two multi-hop QA datasets: AI2's Reasoning Challenge (ARC) and Multi-Sentence Reading Comprehension (MultiRC).

Information Retrieval Multi-hop Question Answering +4

A Survey on Recent Advances in Named Entity Recognition from Deep Learning models

1 code implementation COLING 2018 Vikas Yadav, Steven Bethard

Named Entity Recognition (NER) is a key component in NLP systems for question answering, information retrieval, relation extraction, etc.

Feature Engineering Information Retrieval +6

Predicting engagement in online social networks: Challenges and opportunities

no code implementations 11 Jul 2019 Farig Sadeque, Steven Bethard

We classified these works based on our task definitions, and explored the machine learning models that have been used for any kind of participation prediction.

BIG-bench Machine Learning Domain Adaptation +1

Alignment over Heterogeneous Embeddings for Question Answering

1 code implementation NAACL 2019 Vikas Yadav, Steven Bethard, Mihai Surdeanu

We propose a simple, fast, and mostly-unsupervised approach for non-factoid question answering (QA) called Alignment over Heterogeneous Embeddings (AHE).

Question Answering Sentence +1

Incivility Detection in Online Comments

no code implementations SEMEVAL 2019 Farig Sadeque, Stephen Rains, Yotam Shmargad, Kate Kenski, Kevin Coe, Steven Bethard

Incivility in public discourse has been a major concern in recent times as it can affect the quality and tenacity of the discourse negatively.

regression

Inferring missing metadata from environmental policy texts

no code implementations WS 2019 Steven Bethard, Egoitz Laparra, Sophia Wang, Yiyun Zhao, Ragheb Al-Ghezi, Aaron Lien, Laura López-Hoffman

The National Environmental Policy Act (NEPA) provides a trove of data on how environmental policy decisions have been made in the United States over the last 50 years.

Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: a Detailed Analysis

no code implementations SEMEVAL 2019 Dongfang Xu, Egoitz Laparra, Steven Bethard

Recent studies have shown that pre-trained contextual word embeddings, which assign the same word different vectors in different contexts, improve performance in many tasks.

Word Embeddings

Deep Affix Features Improve Neural Named Entity Recognizers

1 code implementation SEMEVAL 2018 Vikas Yadav, Rebecca Sharp, Steven Bethard

We propose a practical model for named entity recognition (NER) that combines word and character-level information with a specific learned representation of the prefixes and suffixes of the word.

Feature Engineering Morphological Analysis +3

SemEval-2017 Task 12: Clinical TempEval

no code implementations SEMEVAL 2017 Steven Bethard, Guergana Savova, Martha Palmer, James Pustejovsky

Clinical TempEval 2017 aimed to answer the question: how well do systems trained on annotated timelines for one medical condition (colon cancer) perform in predicting timelines on another medical condition (brain cancer)?

Domain Adaptation Temporal Information Extraction

Improving Implicit Semantic Role Labeling by Predicting Semantic Frame Arguments

no code implementations IJCNLP 2017 Quynh Ngoc Thi Do, Steven Bethard, Marie-Francine Moens

Implicit semantic role labeling (iSRL) is the task of predicting the semantic roles of a predicate that do not appear as explicit arguments, but rather regard common sense knowledge or are mentioned earlier in the discourse.

Common Sense Reasoning Semantic Role Labeling

Neural Temporal Relation Extraction

no code implementations EACL 2017 Dmitriy Dligach, Timothy Miller, Chen Lin, Steven Bethard, Guergana Savova

We experiment with neural architectures for temporal relation extraction and establish a new state-of-the-art for several scenarios.

Position Relation +3

Facing the most difficult case of Semantic Role Labeling: A collaboration of word embeddings and co-training

no code implementations COLING 2016 Quynh Ngoc Thi Do, Steven Bethard, Marie-Francine Moens

We present a successful collaboration of word embeddings and co-training to tackle the most difficult test case of semantic role labeling: predicting out-of-domain and unseen semantic frames.

Semantic Role Labeling Word Embeddings

A Semantically Compositional Annotation Scheme for Time Normalization

1 code implementation LREC 2016 Steven Bethard, Jonathan Parker

We present a new annotation scheme for normalizing time expressions, such as “three days ago”, to computer-readable forms, such as 2016-03-07.

Semantic Composition

ClearTK 2.0: Design Patterns for Machine Learning in UIMA

no code implementations LREC 2014 Steven Bethard, Philip Ogren, Lee Becker

ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models.

BIG-bench Machine Learning Chunking

Clinical TempEval

no code implementations 19 Mar 2014 Steven Bethard, Leon Derczynski, James Pustejovsky, Marc Verhagen

We describe the Clinical TempEval task which is currently in preparation for the SemEval-2015 evaluation exercise.

Relation

Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence

no code implementations TACL 2014 Md. Arafat Sultan, Steven Bethard, Tamara Sumner

We present a simple, easy-to-replicate monolingual aligner that demonstrates state-of-the-art performance while relying on almost no supervision and a very small number of external resources.

Natural Language Inference Question Answering +2

Annotating Story Timelines as Temporal Dependency Structures

no code implementations LREC 2012 Steven Bethard, Oleksandr Kolomiyets, Marie-Francine Moens

We present an approach to annotating timelines in stories where events are linked together by temporal relations into a temporal dependency tree.

Reading Comprehension Relation
