Search Results for author: John Bohannon

Found 18 papers, 8 papers with code

How to Discern Important Urgent News?

no code implementations • 15 Feb 2024 • Oleg Vasilyev, John Bohannon

We found that a simple property of clusters in a clustered dataset of news correlates strongly with the importance and urgency of news (IUN) as assessed by an LLM.

Clustering

Linear Cross-Lingual Mapping of Sentence Embeddings

1 code implementation • 23 May 2023 • Oleg Vasilyev, Fumika Isono, John Bohannon

Semantics of a sentence is defined with much less ambiguity than semantics of a single word, and it should be better preserved by translation to another language.

Sentence, Sentence Embeddings, +1

Neural Embeddings for Text

1 code implementation • 17 Aug 2022 • Oleg Vasilyev, John Bohannon

We propose a new kind of embedding for natural language text that deeply represents semantic meaning.

Language Modelling, Sentence, +1

BabyBear: Cheap inference triage for expensive language models

1 code implementation • 24 May 2022 • Leila Khalili, Yao You, John Bohannon

For named entity recognition, we save 33% of the deep learning compute while maintaining an F1 score higher than 95% on the CoNLL benchmark.

Document Classification, Named Entity Recognition, +1

Named Entity Linking with Entity Representation by Multiple Embeddings

no code implementations • 21 May 2022 • Oleg Vasilyev, Alex Dauenhauer, Vedant Dharnidharka, John Bohannon

Our observations suggest that the minimal number of mentions required to create a knowledge base (KB) entity is very important for NEL performance.

Entity Linking, Language Modelling

Consistency and Coherence from Points of Contextual Similarity

no code implementations • 22 Dec 2021 • Oleg Vasilyev, John Bohannon

Factual consistency is one of the important summary evaluation dimensions, especially as summary generation becomes more fluent and coherent.

Namesakes: Ambiguously Named Entities from Wikipedia and News

no code implementations • 22 Nov 2021 • Oleg Vasilyev, Aysu Altun, Nidhi Vyas, Vedant Dharnidharka, Erika Lam, John Bohannon

We present Namesakes, a dataset of ambiguously named entities obtained from English-language Wikipedia and news articles.

Entity Linking

Does Summary Evaluation Survive Translation to Other Languages?

1 code implementation • NAACL 2022 • Spencer Braun, Oleg Vasilyev, Neslihan Iskender, John Bohannon

The creation of a quality summarization dataset is an expensive, time-consuming effort, requiring the production and evaluation of summaries by both trained humans and machines.

Machine Translation, Translation

Towards Human-Free Automatic Quality Evaluation of German Summarization

no code implementations • 13 May 2021 • Neslihan Iskender, Oleg Vasilyev, Tim Polzehl, John Bohannon, Sebastian Möller

Evaluating large summarization corpora using humans has proven to be expensive from both the organizational and the financial perspective.

Informativeness, Language Modelling

Estimation of Summary-to-Text Inconsistency by Mismatched Embeddings

no code implementations • 12 Apr 2021 • Oleg Vasilyev, John Bohannon

The proposed ESTIME, Estimator of Summary-to-Text Inconsistency by Mismatched Embeddings, correlates with expert scores in the summary-level SummEval dataset more strongly than other common evaluation measures, not only in Consistency but also in Fluency.

Is human scoring the best criteria for summary evaluation?

1 code implementation • Findings (ACL) 2021 • Oleg Vasilyev, John Bohannon

A higher correlation with human scores is considered to be a fair indicator of a better measure.

Primer AI's Systems for Acronym Identification and Disambiguation

1 code implementation • 14 Dec 2020 • Nicholas Egan, John Bohannon

The prevalence of ambiguous acronyms makes scientific documents harder to understand for humans and machines alike, presenting a need for models that can automatically identify acronyms in text and disambiguate their meaning.

document understanding, Sentence, +2

Sensitivity of BLANC to human-scored qualities of text summaries

1 code implementation • 13 Oct 2020 • Oleg Vasilyev, Vedant Dharnidharka, Nicholas Egan, Charlene Chambliss, John Bohannon

We explore the sensitivity of a document summary quality estimator, BLANC, to human assessment of qualities for the same summaries.

Zero-shot topic generation

no code implementations • 29 Apr 2020 • Oleg Vasilyev, Kathryn Evans, Anna Venancio-Marques, John Bohannon

We present an approach to generating topics using a model trained only for document title generation, with zero examples of topics given during training.
