no code implementations • 15 Feb 2024 • Oleg Vasilyev, John Bohannon
We found that a simple property of clusters in a clustered dataset of news correlates strongly with the importance and urgency of news (IUN) as assessed by an LLM.
1 code implementation • 23 May 2023 • Oleg Vasilyev, Fumika Isono, John Bohannon
The semantics of a sentence is defined with much less ambiguity than the semantics of a single word, and it should be better preserved by translation to another language.
1 code implementation • 17 Aug 2022 • Oleg Vasilyev, John Bohannon
We propose a new kind of embedding for natural language text that deeply represents semantic meaning.
1 code implementation • 24 May 2022 • Leila Khalili, Yao You, John Bohannon
For named entity recognition, we save 33% of the deep learning compute while maintaining an F1 score higher than 95% on the CoNLL benchmark.
no code implementations • 21 May 2022 • Oleg Vasilyev, Alex Dauenhauer, Vedant Dharnidharka, John Bohannon
Our observations suggest that the minimal number of mentions required to create a knowledge base (KB) entity is very important for NEL performance.
no code implementations • 22 Dec 2021 • Oleg Vasilyev, John Bohannon
Factual consistency is one of the important summary evaluation dimensions, especially as summary generation becomes more fluent and coherent.
no code implementations • 22 Nov 2021 • Oleg Vasilyev, Aysu Altun, Nidhi Vyas, Vedant Dharnidharka, Erika Lam, John Bohannon
We present Namesakes, a dataset of ambiguously named entities obtained from English-language Wikipedia and news articles.
1 code implementation • NAACL 2022 • Spencer Braun, Oleg Vasilyev, Neslihan Iskender, John Bohannon
The creation of a quality summarization dataset is an expensive, time-consuming effort, requiring the production and evaluation of summaries by both trained humans and machines.
no code implementations • 13 May 2021 • Neslihan Iskender, Oleg Vasilyev, Tim Polzehl, John Bohannon, Sebastian Möller
Evaluating large summarization corpora using humans has proven to be expensive from both the organizational and the financial perspective.
no code implementations • 12 Apr 2021 • Oleg Vasilyev, John Bohannon
The proposed ESTIME, Estimator of Summary-to-Text Inconsistency by Mismatched Embeddings, correlates with expert scores in the summary-level SummEval dataset more strongly than other common evaluation measures, not only in Consistency but also in Fluency.
no code implementations • 19 Mar 2021 • Nicholas Egan, Oleg Vasilyev, John Bohannon
The goal of a summary is to concisely state the most important information in a document.
1 code implementation • Findings (ACL) 2021 • Oleg Vasilyev, John Bohannon
A higher correlation with human scores is considered to be a fair indicator of a better measure.
1 code implementation • 14 Dec 2020 • Nicholas Egan, John Bohannon
The prevalence of ambiguous acronyms makes scientific documents harder to understand for humans and machines alike, presenting a need for models that can automatically identify acronyms in text and disambiguate their meaning.
1 code implementation • 13 Oct 2020 • Oleg Vasilyev, Vedant Dharnidharka, Nicholas Egan, Charlene Chambliss, John Bohannon
We explore the sensitivity of a document summary quality estimator, BLANC, to human assessment of qualities for the same summaries.
no code implementations • 29 Apr 2020 • Oleg Vasilyev, Kathryn Evans, Anna Venancio-Marques, John Bohannon
We present an approach to generating topics using a model trained only for document title generation, with zero examples of topics given during training.
1 code implementation • EMNLP (Eval4NLP) 2020 • Oleg Vasilyev, Vedant Dharnidharka, John Bohannon
We present BLANC, a new approach to the automatic estimation of document summary quality.
no code implementations • 17 Apr 2019 • Oleg Vasilyev, Tom Grek, John Bohannon
We propose a novel method for generating titles for unstructured text documents.