With recent improvements in natural language generation (NLG) models for various applications, it has become imperative to have the means to identify and evaluate whether NLG output shares only verifiable information about the external world.
Compared to standard retrieval tasks, passage retrieval for conversational question answering (CQA) poses new challenges in understanding the current user question, as each question needs to be interpreted within the dialogue context.
At training time, additional inputs based on these evaluation measures are given to the dialogue model.
To facilitate evaluation of such metrics, we introduce the Benchmark for Evaluation of Grounded INteraction (BEGIN).
Unconscious biases continue to be prevalent in modern text and media, calling for algorithms that can assist writers with bias correction.
We propose the task of outline-conditioned story generation: given an outline as a set of phrases that describe key characters and events to appear in a story, the task is to generate a coherent narrative that is consistent with the provided outline.
We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations.
Abductive reasoning is inference to the most plausible explanation.
We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017).
We find that the best current discriminators can distinguish neural fake news from real, human-written news with 73% accuracy, assuming access to a moderate amount of training data.
One challenge for dialogue agents is recognizing feelings in the conversation partner and replying accordingly, a key communicative skill.
We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge.
We investigate a new commonsense inference task: given an event described in a short free-form text ("X drinks coffee in the morning"), a system reasons about the likely intents ("X wants to stay awake") and reactions ("X feels alert") of the event's participants.
Understanding a narrative requires reading between the lines and reasoning about the unspoken but obvious implications of events and of people's mental states - a capability that is trivial for humans but remarkably hard for machines.
We present an analytic study on the language of news media in the context of political fact-checking and fake news detection.
People around the globe respond to major real world events through social media.
Through a particular choice of a predicate (e.g., "x violated y"), a writer can subtly connote a range of implied sentiments and presupposed facts about the entities x and y: (1) writer's perspective: projecting x as an "antagonist" and y as a "victim", (2) entities' perspective: y probably dislikes x, (3) effect: something bad happened to y, (4) value: y is something valuable, and (5) mental state: y is distressed by the event.