Large language models (LLMs) have revolutionized natural language processing, but they are computationally expensive.
Applications that could benefit from automatic understanding of human-human conversations, such as call center or clinical conversations, often face challenges associated with private information in real-world data.
This paper explores methods for extracting information from radiology reports that generalize across exam modalities to reduce requirements for annotated data.
We introduce Fine-Grained RLHF, a framework that enables training and learning from reward functions that are fine-grained in two respects: (1) density, providing a reward after every segment (e.g., a sentence) is generated; and (2) incorporating multiple reward models associated with different feedback types (e.g., factual incorrectness, irrelevance, and information incompleteness).
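A minimal sketch of the fine-grained reward idea: each generated segment receives a reward that is a weighted combination of several feedback-type models. The reward-model names and weights below are illustrative stand-ins, not the paper's actual learned models.

```python
# Hypothetical sketch: per-sentence rewards from multiple feedback types.
def fine_grained_reward(sentences, reward_models, weights):
    """Return one reward per sentence: a weighted sum over feedback types."""
    rewards = []
    for sent in sentences:
        r = sum(w * rm(sent) for rm, w in zip(reward_models, weights))
        rewards.append(r)
    return rewards

# Toy stand-ins for learned reward models (factuality, relevance, completeness).
factuality = lambda s: 1.0 if "unsupported" not in s else -1.0
relevance = lambda s: 1.0 if len(s.split()) > 2 else 0.0
completeness = lambda s: 0.5

rs = fine_grained_reward(
    ["The sky is blue.", "unsupported claim here."],
    [factuality, relevance, completeness],
    [1.0, 0.5, 0.2],
)
```

In an actual RLHF setup these per-segment rewards would feed a policy-gradient update; here they simply illustrate density (one reward per sentence) and multiple feedback types (three models combined).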
We introduce TIFA (Text-to-Image Faithfulness evaluation with question Answering), an automatic evaluation metric that measures the faithfulness of a generated image to its text input via visual question answering (VQA).
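The core of a TIFA-style metric can be sketched in a few lines: faithfulness is the fraction of questions (derived from the text prompt) that a VQA model answers as expected on the generated image. The question/answer pairs and the `vqa_answers` lookup below are hypothetical placeholders for the question-generation and VQA components.

```python
# Sketch of a TIFA-style faithfulness score, assuming question/expected-answer
# pairs derived from the prompt and a VQA model's answers on the image.
def faithfulness_score(qa_pairs, vqa_answers):
    """Fraction of questions the VQA model answers as expected."""
    correct = sum(
        1 for (q, expected) in qa_pairs if vqa_answers.get(q) == expected
    )
    return correct / len(qa_pairs)

qa = [("Is there a dog?", "yes"), ("What color is the dog?", "brown")]
answers = {"Is there a dog?": "yes", "What color is the dog?": "black"}
score = faithfulness_score(qa, answers)  # 1 of 2 questions match
```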
Our analysis suggests that INSTRUCTOR is robust to changes in instructions, and that instruction finetuning mitigates the challenge of training a single model on diverse datasets.
We propose Binder, a training-free neural-symbolic framework that maps the task input to a program, which (1) allows binding a unified API of language model (LM) functionalities to a programming language (e.g., SQL, Python) to extend its grammar coverage and thus tackle more diverse questions, (2) adopts an LM as both the program parser and the underlying model called by the API during execution, and (3) requires only a few in-context exemplar annotations.
To reduce reliance on domain-specific features, we propose a domain generalization method that dynamically masks frequent symptom words in the source domain.
Departing from recent in-context learning methods, we formulate an annotation-efficient, two-step framework: selective annotation that chooses a pool of examples to annotate from unlabeled data in advance, followed by prompt retrieval that retrieves task examples from the annotated pool at test time.
In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable.
In this paper, we explore automatic prediction of dialect density of the African American English (AAE) dialect, where dialect density is defined as the percentage of words in an utterance that contain characteristics of the non-standard dialect.
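The dialect density definition above is direct to compute given a per-word dialect-feature detector; the lexicon-lookup detector below is a deliberately simplistic placeholder for illustration only.

```python
# Illustrative computation of dialect density: the percentage of words in
# an utterance flagged as carrying dialect features. The feature detector
# here is a placeholder lexicon lookup, not a real AAE feature tagger.
def dialect_density(words, dialect_words):
    flagged = sum(1 for w in words if w.lower() in dialect_words)
    return 100.0 * flagged / len(words) if words else 0.0

utterance = "he be working all day".split()
density = dialect_density(utterance, {"be"})  # habitual "be" flagged
```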
In this work, we propose an in-context learning (ICL) framework for zero-shot and few-shot dialogue state tracking (DST), where a large pre-trained language model (LM) takes a test instance and a few exemplars as input, and directly decodes the dialogue state without any parameter updates.
Compared to standard retrieval tasks, passage retrieval for conversational question answering (CQA) poses new challenges in understanding the current user question, as each question needs to be interpreted within the dialogue context.
Task-oriented conversational systems often use dialogue state tracking to represent the user's intentions, which involves filling in values of pre-defined slots.
Identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation.
This work explores constituency parsing on automatically recognized transcripts of conversational speech.
Tables in Web documents are pervasive and can be directly used to answer many of the queries searched on the Web, motivating their integration in question answering.
In a secondary use application, we explored the prediction of COVID-19 test results using structured patient data (e.g., vital signs and laboratory results) and automatically extracted symptom information.
The differences in written text and conversational speech are substantial; previous parsers trained on treebanked text have given very poor results on spontaneous speech.
Knowledge graphs capture entities and relations from long documents and can facilitate reasoning in many downstream applications.
Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses.
The Social History Annotation Corpus (SHAC) includes 4,480 social history sections with detailed annotation for 12 social determinants of health (SDOH) characterizing the status, extent, and temporal information of 18K distinct events.
Automated essay scoring systems typically rely on hand-crafted features to predict essay quality, but such systems are limited by the cost of feature engineering.
This paper explores contexts associated with errors in transcription of spontaneous speech, shedding light on human perception of disfluencies and other conversational speech phenomena.
We introduce a general framework for several information extraction tasks that share span representations using dynamically constructed span graphs.
In this paper we introduce a novel pattern match neural network architecture that uses neighbor similarity scores as features, eliminating the need for feature engineering in a disfluency detection task.
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles.
This paper describes our submission for the SemEval 2018 Task 7 shared task on semantic relation extraction and classification in scientific papers.
In this paper, we propose a domain adversarial training (DAT) algorithm to alleviate the accented speech recognition problem.
This paper explores the use of adversarial examples in training speech recognition systems to increase robustness of deep neural network acoustic models.
Evaluation of text difficulty is important both for downstream tasks like text simplification, and for supporting educators in classrooms.
We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize.
This paper addresses the problem of community membership detection using only text features in a scenario where a small number of positive labeled examples defines the community.
This paper addresses the problem of predicting duration of unplanned power outages, using historical outage records to train a series of neural network predictors.
We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums.
Knowledge of the association between assessment questions and the skills required to solve them is necessary for analysis of student learning.
This paper addresses the problem of extracting keyphrases from scientific articles and categorizing them as corresponding to a task, process, or material.
In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses.
This paper addresses the problem of predicting popularity of comments in an online discussion forum using reinforcement learning, particularly addressing two challenges that arise from having natural language state and action spaces.
This paper presents a novel approach for modeling threaded discussions on social media using a graph-structured bidirectional LSTM which represents both hierarchical and temporal conversation structure.
This work investigates style and topic aspects of language in online communities, examining both their utility for identifying the community and their correlation with community reception of content.
Many social media platforms offer a mechanism for readers to react to comments, both positively and negatively, which in aggregate can be thought of as community endorsement.
Social media messages' brevity and unconventional spelling pose a challenge to language identification.
We introduce an online popularity prediction and tracking task as a benchmark task for reinforcement learning with a combinatorial, natural language action space.
The goal of this paper is to use multi-task learning to efficiently scale slot filling models for natural language understanding to handle multiple target tasks or domains.
This paper introduces a novel architecture for reinforcement learning with deep neural networks designed to handle state and action spaces characterized by natural language, as found in text-based games.
This paper addresses the question of how language use affects community reaction to comments in online discussion forums, and the relative importance of the message vs. the messenger.