Search Results for author: Valentina Pyatkin

Found 35 papers, 23 papers with code

QADiscourse - Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines

no code implementations EMNLP 2020 Valentina Pyatkin, Ayal Klein, Reut Tsarfaty, Ido Dagan

Discourse relations describe how two propositions relate to one another, and identifying them automatically is an integral part of natural language understanding.

Natural Language Understanding Sentence

Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training

no code implementations LREC 2022 Merel Scholman, Valentina Pyatkin, Frances Yung, Ido Dagan, Reut Tsarfaty, Vera Demberg

The current contribution studies the effect of worker selection and training on the agreement on implicit relation labels between workers and gold labels, for both the DC and the QA method.


Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

1 code implementation 24 Oct 2024 Lester James V. Miranda, Yizhong Wang, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi

We analyze features from the routing model to identify characteristics of instances that can benefit from human feedback, e.g., prompts with a moderate safety concern or moderate intent complexity.

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

no code implementations 22 Oct 2024 Jing-Jing Li, Valentina Pyatkin, Max Kleiman-Weiner, Liwei Jiang, Nouha Dziri, Anne G. E. Collins, Jana Schaich Borg, Maarten Sap, Yejin Choi, Sydney Levine

The ideal LLM content moderation system would be both structurally interpretable (so its decisions can be explained to users) and steerable (to reflect a community's values or align to safety standards).

Knowledge Distillation

Diverging Preferences: When do Annotators Disagree and do Models Know?

no code implementations 18 Oct 2024 Michael JQ Zhang, Zhilin Wang, Jena D. Hwang, Yi Dong, Olivier Delalleau, Yejin Choi, Eunsol Choi, Xiang Ren, Valentina Pyatkin

We find that the majority of disagreements are in opposition with standard reward modeling approaches, which are designed with the assumption that annotator disagreement is noise.

Self-Directed Synthetic Dialogues and Revisions Technical Report

no code implementations 25 Jul 2024 Nathan Lambert, Hailey Schoelkopf, Aaron Gokaslan, Luca Soldaini, Valentina Pyatkin, Louis Castricato

Synthetic data has become an important tool in the fine-tuning of language models to follow instructions and solve complex problems.

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

1 code implementation 7 Jun 2024 Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu, Faeze Brahman, Abhilasha Ravichander, Valentina Pyatkin, Nouha Dziri, Ronan Le Bras, Yejin Choi

For automated evaluation with WildBench, we have developed two metrics, WB-Reward and WB-Score, which are computable using advanced LLMs such as GPT-4-turbo.

Benchmarking Chatbot

Superlatives in Context: Modeling the Implicit Semantics of Superlatives

no code implementations 31 May 2024 Valentina Pyatkin, Bonnie Webber, Ido Dagan, Reut Tsarfaty

Semantically, superlatives perform a set comparison: something (or some things) has the min/max property out of a set.

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

1 code implementation 26 Feb 2024 Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, Dirk Hovy

Motivated by this discrepancy, we challenge the prevailing constrained evaluation paradigm for values and opinions in LLMs and explore more realistic unconstrained evaluations.


Promptly Predicting Structures: The Return of Inference

1 code implementation 12 Jan 2024 Maitrey Mehta, Valentina Pyatkin, Vivek Srikumar

Can the promise of the prompt-based paradigm be extended to such structured outputs?

Structured Prediction

Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

3 code implementations 17 Nov 2023 Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

Since the release of TÜLU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques.

"You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation

no code implementations 26 Oct 2023 Allyson Ettinger, Jena D. Hwang, Valentina Pyatkin, Chandra Bhagavatula, Yejin Choi

We compare models' analysis of this semantic structure across two settings: 1) direct production of AMR parses based on zero- and few-shot prompts, and 2) indirect partial reconstruction of AMR via metalinguistic natural language queries (e.g., "Identify the primary event of this sentence, and the predicate corresponding to that event.").

Abstract Meaning Representation Natural Language Queries +1

What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations

no code implementations 24 Oct 2023 Kavel Rao, Liwei Jiang, Valentina Pyatkin, Yuling Gu, Niket Tandon, Nouha Dziri, Faeze Brahman, Yejin Choi

From this model we distill a high-quality dataset, δ-Rules-of-Thumb, of 1.2M entries of contextualizations and rationales for 115K defeasible moral actions rated highly by human annotators 85.9% to 99.8% of the time.

Diversity Imitation Learning

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement

1 code implementation 12 Oct 2023 Linlu Qiu, Liwei Jiang, Ximing Lu, Melanie Sclar, Valentina Pyatkin, Chandra Bhagavatula, Bailin Wang, Yoon Kim, Yejin Choi, Nouha Dziri, Xiang Ren

The ability to derive underlying principles from a handful of observations and then generalize to novel situations -- known as inductive reasoning -- is central to human intelligence.

Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

1 code implementation 2 Sep 2023 Taylor Sorensen, Liwei Jiang, Jena Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, Yejin Choi

To improve AI systems to better reflect value pluralism, the first-order challenge is to explore the extent to which AI systems can model pluralistic human values, rights, and duties as well as their interaction.

Decision Making

Revisiting Sentence Union Generation as a Testbed for Text Consolidation

1 code implementation 24 May 2023 Eran Hirsch, Valentina Pyatkin, Ruben Wolhandler, Avi Caciularu, Asi Shefer, Ido Dagan

In this paper, we suggest revisiting the sentence union generation task as an effective well-defined testbed for assessing text consolidation capabilities, decoupling the consolidation challenge from subjective content selection.

Document Summarization Long Form Question Answering +2

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

1 code implementation 3 Apr 2023 Valentina Pyatkin, Frances Yung, Merel C. J. Scholman, Reut Tsarfaty, Ido Dagan, Vera Demberg

Disagreement in natural language annotation has mostly been studied from a perspective of biases introduced by the annotators and the annotation frameworks.

Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE

1 code implementation 28 Oct 2022 Yuling Gu, Yao Fu, Valentina Pyatkin, Ian Magnusson, Bhavana Dalvi Mishra, Peter Clark

We hypothesize that to perform this task well, the reader needs to mentally elaborate the scene being described to identify a sensible meaning of the language.

QASem Parsing: Text-to-text Modeling of QA-based Semantics

1 code implementation 23 May 2022 Ayal Klein, Eran Hirsch, Ron Eliav, Valentina Pyatkin, Avi Caciularu, Ido Dagan

Several recent works have suggested to represent semantic relations with questions and answers, decomposing textual information into separate interrogative natural language statements.

Data Augmentation

Asking It All: Generating Contextualized Questions for any Semantic Role

1 code implementation EMNLP 2021 Valentina Pyatkin, Paul Roit, Julian Michael, Reut Tsarfaty, Yoav Goldberg, Ido Dagan

We develop a two-stage model for this task, which first produces a context-independent question prototype for each role and then revises it to be contextually appropriate for the passage.

Question Generation

The Possible, the Plausible, and the Desirable: Event-Based Modality Detection for Language Processing

2 code implementations ACL 2021 Valentina Pyatkin, Shoval Sadde, Aynat Rubinstein, Paul Portner, Reut Tsarfaty

Modality is the linguistic ability to describe events with added information such as how desirable, plausible, or feasible they are.

QANom: Question-Answer driven SRL for Nominalizations

1 code implementation COLING 2020 Ayal Klein, Jonathan Mamou, Valentina Pyatkin, Daniela Stepanov, Hangfeng He, Dan Roth, Luke Zettlemoyer, Ido Dagan

We propose a new semantic scheme for capturing predicate-argument relations for nominalizations, termed QANom.

QADiscourse -- Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines

1 code implementation 6 Oct 2020 Valentina Pyatkin, Ayal Klein, Reut Tsarfaty, Ido Dagan

Discourse relations describe how two propositions relate to one another, and identifying them automatically is an integral part of natural language understanding.

Natural Language Understanding Sentence
