Search Results for author: Roy Schwartz

Found 61 papers, 28 papers with code

Expected Validation Performance and Estimation of a Random Variable’s Maximum

no code implementations Findings (EMNLP) 2021 Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith

We find that the two biased estimators lead to the fewest incorrect conclusions, which hints at the importance of minimizing variance and MSE.

Beyond Performance: Quantifying and Mitigating Label Bias in LLMs

1 code implementation 4 May 2024 Yuval Reif, Roy Schwartz

Our results emphasize that label bias in the predictions of LLMs remains a barrier to their reliability.

The Larger the Better? Improved LLM Code-Generation via Budget Reallocation

no code implementations 31 Mar 2024 Michael Hassid, Tal Remez, Jonas Gehring, Roy Schwartz, Yossi Adi

On the other hand, in scenarios where unit-tests are unavailable, a ranking-based selection of candidates from the smaller model falls short of the performance of a single output from larger ones.

Code Generation

Transformers are Multi-State RNNs

1 code implementation 11 Jan 2024 Matanel Oren, Michael Hassid, Nir Yarden, Yossi Adi, Roy Schwartz

Our results shed light on the connection between transformers and RNNs, and help mitigate one of LLMs' most painful computational bottlenecks - the size of their key-value cache.

Decoder

Read, Look or Listen? What's Needed for Solving a Multimodal Dataset

no code implementations 6 Jul 2023 Netta Madvil, Yonatan Bitton, Roy Schwartz

We propose a two-step method to analyze multimodal datasets, which leverages a small seed of human annotation to map each multimodal instance to the modalities required to process it.

Question Answering Speaker Identification +1

Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research

no code implementations 29 Jun 2023 Ji-Ung Lee, Haritz Puerto, Betty van Aken, Yuki Arase, Jessica Zosa Forde, Leon Derczynski, Andreas Rücklé, Iryna Gurevych, Roy Schwartz, Emma Strubell, Jesse Dodge

Many recent improvements in NLP stem from the development and use of large pre-trained language models (PLMs) with billions of parameters.

Morphosyntactic probing of multilingual BERT models

1 code implementation 9 Jun 2023 Judit Acs, Endre Hamerlik, Roy Schwartz, Noah A. Smith, Andras Kornai

We introduce an extensive dataset for multilingual probing of morphological information in language models (247 tasks across 42 languages from 10 families), each consisting of a sentence with a target word and a morphological tag as the desired label, derived from the Universal Dependencies treebanks.

Sentence TAG

Fighting Bias with Bias: Promoting Model Robustness by Amplifying Dataset Biases

1 code implementation 30 May 2023 Yuval Reif, Roy Schwartz

We suggest that in order to drive the development of models robust to subtle biases, dataset biases should be amplified in the training set.

VASR: Visual Analogies of Situation Recognition

1 code implementation 8 Dec 2022 Yonatan Bitton, Ron Yosef, Eli Strugo, Dafna Shahaf, Roy Schwartz, Gabriel Stanovsky

We leverage situation recognition annotations and the CLIP model to generate a large set of 500k candidate analogies.

Common Sense Reasoning Visual Analogies +1

How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers

1 code implementation 7 Nov 2022 Michael Hassid, Hao Peng, Daniel Rotem, Jungo Kasai, Ivan Montero, Noah A. Smith, Roy Schwartz

Our results motivate research on simpler alternatives to input-dependent attention, as well as on methods for better utilization of this mechanism in the Transformer architecture.

WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models

1 code implementation 25 Jul 2022 Yonatan Bitton, Nitzan Bitton Guetta, Ron Yosef, Yuval Elovici, Mohit Bansal, Gabriel Stanovsky, Roy Schwartz

While vision-and-language models perform well on tasks such as visual question answering, they struggle when it comes to basic human commonsense reasoning skills.

Common Sense Reasoning General Knowledge +4

Fewer Errors, but More Stereotypes? The Effect of Model Size on Gender Bias

1 code implementation NAACL (GeBNLP) 2022 Yarden Tal, Inbal Magar, Roy Schwartz

We find that while larger models outperform smaller ones, the probability that their mistakes are caused by gender bias is higher.

Language Modelling Memorization

Measuring the Carbon Intensity of AI in Cloud Instances

no code implementations 10 Jun 2022 Jesse Dodge, Taylor Prewitt, Remi Tachet des Combes, Erika Odmark, Roy Schwartz, Emma Strubell, Alexandra Sasha Luccioni, Noah A. Smith, Nicole DeCario, Will Buchanan

By providing unprecedented access to computational resources, cloud computing has enabled rapid growth in technologies such as machine learning, the computational demands of which incur a high energy cost and a commensurate carbon footprint.

Cloud Computing Language Modelling

On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations

no code implementations Findings (NAACL) 2022 Roy Schwartz, Gabriel Stanovsky

Recent work has shown that deep learning models in NLP are highly sensitive to low-level correlations between simple features and specific output labels, leading to overfitting and lack of generalization.

Common Sense Reasoning World Knowledge

TangoBERT: Reducing Inference Cost by using Cascaded Architecture

no code implementations 13 Apr 2022 Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Roy Schwartz

In order to reduce this computational load in inference time, we present TangoBERT, a cascaded model architecture in which instances are first processed by an efficient but less accurate first tier model, and only part of those instances are additionally processed by a less efficient but more accurate second tier model.

Reading Comprehension SST-2 +2

A deep learning framework for the detection and quantification of drusen and reticular pseudodrusen on optical coherence tomography

no code implementations 5 Apr 2022 Roy Schwartz, Hagar Khalid, Sandra Liakopoulos, Yanling Ouyang, Coen de Vente, Cristina González-Gonzalo, Aaron Y. Lee, Robyn Guymer, Emily Y. Chew, Catherine Egan, Zhichao Wu, Himeesh Kumar, Joseph Farrington, Clara I. Sánchez, Adnan Tufail

Methods - A DL framework was developed consisting of a classification model and an out-of-distribution (OOD) detection model for the identification of ungradable scans; a classification model to identify scans with drusen or RPD; and an image segmentation model to independently segment lesions as RPD or drusen.

Classification Image Segmentation +4

Data Contamination: From Memorization to Exploitation

1 code implementation ACL 2022 Inbal Magar, Roy Schwartz

Experiments with two models and three downstream tasks show that exploitation exists in some cases, while in others the models memorize the contaminated data but do not exploit it.

Memorization

Expected Validation Performance and Estimation of a Random Variable's Maximum

no code implementations 1 Oct 2021 Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith

We find that the two biased estimators lead to the fewest incorrect conclusions, which hints at the importance of minimizing variance and MSE.

Data Efficient Masked Language Modeling for Vision and Language

1 code implementation Findings (EMNLP) 2021 Yonatan Bitton, Gabriel Stanovsky, Michael Elhadad, Roy Schwartz

We investigate a range of alternative masking strategies specific to the cross-modal setting that address these shortcomings, aiming for better fusion of text and image in the learned representation.

Language Modelling Masked Language Modeling +1

Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?

no code implementations 22 Apr 2021 William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith

We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.

Automatic Generation of Contrast Sets from Scene Graphs: Probing the Compositional Consistency of GQA

2 code implementations NAACL 2021 Yonatan Bitton, Gabriel Stanovsky, Roy Schwartz, Michael Elhadad

Recent works have shown that supervised models often exploit data artifacts to achieve good test scores while their performance severely degrades on samples outside their training distribution.

Question Answering Relational Reasoning +1

Random Feature Attention

no code implementations ICLR 2021 Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, Lingpeng Kong

RFA can be used as a drop-in replacement for conventional softmax attention and offers a straightforward way of learning with recency bias through an optional gating mechanism.

Language Modelling Machine Translation +3

A Refined Analysis of Submodular Greedy

no code implementations 25 Feb 2021 Ariel Kulik, Roy Schwartz, Hadas Shachnai

Many algorithms for maximizing a monotone submodular function subject to a knapsack constraint rely on the natural greedy heuristic.

Data Structures and Algorithms

Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent

1 code implementation EMNLP 2021 William Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah Smith

To better understand this bias, we study the tendency for transformer parameters to grow in magnitude ($\ell_2$ norm) during training, and its implications for the emergent representations within self attention layers.

Inductive Bias

Extracting a Knowledge Base of Mechanisms from COVID-19 Papers

3 code implementations NAACL 2021 Tom Hope, Aida Amini, David Wadden, Madeleine van Zuylen, Sravanthi Parasa, Eric Horvitz, Daniel Weld, Roy Schwartz, Hannaneh Hajishirzi

The COVID-19 pandemic has spawned a diverse body of scientific literature that is challenging to navigate, stimulating interest in automated tools to help find useful knowledge.

Navigate

A Mixture of h - 1 Heads is Better than h Heads

no code implementations ACL 2020 Hao Peng, Roy Schwartz, Dianqi Li, Noah A. Smith

Multi-head attentive neural architectures have achieved state-of-the-art results on a variety of natural language processing tasks.

Language Modelling Machine Translation +1

A Mixture of $h-1$ Heads is Better than $h$ Heads

no code implementations 13 May 2020 Hao Peng, Roy Schwartz, Dianqi Li, Noah A. Smith

Multi-head attentive neural architectures have achieved state-of-the-art results on a variety of natural language processing tasks.

Language Modelling Machine Translation +1

A Formal Hierarchy of RNN Architectures

no code implementations ACL 2020 William Merrill, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav

While formally extending these findings to unsaturated RNNs is left to future work, we hypothesize that the practical learnable capacity of unsaturated RNNs obeys a similar hierarchy.

The Right Tool for the Job: Matching Model and Instance Complexities

1 code implementation ACL 2020 Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith

Our method presents a favorable speed/accuracy tradeoff in almost all cases, producing models which are up to five times faster than the state of the art, while preserving their accuracy.

Natural Language Inference text-classification +1

Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping

4 code implementations 15 Feb 2020 Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah Smith

We publicly release all of our experimental data, including training and validation scores for 2,100 trials, to encourage further analysis of training dynamics during fine-tuning.

Fair Correlation Clustering

no code implementations 10 Feb 2020 Saba Ahmadi, Sainyam Galhotra, Barna Saha, Roy Schwartz

We consider two variations of fairness constraint for the problem of correlation clustering where each node has a color, and the goal is to form clusters that do not over-represent vertices of any color.

Clustering Fairness

Knowledge Enhanced Contextual Word Representations

1 code implementation IJCNLP 2019 Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith

Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities.

Entity Linking Entity Typing +3

Show Your Work: Improved Reporting of Experimental Results

4 code implementations IJCNLP 2019 Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith

Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results.

Green AI

2 code implementations 22 Jul 2019 Roy Schwartz, Jesse Dodge, Noah A. Smith, Oren Etzioni

Moreover, the financial cost of the computations can make it difficult for academics, students, and researchers, in particular those from emerging economies, to engage in deep learning research.

Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets

no code implementations NAACL 2019 Nelson F. Liu, Roy Schwartz, Noah A. Smith

Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks.

Rational Recurrences

1 code implementation EMNLP 2018 Hao Peng, Roy Schwartz, Sam Thomson, Noah A. Smith

We characterize this connection formally, defining rational recurrences to be recurrent hidden state update functions that can be written as the Forward calculation of a finite set of WFSAs.

Language Modelling text-classification +1

SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

1 code implementation EMNLP 2018 Rowan Zellers, Yonatan Bisk, Roy Schwartz, Yejin Choi

Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine").

Common Sense Reasoning Multiple-choice +2

Bridging CNNs, RNNs, and Weighted Finite-State Machines

no code implementations ACL 2018 Roy Schwartz, Sam Thomson, Noah A. Smith

Recurrent and convolutional neural networks comprise two distinct families of models that have proven to be useful for encoding natural language utterances.

General Classification Representation Learning +3

LSTMs Exploit Linguistic Attributes of Data

no code implementations WS 2018 Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data.

Memorization Open-Ended Question Answering

SoPa: Bridging CNNs, RNNs, and Weighted Finite-State Machines

2 code implementations 15 May 2018 Roy Schwartz, Sam Thomson, Noah A. Smith

Recurrent and convolutional neural networks comprise two distinct families of models that have proven to be useful for encoding natural language utterances.

Explainable artificial intelligence General Classification +3

A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications

1 code implementation NAACL 2018 Dongyeop Kang, Waleed Ammar, Bhavana Dalvi, Madeleine van Zuylen, Sebastian Kohlmeier, Eduard Hovy, Roy Schwartz

In the first task, we show that simple models can predict whether a paper is accepted with up to 21% error reduction compared to the majority baseline.

Annotation Artifacts in Natural Language Inference Data

no code implementations NAACL 2018 Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith

Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to.

Natural Language Inference Negation +2

Story Cloze Task: UW NLP System

no code implementations WS 2017 Roy Schwartz, Maarten Sap, Ioannis Konstas, Leila Zilles, Yejin Choi, Noah A. Smith

This paper describes University of Washington NLP's submission for the Linking Models of Lexical, Sentential and Discourse-level Semantics (LSDSem 2017) shared task, the Story Cloze Task.

Language Modelling

Automatic Selection of Context Configurations for Improved Class-Specific Word Representations

no code implementations CONLL 2017 Ivan Vulić, Roy Schwartz, Ari Rappoport, Roi Reichart, Anna Korhonen

With our selected context configurations, we train on only 14% (A), 26.2% (V), and 33.6% (N) of all dependency-based contexts, resulting in a reduced training time.

Word Similarity
