Search Results for author: Naomi Saphra

Found 23 papers, 8 papers with code

Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs

no code implementations 13 Sep 2023 Angelica Chen, Ravid Shwartz-Ziv, Kyunghyun Cho, Matthew L. Leavitt, Naomi Saphra

We further find that syntactic attention structure (SAS) competes with other beneficial traits and capabilities during training, and that briefly suppressing SAS can improve model quality.

Latent State Models of Training Dynamics

no code implementations 18 Aug 2023 Michael Y. Hu, Angelica Chen, Naomi Saphra, Kyunghyun Cho

We use the HMM representation to study phase transitions and identify latent "detour" states that slow down convergence.

Image Classification Language Modelling +1

Dynamic Masking Rate Schedules for MLM Pretraining

no code implementations 24 May 2023 Zachary Ankner, Naomi Saphra, Davis Blalock, Jonathan Frankle, Matthew L. Leavitt

Most works on transformers trained with the Masked Language Modeling (MLM) objective use the original BERT model's fixed masking rate of 15%.

Language Modelling Masked Language Modeling +1

One Venue, Two Conferences: The Separation of Chinese and American Citation Networks

no code implementations 17 Nov 2022 Bingchen Zhao, Yuling Gu, Jessica Zosa Forde, Naomi Saphra

At NeurIPS, American and Chinese institutions cite papers from each other's regions substantially less than they cite endogamously.

Linear Connectivity Reveals Generalization Strategies

1 code implementation 24 May 2022 Jeevesh Juneja, Rachit Bansal, Kyunghyun Cho, João Sedoc, Naomi Saphra

It is widely accepted in the mode connectivity literature that when two neural networks are trained similarly on the same data, they are connected by a path through parameter space over which test set accuracy is maintained.


A Non-Linear Structural Probe

no code implementations NAACL 2021 Jennifer C. White, Tiago Pimentel, Naomi Saphra, Ryan Cotterell

Probes are models devised to investigate the encoding of knowledge (e.g., syntactic structure) in contextual representations.

LSTMs Compose---and Learn---Bottom-Up

no code implementations Findings of the Association for Computational Linguistics 2020 Naomi Saphra, Adam Lopez

To explore the inductive biases that cause these compositional representations to arise during training, we conduct simple experiments on synthetic data.

LSTMs Compose (and Learn) Bottom-Up

no code implementations 6 Oct 2020 Naomi Saphra, Adam Lopez

To explore the inductive biases that cause these compositional representations to arise during training, we conduct simple experiments on synthetic data.

Pareto Probing: Trading Off Accuracy for Complexity

1 code implementation EMNLP 2020 Tiago Pimentel, Naomi Saphra, Adina Williams, Ryan Cotterell

In our contribution to this discussion, we argue for a probe metric that reflects the fundamental trade-off between probe complexity and performance: the Pareto hypervolume.

Dependency Parsing

Word Interdependence Exposes How LSTMs Compose Representations

no code implementations 27 Apr 2020 Naomi Saphra, Adam Lopez

Recent work in NLP shows that LSTM language models capture compositional structure in language data.

How to Evaluate Word Representations of Informal Domain?

1 code implementation 12 Nov 2019 Yekun Chai, Naomi Saphra, Adam Lopez

Diverse word representations have surged in most state-of-the-art natural language processing (NLP) applications.

Word Embeddings

Sparsity Emerges Naturally in Neural Language Models

no code implementations ICML Workshop Deep_Phenomen 2019 Naomi Saphra, Adam Lopez

Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks.

Do LSTMs Learn Compositionally?

no code implementations 28 May 2019 Naomi Saphra, Adam Lopez

LSTM-based language models exhibit compositionality in their representations, but how this behavior emerges over the course of training has not been explored.

Understanding Learning Dynamics Of Language Models with SVCCA

no code implementations NAACL 2019 Naomi Saphra, Adam Lopez

Research has shown that neural models implicitly encode linguistic features, but there has been no research showing how these encodings arise as the models are trained.

Language Modelling

DyNet: The Dynamic Neural Network Toolkit

4 code implementations 15 Jan 2017 Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.

graph construction

Evaluating Informal-Domain Word Representations With UrbanDictionary

1 code implementation WS 2016 Naomi Saphra, Adam Lopez

Existing corpora for intrinsic evaluation are not targeted towards tasks in informal domains such as Twitter or news comment forums.

Understanding Objects in Detail with Fine-Grained Attributes

no code implementations CVPR 2014 Andrea Vedaldi, Siddharth Mahendran, Stavros Tsogkas, Subhransu Maji, Ross Girshick, Juho Kannala, Esa Rahtu, Iasonas Kokkinos, Matthew B. Blaschko, David Weiss, Ben Taskar, Karen Simonyan, Naomi Saphra, Sammy Mohamed

We show that the collected data can be used to study the relation between part detection and attribute prediction by diagnosing the performance of classifiers that pool information from different parts of an object.

Object Detection
