Search Results for author: Lily Zhang

Found 4 papers, 2 papers with code

Don't blame Dataset Shift! Shortcut Learning due to Gradients and Cross Entropy

no code implementations • 24 Aug 2023 • Aahlad Puli, Lily Zhang, Yoav Wald, Rajesh Ranganath

However, even when the stable feature determines the label in the training distribution and the shortcut does not provide any additional information, like in perception tasks, default-ERM still exhibits shortcut learning.

Inductive Bias
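
A toy sketch of the phenomenon described in that snippet (illustrative only, not code from the paper): a linear classifier is trained with default ERM, i.e. plain gradient descent on cross-entropy, on synthetic data where a stable feature determines the label exactly and a larger-scale shortcut feature merely correlates with it. The feature scales, the 90% agreement rate, and the step count are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5000
    y = rng.integers(0, 2, n)                    # binary label
    stable = (2 * y - 1) * 1.0                   # stable feature: determines y exactly
    agree = rng.random(n) < 0.9                  # shortcut agrees with y 90% of the time
    shortcut = np.where(agree, 2 * y - 1, -(2 * y - 1)) * 3.0   # larger-scale shortcut
    X = np.stack([stable, shortcut], axis=1)

    w = np.zeros(2)
    lr = 0.1
    for _ in range(100):                         # default ERM: gradient descent on cross-entropy
        p = 1 / (1 + np.exp(-X @ w))
        grad = X.T @ (p - y) / n
        w -= lr * grad

    print("weights [stable, shortcut]:", w)
    # In this toy run the larger-scale shortcut receives most of the weight,
    # even though the stable feature alone classifies the training data perfectly.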

When More is Less: Incorporating Additional Datasets Can Hurt Performance By Introducing Spurious Correlations

1 code implementation • 8 Aug 2023 • Rhys Compton, Lily Zhang, Aahlad Puli, Rajesh Ranganath

In machine learning, incorporating more data is often seen as a reliable strategy for improving model performance; this work challenges that notion by demonstrating that the addition of external datasets in many cases can hurt the resulting model's performance.
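
A toy simulation of the effect described in that snippet (illustrative assumptions throughout, not the paper's experimental setup): pooling an external dataset with skewed labels couples a dataset-identity feature to the label, and the pooled model degrades when that spurious correlation breaks at test time.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make(n, pos_rate, marker):
        y = (rng.random(n) < pos_rate).astype(int)
        signal = (2 * y - 1) + rng.normal(0, 2.0, n)   # weak genuine feature
        source = np.full(n, marker, dtype=float)       # which dataset the example came from
        return np.stack([signal, source], axis=1), y

    X_a, y_a = make(2000, 0.5, 0.0)        # original dataset: balanced labels
    X_b, y_b = make(2000, 0.9, 1.0)        # external dataset: mostly positive
    X_pool = np.vstack([X_a, X_b]); y_pool = np.concatenate([y_a, y_b])
    X_test, y_test = make(2000, 0.5, 1.0)  # test set: external source, balanced labels

    clf_a = LogisticRegression(max_iter=1000).fit(X_a, y_a)
    clf_pool = LogisticRegression(max_iter=1000).fit(X_pool, y_pool)
    print("trained on original only:", clf_a.score(X_test, y_test))
    print("trained on pooled data  :", clf_pool.score(X_test, y_test))
    # The pooled model leans on the source marker as a proxy for "positive" and
    # loses accuracy once the source-label correlation no longer holds.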

Doc2Dict: Information Extraction as Text Generation

1 code implementation • 16 May 2021 • Benjamin Townsend, Eamon Ito-Fisher, Lily Zhang, Madison May

Typically, information extraction (IE) requires a pipeline approach: first, a sequence labeling model is trained on manually annotated documents to extract relevant spans; then, when a new document arrives, a model predicts spans which are then post-processed and standardized to convert the information into a database entry.

Language Modelling • Text Generation
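
A minimal sketch of the generation-style alternative that the snippet contrasts with the span-labeling pipeline: a generic seq2seq model is prompted to emit the extracted fields directly, and the generated text is parsed into a dict. The checkpoint, prompt wording, and 'key: value' output format are assumptions for illustration, not the Doc2Dict training recipe.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    checkpoint = "google/flan-t5-small"          # any instruction-tuned seq2seq model
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    document = "Invoice #4521 issued on 2021-05-16 by Acme Corp for a total of $1,250.00."
    prompt = ("Extract invoice_number, date, vendor, and total from the document "
              "as 'key: value' lines.\n" + document)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    # Post-process the generated text into a database-ready record.
    record = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            record[key.strip()] = value.strip()
    print(record)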
