1 code implementation • 6 Mar 2024 • Adithya Bhaskar, Dan Friedman, Danqi Chen
Instead of finding competing subnetworks, we find that all subnetworks -- whether they generalize or not -- share a set of attention heads, which we refer to as the heuristic core.
no code implementations • 6 Dec 2023 • Dan Friedman, Andrew Lampinen, Lucas Dixon, Danqi Chen, Asma Ghandeharioun
A common method to study deep learning systems is to use simplified model representations -- for example, using singular value decomposition to visualize the model's hidden states in a lower dimensional space.
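As a rough illustration of the kind of simplification this abstract mentions, the sketch below projects a matrix of hidden states onto its top singular directions. It is not taken from the paper: the function name `project_hidden_states`, the mean-centering step, and the synthetic random data are all assumptions made for the example.

```python
import numpy as np

def project_hidden_states(hidden, k=2):
    """Project (n_samples, d) hidden states onto their top-k singular directions.

    Hypothetical helper for illustration only; not the paper's code.
    """
    # Center the data so the projection captures variance around the mean.
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    # Thin SVD: centered = U @ diag(S) @ Vt
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    # Coordinates in the k-dimensional subspace (equal to U[:, :k] * S[:k]).
    coords = centered @ Vt[:k].T
    # Fraction of total variance retained by the k directions.
    explained = float((S[:k] ** 2).sum() / (S ** 2).sum())
    return coords, explained

# Synthetic stand-in for model hidden states (an assumption for this example).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(100, 16))
coords, explained = project_hidden_states(hidden, k=2)
print(coords.shape, round(explained, 3))
```

The retained-variance fraction gives a quick sense of how much of the hidden-state structure a low-dimensional view like this leaves out.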
no code implementations • 24 Sep 2023 • R. Thomas McCoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths
This approach - which we call the teleological approach - leads us to identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input.
1 code implementation • 22 May 2023 • Chenglei Si, Dan Friedman, Nitish Joshi, Shi Feng, Danqi Chen, He He
We investigate the inductive biases of ICL from the perspective of feature bias: which feature ICL is more likely to use given a set of underspecified demonstrations in which two features are equally predictive of the labels.
1 code implementation • 20 Oct 2022 • Dan Friedman, Alexander Wettig, Danqi Chen
Many NLP datasets have been found to contain shortcuts: simple decision rules that achieve surprisingly high accuracy.
1 code implementation • 5 Oct 2022 • Dan Friedman, Adji Bousso Dieng
Importantly, unlike many existing metrics in ML, the Vendi Score does not require a reference dataset or distribution over samples or labels; it is therefore general and applicable to any generative model, decoding algorithm, and dataset from any domain where similarity can be defined.
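The excerpt above does not give the formula. As a minimal sketch, assuming the usual definition of the Vendi Score as the exponential of the Shannon entropy of the eigenvalues of K/n, where K is an n-by-n positive semidefinite similarity matrix with ones on the diagonal, it can be computed as follows. The function name `vendi_score` and the eigenvalue cutoff `1e-12` are choices made for this example, not the authors' reference implementation.

```python
import numpy as np

def vendi_score(K):
    """Vendi Score of a similarity matrix K (n x n, PSD, K[i, i] == 1).

    Illustrative sketch; assumes the exp-of-eigenvalue-entropy definition.
    """
    n = K.shape[0]
    # Eigenvalues of K / n are non-negative and sum to 1 when diag(K) == 1.
    eigvals = np.linalg.eigvalsh(K / n)
    # Drop numerically zero eigenvalues, using the convention 0 * log 0 = 0.
    eigvals = eigvals[eigvals > 1e-12]
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))

# Four mutually dissimilar items versus four identical items.
print(vendi_score(np.eye(4)))
print(vendi_score(np.ones((4, 4))))
```

Up to floating-point error, the first call gives about 4 (four distinct items) and the second about 1 (one effective item). Only a similarity matrix is needed, which matches the abstract's point that no reference dataset is required.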
1 code implementation • EMNLP 2021 • Dan Friedman, Ben Dodge, Danqi Chen
Many datasets have been created for training reading comprehension models, and a natural question is whether we can combine them to build models that (1) perform better on all of the training datasets and (2) generalize and transfer better to new datasets.
2 code implementations • NAACL 2021 • Zexuan Zhong, Dan Friedman, Danqi Chen
Petroni et al. (2019) demonstrated that it is possible to retrieve world facts from a pre-trained language model by expressing them as cloze-style prompts and to interpret the model's prediction accuracy as a lower bound on the amount of factual information it encodes.
1 code implementation • 4 Sep 2019 • Michihiro Yasunaga, Jungo Kasai, Rui Zhang, Alexander R. Fabbri, Irene Li, Dan Friedman, Dragomir R. Radev
Scientific article summarization is challenging: large, annotated corpora are not available, and the summary should ideally include the article's impact on the research community.
Ranked #1 on Scientific Document Summarization on CL-SciSumm
1 code implementation • NAACL 2019 • Jungo Kasai, Dan Friedman, Robert Frank, Dragomir Radev, Owen Rambow
We introduce a new syntax-aware model for dependency-based semantic role labeling that outperforms syntax-agnostic models for English and Spanish.
no code implementations • ICLR 2018 • Yoel Zeldes, Stavros Theodorakis, Efrat Solodnik, Aviv Rotman, Gil Chamiel, Dan Friedman
Building robust online content recommendation systems requires learning complex interactions between user preferences and content features.