no code implementations • 26 Oct 2023 • Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora
The paper develops a methodology for (a) designing and administering such an evaluation, and (b) automatic grading (plus spot-checking by humans) of the results using GPT-4 as well as the open LLaMA-2 70B model.
1 code implementation • 5 Nov 2022 • Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora
Saliency methods compute heat maps that highlight portions of an input that were most important for the label assigned to it by a deep net.
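As a rough illustration of the kind of heat map saliency methods produce (this is a generic gradient-based sketch for a toy linear classifier, not the method analyzed in the paper; `saliency_map`, `W`, and `x` are hypothetical names), one can take the gradient of the predicted label's score with respect to the input:

```python
import numpy as np

def saliency_map(weights, x, label):
    """Gradient saliency for a linear classifier (toy sketch).

    For score_c(x) = w_c . x, the gradient of the label's score with
    respect to the input is simply w_label, so the heat map is
    |w_label| arranged in the input's shape.
    """
    grad = weights[label]            # d score_label / d x
    return np.abs(grad).reshape(x.shape)

# toy 2x2 "image" and a 3-class linear model
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
x = rng.normal(size=(2, 2))
heat = saliency_map(W, x.ravel(), label=1).reshape(2, 2)
```

For a deep net the gradient is computed by backpropagation rather than read off a weight row, but the output has the same form: one importance score per input pixel.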
no code implementations • 3 Oct 2022 • Nikunj Saunshi, Arushi Gupta, Mark Braverman, Sanjeev Arora
Influence functions estimate the effect of individual training points on a model's predictions on test data; they were adapted to deep learning by Koh and Liang [2017].
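To make the estimator concrete, here is a minimal sketch of the Koh–Liang-style influence score for ordinary least squares, where the Hessian is available in closed form (the function name and the toy data are my own; for deep nets the Hessian-inverse-vector product must be approximated instead):

```python
import numpy as np

def influence(X, y, x_test, y_test):
    """Influence-function estimate of each training point's effect on the
    test loss, for a linear model with squared loss:
        influence_i ~ -grad_test(theta)^T  H^{-1}  grad_i(theta).
    """
    n, d = X.shape
    theta = np.linalg.lstsq(X, y, rcond=None)[0]
    H = X.T @ X / n                         # Hessian of the mean squared loss (up to a constant)
    grads = (X @ theta - y)[:, None] * X    # per-example gradients at theta
    g_test = (x_test @ theta - y_test) * x_test
    return -grads @ np.linalg.solve(H, g_test)

# toy regression problem
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(10, 3))
y = X @ w_true + 0.1 * rng.normal(size=10)
x_test = rng.normal(size=3)
infl = influence(X, y, x_test, x_test @ w_true)
```

A large positive score marks a training point whose removal would be predicted to lower the test loss.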
no code implementations • ICLR 2022 • Yi Zhang, Arushi Gupta, Nikunj Saunshi, Sanjeev Arora
Research on generalization bounds for deep networks seeks to give ways to predict test error using just the training dataset and the network parameters.
no code implementations • 29 Sep 2021 • Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora
Saliency methods seek to provide human-interpretable explanations for the output of a machine learning model on a given input.
no code implementations • 29 Sep 2021 • Arushi Gupta
In this work, we show that training with SGD on ReLU neural networks gives rise to a natural set of functions for each image that are not perfectly correlated until later in training.
1 code implementation • 29 Jun 2021 • Nikunj Saunshi, Arushi Gupta, Wei Hu
An effective approach in meta-learning is to utilize multiple "train tasks" to learn a good initialization for model parameters that can help solve unseen "test tasks" with very few samples by fine-tuning from this initialization.
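The "learn an initialization, then fine-tune" loop can be sketched with a Reptile-style update on linear regression tasks (this is a generic simplification for illustration, not the paper's analysis; `sgd_steps`, `reptile`, and all hyperparameters are assumptions):

```python
import numpy as np

def sgd_steps(w, X, y, lr=0.1, steps=20):
    """Fine-tune: a few gradient steps on squared loss for a linear model."""
    for _ in range(steps):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

def reptile(tasks, d, meta_lr=0.5, epochs=50):
    """Reptile-style meta-learning: nudge the shared initialization
    toward each task's fine-tuned weights."""
    w0 = np.zeros(d)
    for _ in range(epochs):
        for X, y in tasks:
            w = sgd_steps(w0.copy(), X, y)
            w0 = w0 + meta_lr * (w - w0)
    return w0

# train tasks that share one underlying solution w_star
rng = np.random.default_rng(1)
d = 5
w_star = rng.normal(size=d)
tasks = []
for _ in range(4):
    X = rng.normal(size=(20, d))
    tasks.append((X, X @ w_star))
w0 = reptile(tasks, d)
```

Because the tasks here share a common solution, the learned initialization lands near it, so a new task needs only a few fine-tuning steps.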
no code implementations • 1 Jan 2021 • Arushi Gupta
We believe that higher layers may interpret weight changes made by lower layers as changes to the data, producing a form of implicit data augmentation.
no code implementations • 26 May 2020 • Arushi Gupta
We find that this noise penalizes models that are sensitive to perturbations in the weights.
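The penalty can be pictured with a Monte-Carlo check of how much Gaussian weight noise inflates the loss of a simple model (an illustrative sketch under my own assumptions, not the paper's derivation; `weight_noise_penalty` and `sigma` are hypothetical names):

```python
import numpy as np

def weight_noise_penalty(w, X, y, sigma=0.1, samples=100, seed=0):
    """Estimate the increase in squared loss caused by Gaussian noise on
    the weights of a linear model; models whose loss is sensitive to
    weight perturbations pay a larger penalty."""
    rng = np.random.default_rng(seed)
    base = np.mean((X @ w - y) ** 2)
    noisy = np.mean([
        np.mean((X @ (w + sigma * rng.normal(size=w.shape)) - y) ** 2)
        for _ in range(samples)
    ])
    return noisy - base

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
w = rng.normal(size=3)
penalty = weight_noise_penalty(w, X, X @ w)
```

For this linear model the penalty works out to roughly `sigma**2` times the average squared input norm, so it is always positive and grows with the model's sensitivity to its weights.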
no code implementations • 25 Sep 2019 • Arushi Gupta, Sanjeev Arora
This involves computing saliency maps for all possible labels in the classification task, and using a simple competition among them to identify and remove less relevant pixels from the map.
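The competition step can be sketched as a per-pixel argmax across the per-label maps (a simplified illustration of the idea, not the paper's exact procedure; `competitive_saliency` is a hypothetical name):

```python
import numpy as np

def competitive_saliency(maps, label):
    """Keep a pixel in the chosen label's saliency map only where that
    label's saliency beats every other label's; zero it out elsewhere.

    maps: array of shape (num_labels, num_pixels).
    """
    winner = np.argmax(maps, axis=0)
    return np.where(winner == label, maps[label], 0.0)

# two labels, two pixels: label 0 wins only the second pixel
maps = np.array([[1.0, 3.0],
                 [2.0, 1.0]])
result = competitive_saliency(maps, label=0)
```

Pixels that are salient for many labels lose the competition everywhere, which is what removes the less relevant ones from the final map.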
no code implementations • 27 May 2019 • Arushi Gupta, Sanjeev Arora
There is great interest in "saliency methods" (also called "attribution methods"), which give "explanations" for a deep net's decision, by assigning a "score" to each feature/pixel in the input.
no code implementations • 4 Feb 2018 • Arushi Gupta, José Manuel Zorrilla Matilla, Daniel Hsu, Zoltán Haiman
Weak lensing maps contain information beyond two-point statistics on small scales.
no code implementations • 2 Jun 2017 • Arushi Gupta, Daniel Hsu
The underlying parameters of the model were previously shown to be identifiable from the choice probabilities for the all-products assortment, together with the choice probabilities for every assortment that omits a single product.