Search Results for author: Chiyuan Zhang

Found 39 papers, 18 papers with code

Localizing Paragraph Memorization in Language Models

1 code implementation 28 Mar 2024 Niklas Stoehr, Mitchell Gordon, Chiyuan Zhang, Owen Lewis

Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data?

Language Modelling Memorization

How Private are DP-SGD Implementations?

no code implementations 26 Mar 2024 Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

We demonstrate a substantial gap between the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism under different types of batch sampling: (i) Shuffling, and (ii) Poisson subsampling; the typical analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) follows by interpreting it as a post-processing of ABLQ.

Training Differentially Private Ad Prediction Models with Semi-Sensitive Features

no code implementations 26 Jan 2024 Lynn Chua, Qiliang Cui, Badih Ghazi, Charlie Harrison, Pritish Kamath, Walid Krichene, Ravi Kumar, Pasin Manurangsi, Krishna Giri Narra, Amer Sinha, Avinash Varadarajan, Chiyuan Zhang

Motivated by problems arising in digital advertising, we introduce the task of training differentially private (DP) machine learning models with semi-sensitive features.

Can Neural Network Memorization Be Localized?

1 code implementation 18 Jul 2023 Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks $\textit{memorize}$ "hard" examples in the final few layers of the model.


Ticketed Learning-Unlearning Schemes

no code implementations 27 Jun 2023 Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Ayush Sekhari, Chiyuan Zhang

Subsequently, given any subset of examples that wish to be unlearnt, the goal is to learn, without the knowledge of the original training dataset, a good predictor that is identical to the predictor that would have been produced when learning from scratch on the surviving examples.

On User-Level Private Convex Optimization

no code implementations 8 May 2023 Badih Ghazi, Pritish Kamath, Ravi Kumar, Raghu Meka, Pasin Manurangsi, Chiyuan Zhang

We introduce a new mechanism for stochastic convex optimization (SCO) with user-level differential privacy guarantees.

Regression with Label Differential Privacy

no code implementations 12 Dec 2022 Badih Ghazi, Pritish Kamath, Ravi Kumar, Ethan Leeman, Pasin Manurangsi, Avinash V Varadarajan, Chiyuan Zhang

We study the task of training regression models with the guarantee of label differential privacy (DP).


Private Ad Modeling with DP-SGD

no code implementations 21 Nov 2022 Carson Denison, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Krishna Giri Narra, Amer Sinha, Avinash V Varadarajan, Chiyuan Zhang

A well-known algorithm in privacy-preserving ML is differentially private stochastic gradient descent (DP-SGD).

Privacy Preserving

Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy

no code implementations 31 Oct 2022 Daphne Ippolito, Florian Tramèr, Milad Nasr, Chiyuan Zhang, Matthew Jagielski, Katherine Lee, Christopher A. Choquette-Choo, Nicholas Carlini

Studying data memorization in neural language models helps us understand the risks (e.g., to privacy or copyright) associated with models regurgitating training data and aids in the development of countermeasures.

Memorization Open-Ended Question Answering +1

Just Fine-tune Twice: Selective Differential Privacy for Large Language Models

1 code implementation 15 Apr 2022 Weiyan Shi, Ryan Shea, Si Chen, Chiyuan Zhang, Ruoxi Jia, Zhou Yu

Utilizing the fact that sensitive information in language data tends to be sparse, Shi et al. (2021) formalized a DP notion extension called Selective Differential Privacy (SDP) to protect only the sensitive tokens defined by a policy function.

Quantifying Memorization Across Neural Language Models

2 code implementations 15 Feb 2022 Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramèr, Chiyuan Zhang

Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized training data verbatim.

Fairness Memorization

Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation

no code implementations 15 Oct 2021 Yao Qin, Chiyuan Zhang, Ting Chen, Balaji Lakshminarayanan, Alex Beutel, Xuezhi Wang

We show that patch-based negative augmentation consistently improves robustness of ViTs across a wide set of ImageNet based robustness benchmarks.

Data Augmentation

Do Vision Transformers See Like Convolutional Neural Networks?

4 code implementations NeurIPS 2021 Maithra Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy

Finally, we study the effect of (pretraining) dataset scale on intermediate features and transfer learning, and conclude with a discussion on connections to new architectures such as the MLP-Mixer.

Classification Image Classification +1

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization

2 code implementations 27 Jul 2021 Chiyuan Zhang, Maithra Raghu, Jon Kleinberg, Samy Bengio

In PVR, this is done by having one part of the task input act as a pointer, giving instructions on a different input location, which forms the output.

Memorization Retrieval

Deduplicating Training Data Makes Language Models Better

1 code implementation ACL 2022 Katherine Lee, Daphne Ippolito, Andrew Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini

As a result, over 1% of the unprompted output of language models trained on these datasets is copied verbatim from the training data.

Language Modelling Sentence

Understanding Invariance via Feedforward Inversion of Discriminatively Trained Classifiers

no code implementations 15 Mar 2021 Piotr Teterwak, Chiyuan Zhang, Dilip Krishnan, Michael C. Mozer

We use our reconstruction model as a tool for exploring the nature of representations, including: the influence of model architecture and training objectives (specifically robust losses), the forms of invariance that networks achieve, representational differences between correctly and incorrectly classified images, and the effects of manipulating logits and images.

Deep Learning with Label Differential Privacy

no code implementations NeurIPS 2021 Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi, Chiyuan Zhang

The Randomized Response (RR) algorithm is a classical technique to improve robustness in survey aggregation, and has been widely adopted in applications with differential privacy guarantees.

Multi-class Classification

What is being transferred in transfer learning?

1 code implementation NeurIPS 2020 Behnam Neyshabur, Hanie Sedghi, Chiyuan Zhang

One desired capability for machines is the ability to transfer their knowledge of one domain to another where data is (usually) scarce.

Transfer Learning

What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

no code implementations NeurIPS 2020 Vitaly Feldman, Chiyuan Zhang

First, natural image and data distributions are (informally) known to be long-tailed, that is, they have a significant fraction of rare and atypical examples.


Characterizing Structural Regularities of Labeled Data in Overparameterized Models

1 code implementation 8 Feb 2020 Ziheng Jiang, Chiyuan Zhang, Kunal Talwar, Michael C. Mozer

We obtain empirical estimates of this score for individual instances in multiple data sets, and we show that the score identifies out-of-distribution and mislabeled examples at one end of the continuum and strongly regular examples at the other end.

Density Estimation Out-of-Distribution Detection +1

Transfusion: Understanding Transfer Learning for Medical Imaging

2 code implementations NeurIPS 2019 Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, Samy Bengio

Investigating the learned representations and features, we find that some of the differences from transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse.

Image Classification Transfer Learning

Identity Crisis: Memorization and Generalization under Extreme Overparameterization

no code implementations ICLR 2020 Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Yoram Singer

We study the interplay between memorization and generalization of overparameterized networks in the extreme case of a single training example and an identity-mapping task.


Are All Layers Created Equal?

2 code implementations ICML Workshop Deep_Phenomen 2019 Chiyuan Zhang, Samy Bengio, Yoram Singer

Morally, layers of large deep neural networks can be categorized as either "robust" or "critical".

Unrestricted Adversarial Examples

1 code implementation 22 Sep 2018 Tom B. Brown, Nicholas Carlini, Chiyuan Zhang, Catherine Olsson, Paul Christiano, Ian Goodfellow

We introduce a two-player contest for evaluating the safety and robustness of machine learning systems, with a large prize pool.

BIG-bench Machine Learning

A Study on Overfitting in Deep Reinforcement Learning

1 code implementation 18 Apr 2018 Chiyuan Zhang, Oriol Vinyals, Remi Munos, Samy Bengio

We conclude with a general discussion on overfitting in RL and a study of the generalization behaviors from the perspective of inductive bias.

Inductive Bias reinforcement-learning +1

Machine Theory of Mind

no code implementations ICML 2018 Neil C. Rabinowitz, Frank Perbet, H. Francis Song, Chiyuan Zhang, S. M. Ali Eslami, Matthew Botvinick

We design a Theory of Mind neural network -- a ToMnet -- which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone.


Theory of Deep Learning IIb: Optimization Properties of SGD

no code implementations 7 Jan 2018 Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio

In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent.

Understanding deep learning requires rethinking generalization

7 code implementations 10 Nov 2016 Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance.

Image Classification

Training Deep Nets with Sublinear Memory Cost

6 code implementations 21 Apr 2016 Tianqi Chen, Bing Xu, Chiyuan Zhang, Carlos Guestrin

In the extreme case, our analysis also shows that the memory consumption can be reduced to O(log n) with as little as O(n log n) extra cost for forward computation.

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

2 code implementations 3 Dec 2015 Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang

This paper describes both the API design and the system implementation of MXNet, and explains how embedding of both symbolic expression and tensor operation is handled in a unified fashion.

BIG-bench Machine Learning Clustering +2

Learning An Invariant Speech Representation

no code implementations 16 Jun 2014 Georgios Evangelopoulos, Stephen Voinea, Chiyuan Zhang, Lorenzo Rosasco, Tomaso Poggio

Recognition of speech, and in particular the ability to generalize and learn from small sets of labelled examples like humans do, depends on an appropriate representation of the acoustic input.

General Classification Sound Classification +1

A Deep Representation for Invariance And Music Classification

no code implementations 1 Apr 2014 Chiyuan Zhang, Georgios Evangelopoulos, Stephen Voinea, Lorenzo Rosasco, Tomaso Poggio

We present the main theoretical and computational aspects of a framework for unsupervised learning of invariant audio representations, empirically evaluated on music genre classification.

Classification General Classification +3

Multi-task Vector Field Learning

no code implementations NeurIPS 2012 Binbin Lin, Sen Yang, Chiyuan Zhang, Jieping Ye, Xiaofei He

MTVFL has the following key properties: (1) the vector fields we learned are close to the gradient fields of the prediction functions; (2) within each task, the vector field is required to be as parallel as possible which is expected to span a low dimensional subspace; (3) the vector fields from all tasks share a low dimensional subspace.

Multi-Task Learning

Semi-supervised Regression via Parallel Field Regularization

no code implementations NeurIPS 2011 Binbin Lin, Chiyuan Zhang, Xiaofei He

To achieve this goal, we show that the second order smoothness measures the linearity of the function, and the gradient field of a linear function has to be a parallel vector field.

