Search Results for author: Dingli Yu

Found 8 papers, 3 papers with code

Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models

no code implementations 26 Oct 2023 Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora

The paper develops a methodology for (a) designing and administering such an evaluation, and (b) automatic grading (plus spot-checking by humans) of the results using GPT-4 as well as the open LLaMA-2 70B model.

Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks

no code implementations 3 Oct 2023 Greg Yang, Dingli Yu, Chen Zhu, Soufiane Hayou

By classifying infinite-width neural networks and identifying the *optimal* limit, Tensor Programs IV and V demonstrated a universal way, called $\mu$P, for *widthwise hyperparameter transfer*, i.e., predicting optimal hyperparameters of wide neural networks from narrow ones.

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

1 code implementation 5 Nov 2022 Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora

Saliency methods compute heat maps that highlight portions of an input that were most *important* for the label assigned to it by a deep net.

A Kernel-Based View of Language Model Fine-Tuning

1 code implementation 11 Oct 2022 Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora

It has become standard to solve NLP tasks by fine-tuning pre-trained language models (LMs), especially in low-data settings.

Language Modelling

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic and Sound

no code implementations 29 Sep 2021 Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora

Saliency methods seek to provide human-interpretable explanations for the output of a machine learning model on a given input.

Enhanced Convolutional Neural Tangent Kernels

no code implementations 3 Nov 2019 Zhiyuan Li, Ruosong Wang, Dingli Yu, Simon S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora

An exact algorithm to compute CNTK (Arora et al., 2019) yielded the finding that the classification accuracy of CNTK on CIFAR-10 is within 6-7% of that of the corresponding CNN architecture (best figure being around 78%), which is interesting performance for a fixed kernel.

Data Augmentation regression

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

4 code implementations ICLR 2020 Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu

On the VOC07 testbed for few-shot image classification tasks on ImageNet with transfer learning (Goyal et al., 2019), replacing the linear SVM currently used with a Convolutional NTK SVM consistently improves performance.

Few-Shot Image Classification General Classification

Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee

no code implementations ICLR 2020 Wei Hu, Zhiyuan Li, Dingli Yu

Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data.

