Search Results for author: Nishanth Dikkala

Found 15 papers, 4 papers with code

The Power of External Memory in Increasing Predictive Model Capacity

no code implementations31 Jan 2023 Cenk Baykal, Dylan J Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

One way of introducing sparsity into deep networks is by attaching an external table of parameters that is sparsely looked up at different layers of the network.

Language Modelling
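
A minimal sketch of the general idea in the snippet above: an external table of parameters attached to a layer, of which only a few rows are looked up per input. The table size, the hash-based lookup, and all names here are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative external memory: a large parameter table, of which only K rows
# are read per example, selected by hashing the layer's activation (assumption).
TABLE_ROWS, DIM, K, N_BITS = 4096, 64, 4, 12
table = rng.normal(scale=0.02, size=(TABLE_ROWS, DIM))    # external parameters
hash_planes = rng.normal(size=(DIM, N_BITS))               # random hyperplanes for bucketing

def sparse_memory_lookup(h):
    """Add K sparsely selected table rows to the activation h.
    Only K rows, not the whole table, participate in this forward pass."""
    bits = (h @ hash_planes > 0).astype(np.int64)
    bucket = int(bits @ (1 << np.arange(N_BITS)))
    idx = (bucket + np.arange(K)) % TABLE_ROWS              # K deterministic slots per bucket
    return h + table[idx].sum(axis=0)

h = rng.normal(size=DIM)                 # activation from some layer
print(sparse_memory_lookup(h).shape)     # (64,)
```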

A Theoretical View on Sparsely Activated Networks

no code implementations8 Aug 2022 Cenk Baykal, Nishanth Dikkala, Rina Panigrahy, Cyrus Rashtchian, Xin Wang

After representing LSH-based sparse networks with our model, we prove that sparse networks can match the approximation power of dense networks on Lipschitz functions.
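
To make "LSH-based sparse networks" concrete, here is a toy sparsely activated layer: the input is hashed with random hyperplanes and only the single expert assigned to that bucket runs. This is a sketch under assumed shapes and names, not the construction analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, N_BITS = 16, 6
N_BUCKETS = 2 ** N_BITS

planes = rng.normal(size=(DIM, N_BITS))                        # LSH hyperplanes
experts_W = rng.normal(scale=0.1, size=(N_BUCKETS, DIM, DIM))  # one tiny expert per bucket
experts_b = np.zeros((N_BUCKETS, DIM))

def lsh_bucket(x):
    bits = (x @ planes > 0).astype(int)
    return int(bits @ (2 ** np.arange(N_BITS)))

def sparse_forward(x):
    """Sparsely activated layer: route x to the one expert owning its LSH bucket."""
    b = lsh_bucket(x)
    return np.maximum(experts_W[b] @ x + experts_b[b], 0.0)    # ReLU expert

x = rng.normal(size=DIM)
x_near = x + 0.01 * rng.normal(size=DIM)
print(sparse_forward(x).shape)                 # (16,)
print(lsh_bucket(x) == lsh_bucket(x_near))     # usually True: nearby inputs share an expert
```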

Do More Negative Samples Necessarily Hurt in Contrastive Learning?

no code implementations3 May 2022 Pranjal Awasthi, Nishanth Dikkala, Pritish Kamath

Recent investigations in noise contrastive estimation suggest, both empirically as well as theoretically, that while having more "negative samples" in the contrastive loss improves downstream classification performance initially, beyond a threshold, it hurts downstream performance due to a "collision-coverage" trade-off.

Contrastive Learning
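
For context on the "negative samples" knob discussed above, a minimal numpy version of an InfoNCE-style contrastive loss with k negatives; the temperature, similarity function, and names are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Contrastive loss with one positive and k = len(negatives) negatives.
    The collision-coverage trade-off concerns how varying k affects downstream accuracy."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / temperature
    logits -= logits.max()                                    # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

rng = np.random.default_rng(2)
anchor = rng.normal(size=32)
positive = anchor + 0.1 * rng.normal(size=32)
negatives = [rng.normal(size=32) for _ in range(8)]           # k = 8 negative samples
print(info_nce_loss(anchor, positive, negatives))
```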

Statistical Estimation from Dependent Data

no code implementations20 Jul 2021 Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Surbhi Goel, Anthimos Vardis Kandiros

We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors, but dependent, capturing settings where e.g. these observations are collected on a spatial domain, a temporal domain, or a social network, which induce dependencies.

Regression, Text Classification +1

For Manifold Learning, Deep Neural Networks can be Locality Sensitive Hash Functions

1 code implementation11 Mar 2021 Nishanth Dikkala, Gal Kaplun, Rina Panigrahy

We provide theoretical and empirical evidence that neural representations can be viewed as LSH-like functions that map each input to an embedding that is a function of solely the informative $\gamma$ and invariant to $\theta$, effectively recovering the manifold identifier $\gamma$.

One-Shot Learning
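
A toy illustration of the downstream use suggested by the snippet: if the embedding depends mostly on the manifold identifier $\gamma$ and is roughly invariant to the nuisance $\theta$, one-shot classification reduces to nearest neighbours in embedding space. The data, the identity "embedding", and all names are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM, N_CLASSES = 16, 5

# Toy data: each class is a "manifold identifier" gamma (a prototype);
# inputs are gamma plus small nuisance noise theta.
prototypes = rng.normal(size=(N_CLASSES, DIM))

def embed(x):
    # Stand-in for a trained network that keeps gamma and discards theta;
    # the identity map suffices here because theta is small additive noise.
    return x

def one_shot_classify(query, support):
    """Nearest neighbour in embedding space against one labelled example per class."""
    dists = np.linalg.norm(embed(query) - np.stack([embed(s) for s in support]), axis=1)
    return int(np.argmin(dists))

support = [p + 0.05 * rng.normal(size=DIM) for p in prototypes]   # one shot per class
query = prototypes[3] + 0.05 * rng.normal(size=DIM)
print(one_shot_classify(query, support))                          # usually 3
```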

Minimax Estimation of Conditional Moment Models

1 code implementation NeurIPS 2020 Nishanth Dikkala, Greg Lewis, Lester Mackey, Vasilis Syrgkanis

We develop an approach for estimating models described via conditional moment restrictions, with a prototypical application being non-parametric instrumental variable regression.
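
A schematic version of the kind of minimax criterion such estimators optimize, for the conditional moment restriction $\mathbb{E}[y - h(x) \mid z] = 0$ (as in nonparametric IV, with instrument $z$); the test-function class $\mathcal{F}$ and the regularizer below are assumptions of this sketch rather than the paper's exact objective.

```latex
\[
\hat{h} \;=\; \arg\min_{h \in \mathcal{H}} \;\max_{f \in \mathcal{F}}
\;\frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - h(x_i)\bigr)\, f(z_i)
\;-\; \lambda\,\|f\|_{\mathcal{F}}^{2}.
\]
```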

Learning Ising models from one or multiple samples

no code implementations20 Apr 2020 Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Anthimos Vardis Kandiros

As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold.
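
For reference, the Ising model over $x \in \{-1,+1\}^n$ with interaction matrix $J$ and external field $h$, together with the structured setting mentioned in the snippet where $J$ is a (sparse) linear combination of known matrices; the notation is a standard parameterization, not copied from the paper.

```latex
\[
p_{J,h}(x) \;\propto\; \exp\!\Bigl(\sum_{i<j} J_{ij}\,x_i x_j \;+\; \sum_i h_i\,x_i\Bigr),
\qquad
J \;=\; \alpha_1 A^{(1)} + \cdots + \alpha_k A^{(k)},
\]
with $A^{(1)},\dots,A^{(k)}$ known and the weights $\alpha$ unknown.
```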

Logistic-Regression with peer-group effects via inference in higher order Ising models

no code implementations18 Mar 2020 Constantinos Daskalakis, Nishanth Dikkala, Ioannis Panageas

In this work we study extensions of these to models with higher-order sufficient statistics, modeling behavior on a social network with peer-group effects.

regression
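
A schematic of what "higher-order sufficient statistics" means here: labels $y \in \{-1,+1\}^n$ on a network with a per-node logistic-regression term plus peer-group interactions over pairs and larger tuples of connected nodes. The exact parameterization below is illustrative, not the paper's.

```latex
\[
p_{\beta}(y \mid X) \;\propto\;
\exp\!\Bigl(\sum_i y_i\, x_i^{\top}\beta_1
\;+\; \beta_2 \!\!\sum_{(i,j)\in E}\!\! y_i y_j
\;+\; \beta_3 \!\!\sum_{(i,j,k)\in T}\!\! y_i y_j y_k\Bigr),
\]
where $E$ are network edges and $T$ are (hypothetical) connected triples.
```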

Learning from weakly dependent data under Dobrushin's condition

no code implementations21 Jun 2019 Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Siddhartha Jayanti

Indeed, we show that the standard complexity measures of Gaussian and Rademacher complexities and VC dimension are sufficient measures of complexity for the purposes of bounding the generalization error and learning rates of hypothesis classes in our setting.

Generalization Bounds, Learning Theory +2
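
The familiar Rademacher-complexity bound whose form, per the snippet, continues to control generalization in this weakly dependent setting (with constants that presumably depend on Dobrushin's coefficient); stated schematically, with probability at least $1-\delta$:

```latex
\[
\sup_{f \in \mathcal{F}}\,\Bigl|\frac{1}{n}\sum_{i=1}^{n} f(z_i) \;-\; \mathbb{E}[f(z)]\Bigr|
\;\lesssim\; \mathfrak{R}_n(\mathcal{F}) \;+\; \sqrt{\frac{\log(1/\delta)}{n}}.
\]
```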

Regression from Dependent Observations

no code implementations8 May 2019 Constantinos Daskalakis, Nishanth Dikkala, Ioannis Panageas

The standard linear and logistic regression models assume that the response variables are independent, but share the same linear relationship to their corresponding vectors of covariates.

regression

HOGWILD!-Gibbs can be PanAccurate

no code implementations NeurIPS 2018 Constantinos Daskalakis, Nishanth Dikkala, Siddhartha Jayanti

Hence, the expectation of any function that is Lipschitz with respect to a power of the Hamming distance, can be estimated with a bias that grows logarithmically in $n$.
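
A minimal sequential Gibbs sampler for an Ising model, used to Monte Carlo estimate the expectation of a Hamming-Lipschitz function; the asynchronous HOGWILD! execution analyzed in the paper is not reproduced here, and the sizes and burn-in are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20
J = 0.1 * rng.normal(size=(n, n))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)

def gibbs_sweep(x):
    """One sequential sweep for p(x) proportional to exp(sum_{i<j} J_ij x_i x_j), x in {-1,+1}^n."""
    for i in range(n):
        field = J[i] @ x
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))   # P(x_i = +1 | rest)
        x[i] = 1.0 if rng.random() < p_plus else -1.0
    return x

def lipschitz_stat(x):
    # Lipschitz w.r.t. Hamming distance: flipping one spin moves the mean by at most 2/n.
    return x.mean()

x = rng.choice([-1.0, 1.0], size=n)
samples = []
for t in range(2000):
    x = gibbs_sweep(x)
    if t >= 500:                                      # crude burn-in
        samples.append(lipschitz_stat(x))
print(np.mean(samples))                               # Monte Carlo estimate of E[f(X)]
```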

From Soft Classifiers to Hard Decisions: How fair can we be?

1 code implementation3 Oct 2018 Ran Canetti, Aloni Cohen, Nishanth Dikkala, Govind Ramnarayan, Sarah Scheffler, Adam Smith

We study the feasibility of achieving various fairness properties by post-processing calibrated scores, and then show that deferring post-processors allow for more fairness conditions to hold on the final decision.

Decision Making, Fairness
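
A tiny sketch of a post-processor that turns calibrated scores into hard decisions while deferring an ambiguous band; the thresholds and return convention are illustrative, not values or definitions from the paper.

```python
import numpy as np

def deferring_postprocessor(scores, lo=0.4, hi=0.6):
    """Map calibrated scores to hard decisions, deferring the ambiguous band.
    Returns +1 (accept), 0 (defer to a downstream decision maker), or -1 (reject)."""
    scores = np.asarray(scores)
    return np.where(scores >= hi, 1, np.where(scores < lo, -1, 0))

print(deferring_postprocessor([0.10, 0.45, 0.55, 0.90]))   # [-1  0  0  1]
```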

Testing Symmetric Markov Chains from a Single Trajectory

no code implementations22 Apr 2017 Constantinos Daskalakis, Nishanth Dikkala, Nick Gravin

We initiate the study of Markov chain testing, assuming access to a single trajectory of a Markov Chain.
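
A small sketch of what one can compute from a single trajectory: empirical transition counts and a crude asymmetry statistic. The paper's actual test statistic and sample-complexity analysis are not reproduced here.

```python
import numpy as np

def empirical_transitions(traj, n_states):
    """Transition counts and row-normalized frequencies from one trajectory."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(traj[:-1], traj[1:]):
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return counts, counts / np.maximum(rows, 1)

def asymmetry_stat(counts):
    """Crude measure of how far the observed pair counts are from being symmetric."""
    return np.abs(counts - counts.T).sum() / (2 * max(counts.sum(), 1))

rng = np.random.default_rng(5)
P = np.array([[0.8, 0.2], [0.2, 0.8]])          # a symmetric 2-state chain
x, traj = 0, []
for _ in range(10_000):
    traj.append(x)
    x = rng.choice(2, p=P[x])
counts, P_hat = empirical_transitions(traj, 2)
print(P_hat.round(2), asymmetry_stat(counts))
```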

Testing Ising Models

no code implementations9 Dec 2016 Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution?
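
The question in the snippet, in miniature for two discrete coordinates: a plug-in total-variation distance between the empirical joint and the product of its marginals. The paper's test targets high-dimensional (e.g., Ising) settings; this is only an illustrative statistic.

```python
import numpy as np

def joint_vs_product_tv(samples, k):
    """Plug-in TV distance between the empirical joint of (a, b) and the product
    of its empirical marginals, for a, b in {0, ..., k-1}."""
    joint = np.zeros((k, k))
    for a, b in samples:
        joint[a, b] += 1
    joint /= len(samples)
    product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    return 0.5 * np.abs(joint - product).sum()

rng = np.random.default_rng(6)
indep = [(rng.integers(3), rng.integers(3)) for _ in range(5000)]
dep = [(a, a) for a in rng.integers(3, size=5000)]      # perfectly correlated coordinates
print(joint_vs_product_tv(indep, 3))                    # close to 0
print(joint_vs_product_tv(dep, 3))                      # bounded away from 0
```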
