no code implementations • 26 Jun 2024 • Pranjal Awasthi, Nishanth Dikkala, Pritish Kamath, Raghu Meka

A core component present in many successful neural network architectures is an MLP block of two fully connected layers with a non-linear activation in between.
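The two-layer MLP block described above can be sketched in a few lines of NumPy; the dimensions and initialization scale here are illustrative, not taken from the paper:

```python
import numpy as np

def mlp_block(x, w1, b1, w2, b2):
    """Two fully connected layers with a ReLU non-linearity in between."""
    hidden = np.maximum(0.0, x @ w1 + b1)  # first linear layer + activation
    return hidden @ w2 + b2                # second linear layer

rng = np.random.default_rng(0)
d_model, d_hidden = 8, 32                  # hypothetical dimensions
x = rng.standard_normal((4, d_model))      # a batch of 4 inputs
w1 = rng.standard_normal((d_model, d_hidden)) * 0.1
b1 = np.zeros(d_hidden)
w2 = rng.standard_normal((d_hidden, d_model)) * 0.1
b2 = np.zeros(d_model)
out = mlp_block(x, w1, b1, w2, b2)
print(out.shape)  # (4, 8)
```

In transformer-style architectures this block is typically applied position-wise, with `d_hidden` several times larger than `d_model`.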

no code implementations • 13 Jun 2024 • Mehran Kazemi, Nishanth Dikkala, Ankit Anand, Petar Devic, Ishita Dasgupta, Fangyu Liu, Bahare Fatemi, Pranjal Awasthi, Dee Guo, Sreenivas Gollapudi, Ahmed Qureshi

With the continuous advancement of large language models (LLMs), it is essential to create new benchmarks to effectively evaluate their expanding capabilities and identify areas for improvement.

no code implementations • 31 Jan 2023 • Cenk Baykal, Dylan J Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

One way of introducing sparsity into deep networks is by attaching an external table of parameters that is sparsely looked up at different layers of the network.
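A minimal sketch of such an external parameter table, where each input sparsely selects a few rows to read; the selection rule below (hashing the input's sign pattern) is a hypothetical illustration, not the paper's mechanism:

```python
import numpy as np

def sparse_table_lookup(x, table, k=2):
    """Look up k rows of a large external parameter table per input.

    The row indices are derived from the input's sign pattern -- a
    hypothetical selection rule chosen only for illustration.
    """
    n_rows, _ = table.shape
    signs = (x > 0).astype(np.int64)
    # interpret the sign pattern as an integer to pick table rows
    base = int(signs @ (2 ** np.arange(len(signs))))
    idx = [(base + i) % n_rows for i in range(k)]
    return x + table[idx].sum(axis=0)  # add the looked-up parameters

rng = np.random.default_rng(1)
table = rng.standard_normal((1024, 8)) * 0.01  # large, sparsely accessed
x = rng.standard_normal(8)
y = sparse_table_lookup(x, table)
print(y.shape)  # (8,)
```

Only `k` of the 1024 rows are touched per input, so the parameter count can grow without a matching growth in per-example compute.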

no code implementations • 8 Aug 2022 • Cenk Baykal, Nishanth Dikkala, Rina Panigrahy, Cyrus Rashtchian, Xin Wang

After representing LSH-based sparse networks with our model, we prove that sparse networks can match the approximation power of dense networks on Lipschitz functions.
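A toy version of an LSH-based sparse layer, where random-hyperplane signatures decide which neurons fire; this is a simplified illustration of the idea, not the paper's construction:

```python
import numpy as np

def lsh_sparse_layer(x, w, hyperplanes, active_frac=0.25):
    """Activate only the neurons whose LSH signature best matches the input's.

    Random-hyperplane signatures select a sparse subset of units, and only
    those units are evaluated.
    """
    sig = hyperplanes @ x > 0                   # input's LSH signature
    unit_sigs = hyperplanes @ w > 0             # per-neuron signatures
    matches = (unit_sigs == sig[:, None]).mean(axis=0)
    k = max(1, int(active_frac * w.shape[1]))
    active = np.argsort(-matches)[:k]           # most-similar neurons
    out = np.zeros(w.shape[1])
    out[active] = np.maximum(0.0, x @ w[:, active])
    return out

rng = np.random.default_rng(2)
d, n_units = 16, 64
w = rng.standard_normal((d, n_units))
hyperplanes = rng.standard_normal((6, d))       # 6-bit signatures
y = lsh_sparse_layer(rng.standard_normal(d), w, hyperplanes)
print(int((y != 0).sum()) <= 16)  # at most 25% of units active
```

The approximation-power result quoted above says that, on Lipschitz targets, sparsity of this kind need not cost expressiveness relative to a dense layer.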

no code implementations • 3 May 2022 • Pranjal Awasthi, Nishanth Dikkala, Pritish Kamath

Recent investigations in noise contrastive estimation suggest, both empirically as well as theoretically, that while having more "negative samples" in the contrastive loss improves downstream classification performance initially, beyond a threshold, it hurts downstream performance due to a "collision-coverage" trade-off.
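The role of negative samples can be made concrete with a minimal InfoNCE-style contrastive loss; the temperature, dimensions, and data below are hypothetical, not taken from the paper:

```python
import numpy as np

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: pull the positive close, push negatives away.

    More negatives sharpen the softmax denominator (the 'coverage' benefit),
    but a negative that collides with the anchor's class contributes a
    spurious repulsion term (the 'collision' cost).
    """
    def sim(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    logits = np.array([sim(anchor, positive)] +
                      [sim(anchor, n) for n in negatives]) / temperature
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                    # cross-entropy on the positive

rng = np.random.default_rng(3)
anchor = rng.standard_normal(8)
positive = anchor + 0.1 * rng.standard_normal(8)
negatives = [rng.standard_normal(8) for _ in range(5)]
loss = contrastive_loss(anchor, positive, negatives)
print(loss > 0)  # True
```

Growing the `negatives` list improves coverage of the negative distribution, but once it is likely to contain same-class collisions the loss starts penalizing correct invariances, which is the trade-off the paper analyzes.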

no code implementations • 20 Jul 2021 • Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Surbhi Goel, Anthimos Vardis Kandiros

We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors, but dependent, capturing settings where, e.g., the observations are collected on a spatial domain, a temporal domain, or a social network, each of which induces dependencies.

1 code implementation • 11 Mar 2021 • Nishanth Dikkala, Gal Kaplun, Rina Panigrahy

We provide theoretical and empirical evidence that neural representations can be viewed as LSH-like functions that map each input to an embedding that is a function of solely the informative $\gamma$ and invariant to $\theta$, effectively recovering the manifold identifier $\gamma$.

1 code implementation • NeurIPS 2020 • Nishanth Dikkala, Greg Lewis, Lester Mackey, Vasilis Syrgkanis

We develop an approach for estimating models described via conditional moment restrictions, with a prototypical application being non-parametric instrumental variable regression.
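The classical linear special case of instrumental variable regression is two-stage least squares, which can be sketched as follows; the data-generating process here is a hypothetical illustration, not the paper's (non-parametric, adversarial) estimator:

```python
import numpy as np

def two_stage_least_squares(y, x, z):
    """Classical 2SLS: project the endogenous regressors x onto the
    instruments z, then regress y on the projection."""
    # Stage 1: fitted values of x given the instruments
    x_hat = z @ np.linalg.lstsq(z, x, rcond=None)[0]
    # Stage 2: regress y on the fitted values
    return np.linalg.lstsq(x_hat, y, rcond=None)[0]

rng = np.random.default_rng(4)
n = 5000
z = rng.standard_normal((n, 1))                  # instrument
u = rng.standard_normal(n)                       # unobserved confounder
x = z[:, 0] + u + 0.1 * rng.standard_normal(n)   # endogenous regressor
y = 2.0 * x + u + 0.1 * rng.standard_normal(n)   # true causal effect is 2.0
beta = two_stage_least_squares(y, x[:, None], z)
print(abs(float(beta[0]) - 2.0) < 0.2)  # True: 2SLS recovers the effect
```

Ordinary least squares of `y` on `x` would be biased upward here because of the confounder `u`; the instrument breaks that dependence.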

no code implementations • 20 Apr 2020 • Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Anthimos Vardis Kandiros

As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold.

no code implementations • 18 Mar 2020 • Constantinos Daskalakis, Nishanth Dikkala, Ioannis Panageas

In this work we study extensions of these to models with higher-order sufficient statistics, modeling behavior on a social network with peer-group effects.

no code implementations • 21 Jun 2019 • Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Siddhartha Jayanti

Indeed, we show that standard complexity measures, namely Gaussian and Rademacher complexity and VC dimension, suffice to bound the generalization error and learning rates of hypothesis classes in our setting.

no code implementations • 8 May 2019 • Constantinos Daskalakis, Nishanth Dikkala, Ioannis Panageas

The standard linear and logistic regression models assume that the response variables are independent, but share the same linear relationship to their corresponding vectors of covariates.

no code implementations • NeurIPS 2018 • Constantinos Daskalakis, Nishanth Dikkala, Siddhartha Jayanti

Hence, the expectation of any function that is Lipschitz with respect to a power of the Hamming distance, can be estimated with a bias that grows logarithmically in $n$.

1 code implementation • 3 Oct 2018 • Ran Canetti, Aloni Cohen, Nishanth Dikkala, Govind Ramnarayan, Sarah Scheffler, Adam Smith

We study the feasibility of achieving various fairness properties by post-processing calibrated scores, and then show that deferring post-processors allow for more fairness conditions to hold on the final decision.
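A deferring post-processor can be sketched as a map from a calibrated score to a decision with an abstain region; the thresholds below are hypothetical, chosen only to illustrate the interface:

```python
def deferring_post_processor(score, lo=0.35, hi=0.65):
    """Map a calibrated score in [0, 1] to a decision, deferring to a
    downstream decision-maker when the score is ambiguous.

    Thresholds lo/hi are hypothetical; the extra 'defer' outcome is what
    gives the post-processor room to satisfy more fairness conditions
    than a hard accept/reject rule.
    """
    if score >= hi:
        return "accept"
    if score <= lo:
        return "reject"
    return "defer"

for s in (0.9, 0.5, 0.1):
    print(s, deferring_post_processor(s))
# 0.9 accept / 0.5 defer / 0.1 reject
```

The point of the paper's comparison is that a plain threshold rule (no `"defer"` branch) is strictly more constrained in which fairness properties it can achieve on calibrated scores.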

1 code implementation • NeurIPS 2017 • Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

We prove near-tight concentration of measure for polynomial functions of the Ising model under high temperature.

no code implementations • 22 Apr 2017 • Constantinos Daskalakis, Nishanth Dikkala, Nick Gravin

We initiate the study of Markov chain testing, assuming access to a single trajectory of a Markov Chain.
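The basic statistic available from a single trajectory is the empirical transition matrix; a minimal sketch, using a hypothetical two-state chain:

```python
import numpy as np

def empirical_transition_matrix(trajectory, n_states):
    """Estimate a Markov chain's transition matrix from one trajectory by
    counting observed transitions -- the raw statistic a single-sample
    test would be built on."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(trajectory[:-1], trajectory[1:]):
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1                        # avoid division by zero
    return counts / rows

rng = np.random.default_rng(5)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])                     # true chain
state, traj = 0, [0]
for _ in range(20000):
    state = rng.choice(2, p=P[state])
    traj.append(state)
P_hat = empirical_transition_matrix(traj, 2)
print(np.abs(P_hat - P).max() < 0.05)  # True with high probability
```

The difficulty the paper addresses is that, unlike i.i.d. sampling, the number of visits to each state (and hence the accuracy of each row estimate) is itself governed by the unknown chain.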

no code implementations • 9 Dec 2016 • Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution?
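For small discrete distributions, a naive plug-in version of this question compares the empirical joint to the product of its empirical marginals; this is an illustrative baseline on binary coordinates, not the paper's (sample-optimal) test:

```python
import numpy as np

def product_distance(samples):
    """Empirical total-variation distance between a discrete bivariate
    distribution and the product of its marginals."""
    joint = np.zeros((2, 2))
    for a, b in samples:
        joint[a, b] += 1
    joint /= len(samples)
    product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    return 0.5 * np.abs(joint - product).sum()

rng = np.random.default_rng(6)
# independent coordinates: distance should be near zero
ind = [(rng.integers(2), rng.integers(2)) for _ in range(20000)]
# perfectly correlated coordinates: distance should be large
a = rng.integers(0, 2, 20000)
cor = list(zip(a, a))
print(product_distance(ind) < 0.05, product_distance(cor) > 0.3)
```

In high dimensions this plug-in estimate needs far too many samples, which is why dedicated independence testers with better sample complexity are the object of study.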

Papers With Code is a free resource with all data licensed under CC-BY-SA.