no code implementations • ICML 2020 • Daniel Rothchild, Ashwinee Panda, Enayat Ullah, Nikita Ivkin, Vladimir Braverman, Joseph Gonzalez, Ion Stoica, Raman Arora
A key insight in the design of FedSketchedSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch.
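To make the linearity point concrete, here is a minimal NumPy sketch of a Count Sketch (an illustration of the general technique, not the FedSketchedSGD implementation). Because sketching is a linear map, a momentum or error-feedback buffer maintained in sketch space equals the sketch of the corresponding uncompressed buffer.

```python
import numpy as np

class CountSketch:
    """Minimal Count Sketch of d-dimensional vectors (illustrative only)."""
    def __init__(self, d, rows=5, cols=1000, seed=0):
        rng = np.random.default_rng(seed)
        self.buckets = rng.integers(0, cols, size=(rows, d))   # hash h_j(i)
        self.signs = rng.choice([-1.0, 1.0], size=(rows, d))   # sign s_j(i)
        self.rows, self.cols = rows, cols

    def sketch(self, g):
        S = np.zeros((self.rows, self.cols))
        for j in range(self.rows):
            # Each coordinate i adds s_j(i) * g[i] to bucket h_j(i).
            np.add.at(S[j], self.buckets[j], self.signs[j] * g)
        return S

cs = CountSketch(d=10_000)
g1, g2 = np.random.randn(10_000), np.random.randn(10_000)
# Linearity: S(a*g1 + b*g2) == a*S(g1) + b*S(g2), exactly. Hence a momentum
# update u <- m*u + S(g) can be carried out entirely within the sketch.
assert np.allclose(cs.sketch(0.9 * g1 + g2),
                   0.9 * cs.sketch(g1) + cs.sketch(g2))
```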
no code implementations • 10 Jan 2025 • Thanh Nguyen-Tang, Raman Arora
We study the statistical complexity of offline decision-making with function approximation, establishing (near) minimax-optimal rates for stochastic contextual bandits and Markov decision processes.
no code implementations • 1 Nov 2024 • Thanh Nguyen-Tang, Raman Arora
We study learning in a dynamically evolving environment modeled as a Markov game between a learner and a strategic opponent that can adapt to the learner's strategies.
no code implementations • 18 Mar 2024 • Haque Ishfaq, Thanh Nguyen-Tang, Songtao Feng, Raman Arora, Mengdi Wang, Ming Yin, Doina Precup
We study offline multitask representation learning in reinforcement learning (RL), where a learner is provided with an offline dataset from different tasks that share a common representation and is asked to learn the shared representation.
no code implementations • 6 Mar 2024 • Enayat Ullah, Michael Menart, Raef Bassily, Cristóbal Guzmán, Raman Arora
We also study PA-DP supervised learning with unlabeled public samples.
no code implementations • 16 Feb 2024 • Yunjuan Wang, Hussein Hazimeh, Natalia Ponomareva, Alexey Kurakin, Ibrahim Hammoud, Raman Arora
To address this challenge, we first establish a generalization bound for the adversarial target loss, which consists of (i) terms related to the loss on the data, and (ii) a measure of worst-case domain divergence.
no code implementations • 6 Jan 2024 • Thanh Nguyen-Tang, Raman Arora
This result is surprising, given that prior work suggested an unfavorable sample complexity for the RO-based algorithm compared to the VS-based algorithm, and that posterior sampling is rarely considered in offline RL due to its explorative nature.
no code implementations • 22 Nov 2023 • Michael Menart, Enayat Ullah, Raman Arora, Raef Bassily, Cristóbal Guzmán
We further show that, without assuming the KL condition, the same gradient descent algorithm can achieve fast convergence to a stationary point when the gradient stays sufficiently large during the run of the algorithm.
no code implementations • 20 Jul 2023 • Enayat Ullah, Raman Arora
We give efficient unlearning algorithms for linear and prefix-sum query classes.
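For the linear query class, the additive structure makes exact unlearning conceptually simple, as the toy class below illustrates (a hypothetical interface, purely for illustration, not the paper's algorithm): deleting a point just subtracts its contribution. Prefix-sum queries call for a slightly richer structure (e.g., a Fenwick tree) to support the same idea efficiently.

```python
class LinearQueryUnlearner:
    """Toy exact unlearning for a linear query q(S) = sum_{x in S} phi(x)."""
    def __init__(self, phi):
        self.phi = phi
        self.total = 0.0

    def add(self, x):
        self.total += self.phi(x)

    def delete(self, x):          # exact unlearning in O(1) per deletion
        self.total -= self.phi(x)

    def query(self):
        return self.total

u = LinearQueryUnlearner(phi=lambda x: x * x)
for x in [1.0, 2.0, 3.0]:
    u.add(x)
u.delete(2.0)
print(u.query())  # 10.0, as if 2.0 had never been added
```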
1 code implementation • 24 Feb 2023 • Thanh Nguyen-Tang, Raman Arora
We corroborate the statistical and computational efficiency of VIPeR with an empirical evaluation on a wide set of synthetic and real-world datasets.
no code implementations • 23 Nov 2022 • Thanh Nguyen-Tang, Ming Yin, Sunil Gupta, Svetha Venkatesh, Raman Arora
To the best of our knowledge, these are the first $\tilde{\mathcal{O}}(\frac{1}{K})$ bound and absolute zero sub-optimality bound, respectively, for offline RL with linear function approximation from adaptive data with partial coverage.
1 code implementation • 19 Aug 2022 • Jared Markowitz, Ryan W. Gardner, Ashley Llorens, Raman Arora, I-Jeng Wang
Without cost constraints, we find that pessimistic risk profiles can be used to reduce cost while improving total reward accumulation.
no code implementations • 18 Jun 2022 • Yunjuan Wang, Enayat Ullah, Poorya Mianjy, Raman Arora
Recent works show that adversarial examples exist for random neural networks [Daniely and Schacham, 2020] and that these examples can be found using a single step of gradient ascent [Bubeck et al., 2021].
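For reference, a single step of gradient ascent looks as follows in PyTorch; this is a generic FGSM-style sketch of the attack the abstract alludes to, not the specific construction analyzed in the paper, and the network and $\epsilon$ below are arbitrary placeholders.

```python
import torch
import torch.nn.functional as F

def single_step_attack(model, x, y, eps=0.1):
    """One step of sign-gradient ascent on the loss at (x, y)."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

# A random (untrained) network, as in the setting of Daniely and Schacham.
net = torch.nn.Sequential(torch.nn.Linear(784, 100),
                          torch.nn.ReLU(),
                          torch.nn.Linear(100, 10))
x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))
x_adv = single_step_attack(net, x, y)
```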
no code implementations • 2 Jun 2022 • Raman Arora, Raef Bassily, Tomás González, Cristóbal Guzmán, Michael Menart, Enayat Ullah
We provide a new efficient algorithm that finds an $\tilde{O}\big(\big[\frac{\sqrt{d}}{n\varepsilon}\big]^{2/3}\big)$-stationary point in the finite-sum setting, where $n$ is the number of samples.
no code implementations • 6 May 2022 • Raman Arora, Raef Bassily, Cristóbal Guzmán, Michael Menart, Enayat Ullah
For this case, we close the gap in the existing work and show that the optimal rate is (up to log factors) $\Theta\left(\frac{\Vert w^*\Vert}{\sqrt{n}} + \min\left\{\frac{\Vert w^*\Vert}{\sqrt{n\epsilon}},\frac{\sqrt{\text{rank}}\Vert w^*\Vert}{n\epsilon}\right\}\right)$, where $\text{rank}$ is the rank of the design matrix.
no code implementations • 29 Sep 2021 • Jared Markowitz, Ryan Gardner, Ashley Llorens, Raman Arora, I-Jeng Wang
Standard deep reinforcement learning (DRL) agents aim to maximize expected reward, considering collected experiences equally in formulating a policy.
no code implementations • 25 Feb 2021 • Enayat Ullah, Tung Mai, Anup Rao, Ryan Rossi, Raman Arora
Our key contribution is the design of corresponding efficient unlearning algorithms, which are based on constructing a (maximal) coupling of Markov chains for the noisy SGD procedure.
no code implementations • NeurIPS 2020 • Poorya Mianjy, Raman Arora
We study dropout in two-layer neural networks with rectified linear unit (ReLU) activations.
1 code implementation • NeurIPS 2020 • Jeremias Sulam, Ramchandran Muthukumar, Raman Arora
Several recent results provide theoretical insights into the phenomena of adversarial examples.
no code implementations • 15 Jul 2020 • Daniel Rothchild, Ashwinee Panda, Enayat Ullah, Nikita Ivkin, Ion Stoica, Vladimir Braverman, Joseph Gonzalez, Raman Arora
A key insight in the design of FetchSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch.
1 code implementation • 17 Jun 2020 • Zhen Zhang, Chaokun Chang, Haibin Lin, Yida Wang, Raman Arora, Xin Jin
As such, we advocate that the real challenge of distributed training is for the network community to develop high-performance network transport to fully utilize the network capacity and achieve linear scale-out.
no code implementations • 16 Jun 2020 • Raman Arora, Teodor V. Marinov, Mehryar Mohri
We study the problem of corralling stochastic bandit algorithms, that is, combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best base algorithm.
no code implementations • ICLR 2020 • Raman Arora, Peter Bartlett, Poorya Mianjy, Nathan Srebro
In deep learning, we show that the data-dependent regularizer due to dropout directly controls the Rademacher complexity of the underlying class of deep neural networks.
no code implementations • 22 Feb 2020 • Raman Arora, Teodor V. Marinov, Enayat Ullah
In this paper, we revisit the problem of private stochastic convex optimization.
no code implementations • 30 Dec 2019 • Nils Holzenberger, Raman Arora
Canonical correlation analysis (CCA) is a popular technique for learning representations that are maximally correlated across multiple views in data.
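For reference, classical linear CCA has a closed form: whiten each view and take an SVD of the cross-covariance. The sketch below is the textbook construction, not the representation-learning variants the paper studies.

```python
import numpy as np

def linear_cca(X, Y, k, reg=1e-6):
    """Textbook linear CCA: top-k maximally correlated directions
    between views X (n x dx) and Y (n x dy)."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = len(X)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    Lx, Ly = np.linalg.cholesky(Cxx), np.linalg.cholesky(Cyy)
    # Whitened cross-covariance T = Lx^{-1} Cxy Ly^{-T}; its singular
    # values are the canonical correlations.
    T = np.linalg.solve(Lx, np.linalg.solve(Ly, Cxy.T).T)
    U, s, Vt = np.linalg.svd(T)
    A = np.linalg.solve(Lx.T, U[:, :k])   # projection for view X
    B = np.linalg.solve(Ly.T, Vt[:k].T)   # projection for view Y
    return A, B, s[:k]
```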
1 code implementation • NeurIPS 2019 • Raman Arora, Teodor Vanislavov Marinov
We revisit two algorithms, matrix stochastic gradient (MSG) and $\ell_2$-regularized MSG (RMSG), that are instances of stochastic gradient descent (SGD) on a convex relaxation to principal component analysis (PCA).
no code implementations • NeurIPS 2019 • Raman Arora, Jalaj Upadhyay
In this paper, we study private sparsification of graphs.
no code implementations • NeurIPS 2019 • Raman Arora, Teodor V. Marinov, Mehryar Mohri
We give a new algorithm whose regret guarantee depends only on the domination number of the graph.
1 code implementation • 28 May 2019 • Poorya Mianjy, Raman Arora
We give a formal and complete characterization of the explicit regularizer induced by dropout in deep linear networks with squared loss.
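The single-layer case gives the flavor of such a characterization: with keep-probability $p$ and inverted scaling, the expected dropout loss equals the squared loss plus an explicit data-dependent penalty. Below is a small Monte Carlo check of that identity (illustrative only; the paper derives the analogous regularizer for deep linear networks).

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 20, 0.8                        # dimension, keep-probability
w, x, y = rng.standard_normal(d), rng.standard_normal(d), 1.0

# Monte Carlo estimate of the expected dropout loss E_m (y - w.(m*x)/p)^2.
masks = rng.random((200_000, d)) < p
drop_loss = np.mean((y - (masks * x) @ w / p) ** 2)

# Closed form: squared loss + data-dependent regularizer.
closed = (y - w @ x) ** 2 + (1 - p) / p * np.sum(w**2 * x**2)
print(drop_loss, closed)  # agree up to Monte Carlo error
```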
2 code implementations • NeurIPS 2019 • Nikita Ivkin, Daniel Rothchild, Enayat Ullah, Vladimir Braverman, Ion Stoica, Raman Arora
Large-scale distributed training of neural networks is often limited by network bandwidth, wherein the communication time overwhelms the local computation time.
no code implementations • NeurIPS 2018 • Md Enayat Ullah, Poorya Mianjy, Teodor Vanislavov Marinov, Raman Arora
We study the statistical and computational aspects of kernel principal component analysis using random Fourier features and show that under mild assumptions, $O(\sqrt{n} \log n)$ features suffice to achieve $O(1/\epsilon^2)$ sample complexity.
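A minimal sketch of the random Fourier feature map of Rahimi and Recht for the Gaussian kernel, after which kernel PCA reduces to linear PCA on the features (the bandwidth and data below are placeholders):

```python
import numpy as np

def rff_features(X, D, sigma=1.0, seed=0):
    """Random Fourier features approximating the Gaussian (RBF) kernel
    k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], D)) / sigma
    b = rng.uniform(0, 2 * np.pi, D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = np.random.randn(500, 10)
Z = rff_features(X, D=int(np.sqrt(500) * np.log(500)))  # D = O(sqrt(n) log n)
# Linear PCA on Z approximates kernel PCA on X.
Zc = Z - Z.mean(0)
_, _, Vt = np.linalg.svd(Zc, full_matrices=False)
components = Vt[:5]  # top-5 approximate kernel principal directions
```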
no code implementations • NeurIPS 2018 • Raman Arora, Vladimir Braverman, Jalaj Upadhyay
In this paper, we study the following robust low-rank matrix approximation problem: given a matrix $A \in \mathbb{R}^{n \times d}$, find a rank-$k$ matrix $B$, while satisfying differential privacy, such that $\| A - B \|_p \leq \alpha \, \mathsf{OPT}_k(A) + \tau$, where $\| M \|_p$ is the entry-wise $\ell_p$-norm and $\mathsf{OPT}_k(A) := \min_{\mathsf{rank}(X) \leq k} \| A - X \|_p$.
no code implementations • 21 Nov 2018 • Nils Holzenberger, Shruti Palaskar, Pranava Madhyastha, Florian Metze, Raman Arora
This shows it is possible to learn reliable representations across disparate, unaligned and noisy modalities, and encourages using the proposed approach on larger datasets.
no code implementations • NeurIPS 2018 • Raman Arora, Michael Dinitz, Teodor V. Marinov, Mehryar Mohri
We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other.
1 code implementation • 2 Aug 2018 • Enayat Ullah, Poorya Mianjy, Teodor V. Marinov, Raman Arora
We study the statistical and computational aspects of kernel principal component analysis using random Fourier features and show that under mild assumptions, $O(\sqrt{n} \log n)$ features suffice to achieve $O(1/\epsilon^2)$ sample complexity.
no code implementations • ICML 2018 • Teodor Vanislavov Marinov, Poorya Mianjy, Raman Arora
We study streaming algorithms for principal component analysis (PCA) in noisy settings.
no code implementations • ICML 2018 • Poorya Mianjy, Raman Arora
We revisit convex relaxation based methods for stochastic optimization of principal component analysis (PCA).
no code implementations • ICML 2018 • Poorya Mianjy, Raman Arora, Rene Vidal
Algorithmic approaches endow deep learning systems with implicit bias that helps them generalize even in over-parametrized settings.
1 code implementation • 30 Mar 2018 • Chris Paxton, Yotam Barnoy, Kapil Katyal, Raman Arora, Gregory D. Hager
In this work, we propose a neural network architecture and associated planning algorithm that (1) learns a representation of the world useful for generating prospective futures after the application of high-level actions, (2) uses this generative model to simulate the result of sequences of high-level actions in a variety of environments, and (3) uses this same representation to evaluate these actions and perform tree search to find a sequence of high-level actions in a new environment.
no code implementations • 8 Nov 2017 • Chris Paxton, Kapil Katyal, Christian Rupprecht, Raman Arora, Gregory D. Hager
Ideally, we would combine the ability of machine learning to leverage big data for learning the semantics of a task with techniques from task planning that reliably generalize to new environments.
no code implementations • NeurIPS 2017 • Raman Arora, Teodor V. Marinov, Poorya Mianjy, Nathan Srebro
We propose novel first-order stochastic approximation algorithms for canonical correlation analysis (CCA).
3 code implementations • WS 2019 • Adrian Benton, Huda Khayrallah, Biman Gujral, Dee Ann Reisinger, Sheng Zhang, Raman Arora
We present Deep Generalized Canonical Correlation Analysis (DGCCA) -- a method for learning nonlinear transformations of arbitrarily many views of data, such that the resulting transformations are maximally informative of each other.
no code implementations • 29 Dec 2016 • Xingguo Li, Junwei Lu, Raman Arora, Jarvis Haupt, Han Liu, Zhaoran Wang, Tuo Zhao
We propose a general theory for studying the landscape of nonconvex optimization with underlying symmetric structures for a class of machine learning problems (e.g., low-rank matrix factorization, phase retrieval, and deep linear neural networks).
no code implementations • ICLR 2018 • Raman Arora, Amitabh Basu, Poorya Mianjy, Anirbit Mukherjee
In this paper we investigate the family of functions representable by deep neural networks (DNN) with rectified linear units (ReLU).
no code implementations • 10 Jul 2016 • Xingguo Li, Tuo Zhao, Raman Arora, Han Liu, Mingyi Hong
In particular, we first show that for a family of quadratic minimization problems, the iteration complexity $\mathcal{O}(\log^2(p)\cdot\log(1/\epsilon))$ of the CBCD-type methods matches that of the GD methods in terms of dependence on $p$, up to a $\log^2 p$ factor.
no code implementations • NeurIPS 2016 • Peter Schulam, Raman Arora
To answer these questions, we propose the Disease Trajectory Map (DTM), a novel probabilistic model that learns low-dimensional representations of sparse and irregularly sampled time series.
no code implementations • 25 May 2016 • Xingguo Li, Haoming Jiang, Jarvis Haupt, Raman Arora, Han Liu, Mingyi Hong, Tuo Zhao
Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility.
no code implementations • 9 May 2016 • Xingguo Li, Raman Arora, Han Liu, Jarvis Haupt, Tuo Zhao
We propose a stochastic variance reduced optimization algorithm for solving sparse learning problems with cardinality constraints.
1 code implementation • NAACL 2016 • Mo Yu, Mark Dredze, Raman Arora, Matthew Gormley
Modern NLP models rely heavily on engineered features, which often combine word and contextual information into complex lexical features.
1 code implementation • 2 Feb 2016 • Weiran Wang, Raman Arora, Karen Livescu, Jeff Bilmes
We consider learning representations (features) in the setting in which we have access to multiple unlabeled views of the data for learning while only one view is available for downstream tasks.
no code implementations • 7 Oct 2015 • Weiran Wang, Raman Arora, Karen Livescu, Nathan Srebro
Deep CCA is a recently proposed deep neural network extension of traditional canonical correlation analysis (CCA), and has been successful for multi-view representation learning in several domains.
no code implementations • NeurIPS 2014 • Tuo Zhao, Mo Yu, Yiming Wang, Raman Arora, Han Liu
When the regularization function is block separable, we can solve the minimization problems in a randomized block coordinate descent (RBCD) manner.
no code implementations • NeurIPS 2013 • Raman Arora, Andrew Cotter, Nathan Srebro
We study PCA as a stochastic optimization problem and propose a novel stochastic approximation algorithm which we refer to as "Matrix Stochastic Gradient" (MSG), as well as a practical variant, Capped MSG.
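The MSG update is a projected stochastic gradient step on the convex relaxation: after each rank-one update, project the matrix iterate onto $\{M : 0 \preceq M \preceq I,\ \mathrm{tr}(M) = k\}$. Below is a minimal sketch of that update (illustrative only; Capped MSG additionally bounds the rank of the iterate).

```python
import numpy as np

def project_fantope(M, k):
    """Euclidean projection onto {0 <= M <= I, tr M = k}: shift the
    eigenvalues so that, after clipping to [0, 1], they sum to k."""
    lam, V = np.linalg.eigh(M)
    lo, hi = -1.0 - lam.max(), 1.0 - lam.min()
    for _ in range(60):               # bisection on the shift
        s = (lo + hi) / 2
        if np.clip(lam + s, 0.0, 1.0).sum() < k:
            lo = s
        else:
            hi = s
    lam = np.clip(lam + (lo + hi) / 2, 0.0, 1.0)
    return (V * lam) @ V.T

def msg(stream, d, k, eta=0.1):
    """Matrix Stochastic Gradient for k-PCA on a stream of vectors x."""
    M = np.zeros((d, d))
    for x in stream:
        M = project_fantope(M + eta * np.outer(x, x), k)
    return M
```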
no code implementations • NeurIPS 2009 • Raman Arora
An algorithm is presented for online learning of rotations.
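A generic sketch of one online update over the rotation group $SO(n)$, shown below: project the Euclidean gradient onto the skew-symmetric tangent directions and move along the geodesic via the matrix exponential. This illustrates optimization over rotations in general, not necessarily the paper's exact update; the regression loss in the usage example is an arbitrary choice.

```python
import numpy as np
from scipy.linalg import expm

def geodesic_step(R, G, eta=0.1):
    """One descent step on SO(n) for a loss with Euclidean gradient G at R."""
    A = R.T @ G
    skew = (A - A.T) / 2          # tangent direction at R
    return R @ expm(-eta * skew)  # matrix exponential keeps R exactly in SO(n)

# Example: online rotation estimation with loss ||R @ x - y||^2,
# whose Euclidean gradient with respect to R is (R @ x - y) x^T.
R = np.eye(3)
x, y = np.random.randn(3), np.random.randn(3)
R = geodesic_step(R, np.outer(R @ x - y, x))
```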