no code implementations • 8 Feb 2023 • Tomoya Murata, Taiji Suzuki
In previous work, the best known utility bound is $\widetilde O(\sqrt{d}/(n\varepsilon_\mathrm{DP}))$ in terms of the squared full gradient norm, achieved for instance by Differentially Private Gradient Descent (DP-GD), where $n$ is the sample size, $d$ is the problem dimensionality, and $\varepsilon_\mathrm{DP}$ is the differential privacy parameter.
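As background, a minimal sketch of one DP-GD step under the standard Gaussian-mechanism recipe (per-example gradient clipping plus calibrated noise); the clipping threshold `C`, noise multiplier `sigma`, and learning rate are illustrative assumptions, not values from the paper.

```python
import numpy as np

def dp_gd_step(w, per_example_grads, lr=0.1, C=1.0, sigma=1.0, rng=np.random):
    """One sketched step of differentially private gradient descent.

    per_example_grads: (n, d) array of per-example gradients.
    C: clipping threshold; sigma: noise multiplier (illustrative values).
    """
    n, d = per_example_grads.shape
    # Clip each per-example gradient to L2 norm at most C.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, C / np.maximum(norms, 1e-12))
    # Average, then add Gaussian noise scaled to the sensitivity C / n.
    noisy_grad = clipped.mean(axis=0) + rng.normal(0.0, sigma * C / n, size=d)
    return w - lr * noisy_grad
```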
no code implementations • 1 Sep 2022 • Kazusato Oko, Shunta Akiyama, Tomoya Murata, Taiji Suzuki
While variance reduction methods have shown great success in solving large-scale optimization problems, many of them suffer from accumulated errors and therefore require periodic full gradient computations.
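For illustration, an SVRG-style sketch showing the periodic full-gradient pass that such variance reduction methods require; `grad_i` and `full_grad` are assumed user-supplied gradient oracles, and the epoch lengths are arbitrary.

```python
import numpy as np

def svrg(grad_i, full_grad, w, n, epochs=10, inner_steps=100, lr=0.01, rng=np.random):
    """SVRG sketch: each epoch recomputes the full gradient at a snapshot,
    then runs variance-reduced stochastic steps anchored to that snapshot."""
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)            # the periodic full gradient pass
        for _ in range(inner_steps):
            i = rng.randint(n)
            # Variance-reduced gradient: unbiased, with variance shrinking near w_snap.
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w = w - lr * g
    return w
```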
no code implementations • 12 Feb 2022 • Tomoya Murata, Taiji Suzuki
In recent centralized nonconvex distributed learning and federated learning, local methods are among the promising approaches for reducing communication time.
no code implementations • 5 Feb 2021 • Tomoya Murata, Taiji Suzuki
Recently, local SGD has received much attention and has been extensively studied in the distributed learning community as a way to overcome the communication bottleneck.
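A minimal local SGD sketch, assuming each worker holds a list of mini-batches and shares a gradient oracle `grad`; only the structure (several local updates per single round of parameter averaging) is meant to be faithful.

```python
import numpy as np

def local_sgd(grad, workers_data, w, rounds=50, local_steps=10, lr=0.01):
    """Local SGD sketch: each worker runs `local_steps` SGD updates on its own
    shard, and parameters are averaged only once per communication round."""
    for _ in range(rounds):
        local_models = []
        for shard in workers_data:
            w_local = w.copy()
            for batch in shard[:local_steps]:      # a few local updates
                w_local = w_local - lr * grad(w_local, batch)
            local_models.append(w_local)
        w = np.mean(local_models, axis=0)          # one communication per round
    return w
```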
no code implementations • 19 Jun 2020 • Tomoya Murata, Taiji Suzuki
In this paper, we study the importance labeling problem: given a large pool of unlabeled data, we select a limited number of examples to be labeled, and a learning algorithm is then executed on the selected subset.
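A generic select-then-train sketch of this setting; the scoring function `score` and labeling `oracle` are hypothetical placeholders, the ranking rule is a stand-in rather than the paper's importance-labeling scheme, and scikit-learn's LogisticRegression serves purely as an example learner.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def label_and_train(X_unlabeled, budget, score, oracle):
    """Select-then-train loop: score the unlabeled points, query labels for the
    top-`budget` ones, then fit a learner on the selected subset only."""
    idx = np.argsort(-score(X_unlabeled))[:budget]   # highest importance first
    X_sel = X_unlabeled[idx]
    y_sel = oracle(idx)                              # labels for the selection
    return LogisticRegression().fit(X_sel, y_sel)
```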
no code implementations • 29 May 2019 • Tomoya Murata, Taiji Suzuki
Several works have shown that the {\it sparsified} stochastic gradient descent (SGD) method with {\it error feedback} asymptotically achieves the same rate as (non-sparsified) parallel SGD.
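A minimal sketch of top-$k$ sparsified SGD with error feedback, the scheme this result refers to; the step size and $k$ are illustrative.

```python
import numpy as np

def topk(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef_sgd_step(w, g, err, k, lr=0.01):
    """One error-feedback step: re-inject the residual from previous rounds
    into the fresh gradient, transmit only its top-k part, carry the rest."""
    corrected = lr * g + err          # accumulated sparsification error added back
    sparse = topk(corrected, k)       # what would actually be communicated
    err = corrected - sparse          # residual kept locally for the next step
    return w - sparse, err
```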
no code implementations • NeurIPS 2018 • Tomoya Murata, Taiji Suzuki
We develop new stochastic gradient methods for efficiently solving sparse linear regression in a partial attribute observation setting, where learners are only allowed to observe a fixed number of actively chosen attributes per example at training and prediction times.
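To make the setting concrete, a hedged sketch of an unbiased gradient estimate for squared loss when only $k$ of the $d$ attributes per example may be observed; the two-independent-samples device is a generic construction, not necessarily the paper's estimator, and `observe` is a hypothetical attribute-access function.

```python
import numpy as np

def partial_grad_estimate(observe, w, y, d, k, rng=np.random):
    """Unbiased gradient sketch for squared loss under partial observation.
    `observe(idx)` returns the example's attributes at coordinates idx.
    Two independent coordinate samples keep the product estimate unbiased."""
    S1 = rng.choice(d, k, replace=False)
    S2 = rng.choice(d, k, replace=False)
    # Estimate the residual <x, w> - y from the first sample.
    r_hat = (d / k) * observe(S1) @ w[S1] - y
    # Build a sparse gradient estimate on the second, independent sample.
    g = np.zeros(d)
    g[S2] = r_hat * (d / k) * observe(S2)
    return g
```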
no code implementations • 26 Aug 2018 • Taiji Suzuki, Hiroshi Abe, Tomoya Murata, Shingo Horiuchi, Kotaro Ito, Tokuma Wachi, So Hirai, Masatoshi Yukishima, Tomoaki Nishimura
The concept of model compression is also important for analyzing the generalization error of deep learning, through what is known as the compression-based error bound.
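As one concrete instance of model compression, a truncated-SVD sketch that replaces an $m \times n$ weight matrix with two low-rank factors; the rank is an illustrative choice, and this is only one of several compression schemes relevant to such bounds.

```python
import numpy as np

def compress_layer(W, rank):
    """Low-rank compression: an m x n weight matrix W is replaced by two
    factors holding rank*(m+n) parameters instead of m*n."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]         # m x rank, singular values folded in
    B = Vt[:rank]                      # rank x n
    return A, B                        # A @ B approximates W
```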
no code implementations • NeurIPS 2017 • Tomoya Murata, Taiji Suzuki
In this paper, we develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings.
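A minimal sketch of the accelerated proximal-gradient template this line of work builds on, with mini-batch gradients from a user-supplied oracle `grad_batch` and a generic proximal map `prox` for the regularizer; the momentum schedule follows the standard FISTA recursion and is not specific to the paper's method.

```python
import numpy as np

def accelerated_minibatch(grad_batch, prox, w, steps=100, lr=0.01):
    """Accelerated proximal sketch: Nesterov extrapolation wrapped around a
    prox-gradient step driven by mini-batch stochastic gradients."""
    z, t = w.copy(), 1.0
    for _ in range(steps):
        w_next = prox(z - lr * grad_batch(z), lr)        # prox-gradient step
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = w_next + ((t - 1) / t_next) * (w_next - w)   # momentum extrapolation
        w, t = w_next, t_next
    return w
```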
no code implementations • 8 Mar 2016 • Tomoya Murata, Taiji Suzuki
We consider a composite convex minimization problem associated with regularized empirical risk minimization, which often arises in machine learning.
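For concreteness, a proximal-gradient (ISTA-style) sketch for one common instance of this composite problem, $\min_w f(w) + \lambda\|w\|_1$; the L1 regularizer, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(grad_f, w, lam, steps=200, lr=0.01):
    """ISTA sketch for min_w f(w) + lam * ||w||_1: a gradient step on the
    smooth part f, followed by the prox of the nonsmooth L1 part."""
    for _ in range(steps):
        w = soft_threshold(w - lr * grad_f(w), lr * lam)
    return w
```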