A common explanation for the failure of out-of-distribution (OOD) generalization is that the model trained with empirical risk minimization (ERM) learns spurious features instead of the desired invariant features.
Such object navigation tasks usually require large-scale training in visual environments with labeled objects, which generalizes poorly to novel objects in unknown environments.
However, since the local data is inaccessible to the server under federated learning, attackers may easily poison the training data of the local client to build a backdoor in the agent without notice.
Building a conversational embodied agent to execute real-life tasks has been a long-standing yet quite challenging research goal, as it requires effective human-agent communication, multi-modal understanding, long-range sequential decision making, etc.
In this paper, we resolve this issue and derive the first high-probability bounds for the private stochastic method with clipping.
The AutoAttack (AA) has been the most reliable method to evaluate adversarial robustness when considerable computational resources are available.
Recently, there has been a growing surge of interest in enabling machine learning systems to generalize well to Out-of-Distribution (OOD) data.
Data privacy is a central problem for embodied agents that can perceive the environment, communicate with humans, and act in the real world.
We show that stochastic acceleration can be achieved under the perturbed iterate framework (Mania et al., 2017) in asynchronous lock-free optimization, which leads to the optimal incremental gradient complexity for finite-sum objectives.
However, when tested on attacks different from the given attack simulated in training, the robustness may drop significantly (e. g., even worse than no reweighting).
In convex optimization, the problem of finding near-stationary points has not been adequately studied yet, unlike other optimality measures such as the function value.
Specifically, instead of tackling the original objective directly, we construct a shifted objective function that has the same minimizer as the original objective and encodes both the smoothness and strong convexity of the original objective in an interpolation condition.
Edit-distance-based string similarity search has many applications such as spell correction, data de-duplication, and sequence alignment.
In particular, at the high compression ratio end, HSQ provides a low per-iteration communication cost of $O(\log d)$, which is favorable for federated learning.
Stochastic Gradient Descent (SGD) with Nesterov's momentum is a widely used optimizer in deep learning, which is observed to have excellent generalization performance.
Recently, locality sensitive hashing (LSH) was shown to be effective for MIPS and several algorithms including $L_2$-ALSH, Sign-ALSH and Simple-LSH have been proposed.
This paper proposes an accelerated proximal stochastic variance reduced gradient (ASVRG) method, in which we design a simple and effective momentum acceleration trick.
Recent years have witnessed exciting progress in the study of stochastic variance reduced gradient methods (e. g., SVRG, SAGA), their accelerated variants (e. g, Katyusha) and their extensions in many different settings (e. g., online, sparse, asynchronous, distributed).
In this paper, we propose a simple variant of the original SVRG, called variance reduced stochastic gradient descent (VR-SGD).
In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of stochastic variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct.