First-order methods for stochastic optimization have undeniable relevance, in part due to their pivotal role in machine learning.
When $\Omega$ is a Tsallis negentropy with parameter $\alpha$, we obtain "deformed exponential families," which include $\alpha$-entmax and sparsemax ($\alpha = 2$) as particular cases.
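As background for the $\alpha = 2$ case, here is a minimal NumPy sketch of the sparsemax transformation (the Euclidean projection of a score vector onto the probability simplex); the function name and implementation details are illustrative, not taken from the source:

```python
import numpy as np

def sparsemax(z):
    """Project scores z onto the probability simplex (sparsemax).

    Unlike softmax, the result can assign exactly zero
    probability to low-scoring entries.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # scores in decreasing order
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    # support size: largest k with 1 + k * z_(k) > sum of top-k scores
    k_star = k[1 + k * z_sorted > cumsum][-1]
    tau = (cumsum[k_star - 1] - 1.0) / k_star  # threshold
    return np.maximum(z - tau, 0.0)
```

For example, `sparsemax([2.0, 1.0, -1.0])` puts all mass on the first entry, whereas softmax would spread some probability over every coordinate.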
Visual attention mechanisms are a key component of neural network models for computer vision.
Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).
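To make the last example concrete, a short sketch of how softmax maps an unconstrained score vector to a categorical distribution (illustrative code, not from the source; the stability shift is a standard implementation detail):

```python
import numpy as np

def softmax(z):
    """Map real-valued scores z to a categorical distribution."""
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()
```

The output is strictly positive and sums to one, which is precisely why sparse alternatives such as sparsemax are of interest when exact zeros are desired.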
In this setting, we use a bidirectional RNN with attention for contextual rescoring and introduce a training target that uses the intersection-over-union (IoU) with the ground truth to maximize average precision (AP) for the given set of detections.
Our contribution is a communication-efficient distributed algorithm that finds a vector $x^\star$ minimizing the sum of all the functions.
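The source does not specify the algorithm, so as generic background only, here is a sketch of the simplest (non-communication-efficient) baseline for this problem: distributed gradient averaging over synthetic quadratic local objectives $f_i(x) = \tfrac12\|x - a_i\|^2$, whose sum is minimized at the mean of the $a_i$ (all names and data here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
targets = rng.normal(size=(4, 3))  # a_i for 4 simulated workers

def local_grad(i, x):
    # gradient of the local objective f_i(x) = 0.5 * ||x - a_i||^2
    return x - targets[i]

x = np.zeros(3)
for _ in range(200):
    # each worker sends its local gradient; the average drives the update
    g = np.mean([local_grad(i, x) for i in range(len(targets))], axis=0)
    x -= 0.5 * g

# x converges to the mean of the targets, the minimizer of the sum
```

Communication-efficient methods improve on this baseline by reducing how often, or how much, the workers must exchange per round.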
Optimization and Control; Information Theory