Compared with naive parallel-chain SGLD, which updates multiple particles independently, ensemble methods update the particles with interactions among them.
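To make the contrast concrete, below is a minimal sketch (all function names are ours, and the target is assumed to be a standard Gaussian) of the two update styles. The interacting update follows Stein variational gradient descent (SVGD), used here only as one representative interacting-particle scheme, not necessarily the ensemble method analyzed in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_p(theta):
    # Score of a standard Gaussian target; stands in for -grad U(theta).
    return -theta

def parallel_sgld_step(thetas, eps=1e-2):
    # Naive parallel chains: each particle takes an independent
    # Langevin step (gradient plus injected Gaussian noise).
    noise = rng.standard_normal(thetas.shape)
    return thetas + eps * grad_log_p(thetas) + np.sqrt(2 * eps) * noise

def svgd_step(thetas, eps=1e-1, h=0.5):
    # Interacting update: each particle is driven by all particles'
    # gradients (kernel-weighted attraction) plus a kernel-gradient
    # term that pushes particles apart (repulsion).
    diff = thetas[:, None] - thetas[None, :]   # pairwise differences (n, n)
    k = np.exp(-diff**2 / (2 * h**2))          # RBF kernel matrix
    grad_k = -diff / h**2 * k                  # d k(x_j, x_i) / d x_j
    phi = (k @ grad_log_p(thetas) + grad_k.sum(axis=0)) / len(thetas)
    return thetas + eps * phi

thetas = rng.standard_normal(50) * 3
ind, ens = thetas.copy(), thetas.copy()
for _ in range(500):
    ind = parallel_sgld_step(ind)
    ens = svgd_step(ens)
print(ind.var(), ens.var())  # both should move toward the target variance 1
```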
Existing analyses are limited to the Bayesian setting, which assumes a correctly specified model and an exact Bayesian posterior distribution.
First, we provide a new second-order Jensen inequality, which includes a repulsion term based on the loss function.
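For intuition, a generic second-order Jensen inequality (our illustration, not the paper's loss-based statement) strengthens the usual bound with a variance term. If $f$ is twice differentiable with $f'' \le -c < 0$, a Taylor expansion with Lagrange remainder around $\mu = \mathbb{E}[X]$ gives:

```latex
\begin{align}
f(X) &= f(\mu) + f'(\mu)(X - \mu) + \tfrac{1}{2} f''(\xi)(X - \mu)^2 \\
     &\le f(\mu) + f'(\mu)(X - \mu) - \tfrac{c}{2}(X - \mu)^2,
\end{align}
% and taking expectations yields the variance-strengthened bound
\begin{equation}
\mathbb{E}[f(X)] \le f\big(\mathbb{E}[X]\big) - \frac{c}{2}\,\mathrm{Var}(X).
\end{equation}
```

In ensemble analyses of this flavor, a $\mathrm{Var}(X)$-type term plays the role of the repulsion term: it quantifies the benefit of diversity among the ensemble members' predictions.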
If the black-box function varies with time, time-varying Bayesian optimization is a promising framework for tracking its optimum.
Another example is the Stein points (SP) method, which directly minimizes the kernelized Stein discrepancy.
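As a concrete illustration, here is a minimal 1-D sketch of the greedy Stein points idea (names are ours; the SP literature considers several kernels and search strategies, whereas this sketch fixes an RBF base kernel, a standard Gaussian target, and a candidate grid). Each new point is chosen to minimize the kernel Stein discrepancy of the augmented point set.

```python
import numpy as np

def score(x):
    # Score d/dx log p(x) of a standard Gaussian target.
    return -x

def stein_kernel(x, y, h=1.0):
    # Langevin Stein kernel for an RBF base kernel in 1-D:
    # k_p(x,y) = dxdy k + s(x) dy k + s(y) dx k + s(x) s(y) k.
    d = x - y
    k = np.exp(-d**2 / (2 * h**2))
    dxk = -d / h**2 * k
    dyk = d / h**2 * k
    dxdyk = (1.0 / h**2 - d**2 / h**4) * k
    return dxdyk + score(x) * dyk + score(y) * dxk + score(x) * score(y) * k

def stein_points(n, grid):
    # Greedy step: adding x to {x_1,...,x_m} changes the squared KSD by
    # a term proportional to k_p(x,x) + 2 * sum_i k_p(x_i, x), so we
    # pick the candidate minimizing that quantity.
    points = []
    for _ in range(n):
        objective = stein_kernel(grid, grid)
        for xi in points:
            objective = objective + 2 * stein_kernel(xi, grid)
        points.append(grid[np.argmin(objective)])
    return np.array(points)

pts = stein_points(20, np.linspace(-4, 4, 801))
print(np.sort(pts))  # points spread out to cover the Gaussian target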
In this paper, based on Zellner's optimization-based, variational formulation of Bayesian inference, we propose an outlier-robust pseudo-Bayesian variational method by replacing the Kullback-Leibler divergence used for data fitting with a robust divergence such as the beta- or gamma-divergence.
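To sketch the construction (our notation, not necessarily the paper's): in Zellner's formulation the exact posterior solves a variational problem whose data-fit term is the expected negative log-likelihood, and the robust variant swaps that term for, e.g., the beta-divergence (density-power) cross-entropy, which bounds the influence of low-likelihood outliers.

```latex
% Zellner's variational formulation: the Bayesian posterior minimizes
\begin{equation}
q^\ast = \operatorname*{arg\,min}_{q}\;
  \mathbb{E}_{q(\theta)}\!\Big[\textstyle\sum_{i=1}^n -\log p(x_i \mid \theta)\Big]
  + \mathrm{KL}\big(q \,\|\, \pi\big),
\end{equation}
% where \pi is the prior. A beta-divergence variant replaces the negative
% log-likelihood with the density-power cross-entropy
\begin{equation}
\ell_\beta(x_i; \theta) =
  -\frac{1}{\beta}\, p(x_i \mid \theta)^{\beta}
  + \frac{1}{1+\beta} \int p(y \mid \theta)^{1+\beta}\, \mathrm{d}y,
\end{equation}
% which recovers -log p(x_i | theta), up to constants, as beta -> 0.
```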
Exponential family distributions are highly useful in machine learning because computations with them can be performed efficiently through their natural parameters.
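As a small illustration (function names are ours): a univariate Gaussian $\mathcal{N}(\mu, \sigma^2)$ has natural parameters $\eta = (\mu/\sigma^2,\, -1/(2\sigma^2))$, so multiplying two Gaussian densities, as in a conjugate Bayesian update, reduces to adding their natural parameters.

```python
import numpy as np

def to_natural(mu, var):
    # Natural parameters of N(mu, var): eta1 = mu/var, eta2 = -1/(2*var).
    return np.array([mu / var, -1.0 / (2.0 * var)])

def from_natural(eta):
    # Invert: var = -1/(2*eta2), mu = eta1 * var.
    var = -1.0 / (2.0 * eta[1])
    return eta[0] * var, var

# Multiplying two Gaussian densities (e.g., prior times likelihood, up to
# normalization) is simple addition in natural-parameter space.
prior = to_natural(mu=0.0, var=4.0)
likelihood = to_natural(mu=2.0, var=1.0)
posterior_mu, posterior_var = from_natural(prior + likelihood)
print(posterior_mu, posterior_var)  # 1.6, 0.8
```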