Hyper-Regularization: An Adaptive Choice for the Learning Rate in Gradient Descent

ICLR 2019 · Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang

We present a novel approach for adaptively selecting the learning rate in gradient descent methods. Specifically, we impose a regularization term on the learning rate via a generalized distance, and cast the joint updating process of the parameter and the learning rate into a maxmin problem...




