Search Results for author: Johan S. Wind

Found 3 papers, 2 papers with code

Asymmetric matrix sensing by gradient descent with small random initialization

no code implementations • 4 Sep 2023 • Johan S. Wind

The dynamics of gradient descent for matrix sensing can be reduced to this formulation, yielding a novel proof of asymmetric matrix sensing with factorized gradient descent.

Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks

1 code implementation • 13 Jul 2023 • Johan S. Wind, Vegard Antun, Anders C. Hansen

In this work we provide sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks (DLNs) in the over-parameterized regression setting and, potentially surprisingly, link this to the phenomenon of phase transitions in generalized hardness of approximation (GHA).

RWKV: Reinventing RNNs for the Transformer Era

5 code implementations • 22 May 2023 • Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Stella Biderman, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Jiaju Lin, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Guangyu Song, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanislaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu

This work presents a significant step towards reconciling trade-offs between computational efficiency and model performance in sequence processing tasks.

Ranked #22 on Natural Language Inference on WNLI

Computational Efficiency, Natural Language Inference
