2 code implementations • 2 May 2023 • Jialin Mao, Itay Griniasty, Han Kheng Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, Pratik Chaudhari
We develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training.
2 code implementations • 31 Oct 2022 • Rahul Ramesh, Jialin Mao, Itay Griniasty, Rubing Yang, Han Kheng Teoh, Mark Transtrum, James P. Sethna, Pratik Chaudhari
We develop information-geometric techniques to understand the representations learned by deep networks when they are trained on different tasks using supervised, meta-, semi-supervised and contrastive learning.
1 code implementation • 21 May 2022 • Zhiqi Bu, Jialin Mao, Shiyun Xu
Large convolutional neural networks (CNNs) can be difficult to train in the differentially private (DP) regime, since the optimization algorithms require a computationally expensive operation known as per-sample gradient clipping.
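As context for the operation named above, here is a minimal NumPy sketch of per-sample gradient clipping as used in DP-SGD-style training (an illustration of the general idea, not this paper's method; the function name and parameters are hypothetical):

```python
import numpy as np

def clipped_noisy_gradient(per_sample_grads, clip_norm=1.0, noise_mult=1.0, rng=None):
    """DP-SGD-style aggregation: rescale each per-sample gradient so its L2 norm
    is at most `clip_norm`, sum the clipped gradients, then add Gaussian noise
    with standard deviation `noise_mult * clip_norm`."""
    rng = np.random.default_rng(0) if rng is None else rng
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale            # each row now has norm <= clip_norm
    noise = rng.normal(0.0, noise_mult * clip_norm, size=per_sample_grads.shape[1])
    return clipped.sum(axis=0) + noise

# Two per-sample gradients with norms 5.0 and 0.5; noise disabled for clarity.
grads = np.array([[3.0, 4.0], [0.3, 0.4]])
g = clipped_noisy_gradient(grads, clip_norm=1.0, noise_mult=0.0)
# → [0.9, 1.2]: the first row is scaled down to norm 1, the second is unchanged
```

The cost the abstract refers to comes from needing each sample's gradient individually (the `per_sample_grads` rows) before the sum, rather than the single batch-averaged gradient that standard backpropagation produces.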
1 code implementation • 27 Oct 2021 • Rubing Yang, Jialin Mao, Pratik Chaudhari
This structure is mirrored in a network trained on this data: we show that the Hessian and the Fisher Information Matrix (FIM) have eigenvalues that are spread uniformly over exponentially large ranges.
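To make the FIM concrete, here is a toy sketch (not the paper's experiment) that builds the Fisher Information Matrix of a small softmax model on random data and inspects its eigenvalue spread; all sizes and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 5, 3                          # samples, input dim, classes
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, k))

# Softmax probabilities for logits = X @ W.
logits = X @ W
p = np.exp(logits - logits.max(axis=1, keepdims=True))
p /= p.sum(axis=1, keepdims=True)

# FIM = E_x E_{y~p} [ grad log p(y|x) grad log p(y|x)^T ], w.r.t. flattened W.
F = np.zeros((d * k, d * k))
for i in range(n):
    for y in range(k):
        e = -p[i].copy(); e[y] += 1.0        # d log p(y|x) / d logits
        g = np.outer(X[i], e).ravel()        # chain rule through logits = X @ W
        F += p[i, y] * np.outer(g, g)
F /= n

eig = np.linalg.eigvalsh(F)
eig = eig[eig > 1e-12]                       # drop near-zero modes
print(eig.max() / eig.min())                 # condition number of the nonzero spectrum
```

Even in toy models the nonzero FIM eigenvalues typically span several orders of magnitude; the paper's claim concerns how uniformly they fill that exponentially large range.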
no code implementations • ICLR 2018 • Sean Welleck, Zixin Yao, Yu Gai, Jialin Mao, Zheng Zhang, Kyunghyun Cho
In this paper, we propose a novel multiset loss function by viewing this problem from the perspective of sequential decision making.
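As a rough illustration of the sequential-decision view (a sketch of the general idea, not the paper's exact loss; the function and variable names are hypothetical), one step can compare the model's class distribution against an oracle that is uniform over the target items not yet predicted:

```python
import numpy as np
from collections import Counter

def multiset_step_loss(logits, remaining):
    """One decoding step: cross-entropy between the model's softmax over classes
    and an oracle distribution that is uniform over the multiset of target items
    still remaining (`remaining` maps class index -> leftover count)."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    total = sum(remaining.values())
    loss = 0.0
    for c, cnt in remaining.items():
        loss -= (cnt / total) * np.log(p[c] + 1e-12)
    return loss

# With uniform logits over 3 classes and one of each item left, the loss is log 3.
loss = multiset_step_loss(np.zeros(3), Counter({0: 1, 1: 1, 2: 1}))
```

After each step, the predicted item is removed from `remaining`, so the oracle target shifts as decoding proceeds; summing the per-step losses gives a loss over the whole multiset.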
no code implementations • NeurIPS 2017 • Sean Welleck, Jialin Mao, Kyunghyun Cho, Zheng Zhang
Humans process visual scenes selectively and sequentially using attention.