2 code implementations • NeurIPS 2018 • Kamil Nar, S. Shankar Sastry
To elucidate the effect of the step size on the training of neural networks, we study the gradient descent algorithm as a discrete-time dynamical system and, by analyzing the Lyapunov stability of different solutions, characterize the relationship between the step size and the set of solutions that the algorithm can converge to.
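The underlying mechanism can be illustrated with standard linearization: gradient descent is the discrete-time map x_{k+1} = x_k − η∇f(x_k), and a minimum x* is a stable fixed point of that map only if η < 2/λ_max(∇²f(x*)), so larger step sizes rule out sharper minima. A minimal sketch, with purely illustrative curvatures and step sizes (not taken from the paper):

```python
import numpy as np

def gd_fixed_point_stable(curvature, step_size):
    """A minimum of a smooth loss is a stable fixed point of the gradient-descent
    map x -> x - step_size * f'(x) iff |1 - step_size * f''(x*)| < 1,
    i.e. step_size < 2 / f''(x*)."""
    return abs(1.0 - step_size * curvature) < 1.0

# Two hypothetical minima of a training loss: a sharp one and a flat one.
sharp_curvature, flat_curvature = 100.0, 1.0

for eta in (0.005, 0.05, 0.5):
    print(f"step size {eta}: sharp minimum stable -> "
          f"{gd_fixed_point_stable(sharp_curvature, eta)}, "
          f"flat minimum stable -> {gd_fixed_point_stable(flat_curvature, eta)}")

# Simulating gradient descent on the sharp quadratic f(x) = 50 * x**2 with
# eta = 0.05 shows the predicted instability: the iterate grows like (-4)^k.
x = 0.1
for _ in range(10):
    x = x - 0.05 * (100.0 * x)   # f'(x) = 100 x
print("iterate after 10 steps:", x)
```

With the larger step sizes only the flat minimum remains a stable fixed point, which is the sense in which the step size selects among the solutions reachable by gradient descent.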
1 code implementation • 4 Nov 2019 • Kamil Nar, S. Shankar Sastry
Training a neural network with an iterative optimization algorithm implicitly creates an online learning problem; consequently, correct estimation of the optimal parameters requires persistent excitation of the network weights.
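Persistent excitation is the classical condition from adaptive control and online identification: over every window of time, the regressors seen by the estimator must span all directions of the parameter space. A minimal sketch of checking the condition for a stream of regressor vectors; the window length and threshold are arbitrary choices for illustration, not values from the paper:

```python
import numpy as np

def is_persistently_exciting(regressors, window, alpha):
    """Check the standard persistent-excitation condition: over every window of
    consecutive regressors phi_t, the Gram matrix sum_t phi_t phi_t^T must be
    bounded below by alpha * I (its smallest eigenvalue must exceed alpha)."""
    regressors = np.asarray(regressors)
    for start in range(len(regressors) - window + 1):
        block = regressors[start:start + window]
        gram = block.T @ block            # sum of phi_t phi_t^T over the window
        if np.linalg.eigvalsh(gram)[0] < alpha:
            return False
    return True

# A constant regressor never excites the direction orthogonal to it,
# while a rotating regressor excites every direction.
constant = [np.array([1.0, 0.0]) for _ in range(20)]
rotating = [np.array([np.cos(0.3 * t), np.sin(0.3 * t)]) for t in range(20)]
print(is_persistently_exciting(constant, window=10, alpha=0.1))   # False
print(is_persistently_exciting(rotating, window=10, alpha=0.1))   # True
```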
no code implementations • 22 Mar 2018 • Kamil Nar, Shankar Sastry
While the training error of most deep neural networks increases with network depth, residual networks appear to be an exception.
no code implementations • ICLR 2019 • Kamil Nar, Orhan Ocal, S. Shankar Sastry, Kannan Ramchandran
In this work, we study the binary classification of linearly separable datasets and show that even linear classifiers can have decision boundaries that lie close to their training data points when the cross-entropy loss is used for training.
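The quantity at issue is the margin: the distance from the learned decision boundary to the nearest training point. A minimal sketch of measuring it for a linear classifier trained with the cross-entropy (logistic) loss; the toy dataset and the plain gradient-descent loop are illustrative, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated 2-D clusters with labels y in {-1, +1}.
X = np.vstack([rng.normal([2.0, 0.0], 0.3, size=(20, 2)),
               rng.normal([-2.0, 0.0], 0.3, size=(20, 2))])
y = np.concatenate([np.ones(20), -np.ones(20)])

# Train a linear classifier w (no bias) by gradient descent on the
# logistic / cross-entropy loss  mean_i log(1 + exp(-y_i w^T x_i)).
w = np.zeros(2)
for _ in range(2000):
    margins = y * (X @ w)
    grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= 0.1 * grad

# Margin of the learned boundary: smallest distance from a training point
# to the hyperplane w^T x = 0.
distances = np.abs(X @ w) / np.linalg.norm(w)
print("distance from boundary to closest training point:", distances.min())
```

Comparing this distance with the maximum achievable margin for the same dataset is the kind of check the paper's claim concerns.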
no code implementations • 24 Jan 2019 • Kamil Nar, Orhan Ocal, S. Shankar Sastry, Kannan Ramchandran
We show that differential training can ensure a large margin between the decision boundary of the neural network and the points in the training dataset.
no code implementations • 10 Jul 2020 • Kamil Nar, Yuan Xue, Andrew M. Dai
When training the parameters of a linear dynamical model, the gradient descent algorithm is likely to fail to converge if the squared-error loss is used as the training loss function.
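A minimal sketch of the difficulty on a scalar model x_{t+1} = a x_t; the system, horizon, and parameter values are illustrative, not from the paper. When the true system is unstable (|a| > 1), the squared simulation error and its gradient grow exponentially with the horizon, so no fixed step size works across the parameter range:

```python
import numpy as np

# Toy setting: scalar linear system x_{t+1} = a_true * x_t with an unstable
# true parameter, fitted by minimizing the squared simulation error
# L(a) = sum_t (x0 * a^t - x_t)^2 with gradient descent.
a_true, x0, horizon = 1.5, 1.0, 20
trajectory = x0 * a_true ** np.arange(horizon + 1)   # observed states x_0, ..., x_T

def squared_error_grad(a):
    """Gradient of sum_t (x0 * a^t - x_t)^2 with respect to the parameter a."""
    t = np.arange(1, horizon + 1)
    predictions = x0 * a ** t
    return np.sum(2.0 * (predictions - trajectory[1:]) * t * x0 * a ** (t - 1))

# The gradient spans many orders of magnitude over a small parameter range:
for a in (1.1, 1.3, 1.7, 2.0):
    print(f"a = {a}: |dL/da| = {abs(squared_error_grad(a)):.2e}")

# With gradients this disparate, any fixed step size either diverges for
# iterates far from a_true or makes negligible progress close to it,
# which is the failure mode described in the abstract.
```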