no code implementations • 10 Jul 2020 • Kamil Nar, Yuan Xue, Andrew M. Dai
When training the parameters of a linear dynamical model, the gradient descent algorithm is likely to fail to converge if the squared-error loss is used as the training loss function.
1 code implementation • 4 Nov 2019 • Kamil Nar, S. Shankar Sastry
While training a neural network, the iterative optimization algorithm involved also creates an online learning problem, and consequently, correct estimation of the optimal parameters requires persistent excitation of the network weights.
no code implementations • ICLR 2019 • Kamil Nar, Orhan Ocal, S. Shankar Sastry, Kannan Ramchandran
In this work, we study the binary classification of linearly separable datasets and show that linear classifiers could also have decision boundaries that lie close to their training dataset if cross-entropy loss is used for training.
no code implementations • 24 Jan 2019 • Kamil Nar, Orhan Ocal, S. Shankar Sastry, Kannan Ramchandran
We show that differential training can ensure a large margin between the decision boundary of the neural network and the points in the training dataset.
2 code implementations • NeurIPS 2018 • Kamil Nar, S. Shankar Sastry
To elucidate the effects of the step size on the training of neural networks, we study the gradient descent algorithm as a discrete-time dynamical system. By analyzing the Lyapunov stability of different solutions, we show the relationship between the step size of the algorithm and the solutions that can be obtained with it.
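As a rough illustration of the dynamical-system view (a toy sketch, not code from the paper), gradient descent on the one-dimensional quadratic f(x) = 0.5 * c * x^2 is the linear map x -> (1 - step_size * c) * x. Its fixed point at 0 is stable only when |1 - step_size * c| < 1, that is, when step_size < 2 / c. The function names and the value of `curvature` below are assumptions chosen for the example.

```python
def gradient_descent_step(x, step_size, curvature):
    """One gradient descent update on f(x) = 0.5 * curvature * x**2."""
    gradient = curvature * x
    return x - step_size * gradient


def iterate(x0, step_size, curvature, num_steps=50):
    """Apply the update map num_steps times, starting from x0."""
    x = x0
    for _ in range(num_steps):
        x = gradient_descent_step(x, step_size, curvature)
    return x


curvature = 4.0  # assumed value; the stability threshold is 2 / curvature = 0.5

# Step size below the threshold: each step multiplies x by -0.6, so x shrinks toward 0.
x_stable = iterate(1.0, step_size=0.4, curvature=curvature)

# Step size above the threshold: each step multiplies x by -1.4, so x grows without bound.
x_unstable = iterate(1.0, step_size=0.6, curvature=curvature)

print(abs(x_stable), abs(x_unstable))
```

In this toy setting, a minimum with larger curvature requires a smaller step size to remain stable, which gives one simple sense in which the step size restricts the solutions the algorithm can settle on.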
no code implementations • 22 Mar 2018 • Kamil Nar, Shankar Sastry
While the training error of most deep neural networks degrades as the depth of the network increases, residual networks appear to be an exception.