no code implementations • 24 Jan 2025 • Jing Wang, Anna Choromanska
In this paper, we provide an extensive summary of the theoretical foundations of optimization methods in DL, including presenting various methodologies, their convergence analyses, and generalization abilities.
1 code implementation • 9 Jan 2025 • Haoran Zhu, Zhenyuan Dong, Kristi Topollai, Anna Choromanska
In this paper, we present AD-L-JEPA (aka Autonomous Driving with LiDAR data via a Joint Embedding Predictive Architecture), a novel self-supervised pre-training framework for autonomous driving with LiDAR data that, as opposed to existing methods, is neither generative nor contrastive.
no code implementations • 18 May 2024 • Haoze He, Jing Wang, Anna Choromanska
This work focuses on the decentralized deep learning optimization framework.
1 code implementation • 7 Mar 2024 • Tolga Dimlioglu, Anna Choromanska
We study distributed training of deep learning models in time-constrained environments.
no code implementations • 8 Oct 2022 • Haoran Zhu, Maryam Majzoubi, Arihant Jain, Anna Choromanska
Our algorithm, which we call TAME (Task-Agnostic continual learning using Multiple Experts), automatically detects the shift in data distributions and switches between task expert networks in an online manner.
no code implementations • 26 Sep 2022 • Shihong Fang, Haoran Zhu, Devansh Bisla, Anna Choromanska, Satish Ravindran, Dongyin Ren, Ryan Wu
The core of our approach is the novel detect-then-segment method for raw radar signals.
1 code implementation • 20 Jan 2022 • Devansh Bisla, Jing Wang, Anna Choromanska
In this paper, we study the sharpness of a deep learning (DL) loss landscape around local minima in order to reveal systematic mechanisms underlying the generalization abilities of DL models.
no code implementations • 30 Nov 2021 • Yunfei Teng, Jing Wang, Anna Choromanska
Modern deep learning (DL) architectures are trained using variants of the SGD algorithm that is run with a $\textit{manually}$ defined learning rate schedule, i.e., the learning rate is dropped at the pre-defined epochs, typically when the training loss is expected to saturate.
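As an illustration of the manually defined schedule described in this snippet, the following is a minimal sketch of a step schedule that drops the learning rate at pre-defined epochs. The function name `step_lr`, the milestones (30, 60, 90), the base rate 0.1, and the decay factor 0.1 are all assumptions chosen for the example; none of them comes from the paper.

```python
def step_lr(epoch, base_lr=0.1, milestones=(30, 60, 90), gamma=0.1):
    """Learning rate for `epoch` under a hand-written step schedule.

    All default values are illustrative assumptions, not values from the paper.
    """
    # Count how many pre-defined drop epochs have already been reached.
    drops = sum(1 for m in milestones if epoch >= m)
    # Each reached milestone multiplies the rate by gamma.
    return base_lr * gamma ** drops


if __name__ == "__main__":
    for epoch in (0, 29, 30, 60, 95):
        print(epoch, step_lr(epoch))
```

The milestones here must be guessed in advance by the practitioner, which is the manual step the snippet refers to.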
no code implementations • 5 May 2021 • Devansh Bisla, Apoorva Nandini Saridena, Anna Choromanska
It is, however, unclear how to extend these measures to DNNs, and therefore the existing analyses are applicable only to simple neural networks that are not used in practice, e.g., linear or shallow ones or otherwise multi-layer perceptrons.
1 code implementation • 25 Nov 2020 • Yunfei Teng, Anna Choromanska, Murray Campbell, Songtao Lu, Parikshit Ram, Lior Horesh
We study the principal directions of the trajectory of the optimizer after convergence and show that traveling along a few top principal directions can quickly bring the parameters outside the cone but this is not the case for the remaining directions.
no code implementations • 21 Nov 2020 • Shihong Fang, Anna Choromanska
In this paper we design a backdoor attack that alters the saliency map produced by the network for an input image only when an injected trigger, invisible to the naked eye, is present, while maintaining the prediction accuracy.
no code implementations • 3 Nov 2020 • Jing Wang, Anna Choromanska
The update of the proposed method, which we refer to as Stochastic Partition Function Bound (SPFB), resembles scaled stochastic gradient descent where the scaling factor relies on a second-order term that is, however, different from the Hessian.
no code implementations • 18 Sep 2020 • Shihong Fang, Anna Choromanska
The other is the phenomenon of network overfitting to the simplest and most informative input.
no code implementations • 25 Sep 2019 • Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael I. Jordan
We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.
1 code implementation • ICML 2020 • Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael I. Jordan
We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.
1 code implementation • NeurIPS 2019 • Yunfei Teng, Wenbo Gao, Francois Chalus, Anna Choromanska, Donald Goldfarb, Adrian Weller
Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting, where we form groups of workers, each represented by its own local leader (the best performer in a group), and update each worker with a corrective direction comprised of two attractive forces: one to the local, and one to the global leader (the best performer among all workers).
no code implementations • 24 May 2019 • Maryam Majzoubi, Anna Choromanska
In this paper we develop the LdSM algorithm for the construction and training of multi-label decision trees, where in every node of the tree we optimize a novel objective function that favors balanced splits, maintains high class purity of children nodes, and allows sending examples to multiple directions but with a penalty that prevents tree over-growth.
no code implementations • 16 Feb 2019 • Devansh Bisla, Anna Choromanska, Jennifer A. Stein, David Polsky, Russell Berman
Melanoma is one of the ten most common cancers in the US.
no code implementations • 12 Nov 2018 • Naman Patel, Apoorva Nandini Saridena, Anna Choromanska, Prashanth Krishnamurthy, Farshad Khorrami
The paper proposes an on-line monitoring framework for continuous real-time safety/security in learning-based control systems (specifically, as applied to an unmanned ground vehicle).
1 code implementation • 24 Jun 2018 • Anna Choromanska, Benjamin Cowen, Sadhana Kumaravel, Ronny Luss, Mattia Rigotti, Irina Rish, Brian Kingsbury, Paolo DiAchille, Viatcheslav Gurev, Ravi Tejwani, Djallel Bouneffouf
Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function.
no code implementations • 24 May 2018 • Devansh Bisla, Anna Choromanska
In many machine learning applications, from medical diagnostics to autonomous driving, the availability of prior knowledge can be used to improve the predictive performance of learning algorithms and incorporate 'physical', 'domain knowledge', or 'common sense' concepts into training of machine learning systems as well as verify constraints/properties of the systems.
no code implementations • 13 Feb 2018 • Benjamin Cowen, Apoorva Nandini Saridena, Anna Choromanska
We propose an efficient algorithm for the generalized sparse coding (SC) inference problem.
no code implementations • 10 Feb 2018 • Yunfei Teng, Anna Choromanska, Mariusz Bojarski
However, it does not explicitly enforce $F_{BA}$ to be an inverse operation to $F_{AB}$.
no code implementations • 8 Feb 2018 • Shervin Minaee, Yao Wang, Anna Choromanska, Sohae Chung, Xiuyuan Wang, Els Fieremans, Steven Flanagan, Joseph Rath, Yvonne W. Lui
Mild traumatic brain injury is a growing public health problem with an estimated incidence of over 1.7 million people annually in the US.
8 code implementations • 25 Apr 2017 • Mariusz Bojarski, Philip Yeres, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Lawrence Jackel, Urs Muller
This eliminates the need for human engineers to anticipate what is important in an image and foresee all the necessary rules for safe driving.
4 code implementations • 16 Nov 2016 • Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Larry Jackel, Urs Muller, Karol Zieba
We furthermore justify our approach with theoretical arguments and theoretically confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction.
2 code implementations • 6 Nov 2016 • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann Lecun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina
This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.
no code implementations • 19 Oct 2016 • Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Francois Fagan, Cedric Gouy-Pailler, Anne Morvan, Nourhan Sakr, Tamas Sarlos, Jamal Atif
We consider an efficient computational framework for speeding up several machine learning algorithms with almost no loss of accuracy.
no code implementations • ICML 2017 • Yacine Jernite, Anna Choromanska, David Sontag
We consider multi-class classification where the predictor has a hierarchical structure that allows for a very large number of labels both at train and test time.
no code implementations • 17 May 2016 • Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski
We analyze the performance of the top-down multiclass classification algorithm for decision tree learning called LOMtree, recently proposed in the literature (Choromanska and Langford, 2014) for efficiently solving classification problems with a very large number of classes.
no code implementations • 16 Nov 2015 • Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann Lecun
We prove several theoretical results showing that projections via various structured matrices followed by nonlinear mappings accurately preserve the angular distance between input high-dimensional vectors.
10 code implementations • NeurIPS 2015 • Sixin Zhang, Anna Choromanska, Yann Lecun
We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to improved performance.
1 code implementation • 30 Nov 2014 • Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann Lecun
We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum.
no code implementations • 26 Oct 2014 • Apoorv Agarwal, Anna Choromanska, Krzysztof Choromanski
In this paper, we compare three initialization schemes for the KMEANS clustering algorithm: 1) random initialization (KMEANSRAND), 2) KMEANS++, and 3) KMEANSD++.
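For readers unfamiliar with the first two schemes, the following is a minimal sketch of uniform random seeding and of standard k-means++ seeding, in which each new center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The function names and the toy data are assumptions made for this example. KMEANSD++ is not sketched, because this snippet does not define it. The code assumes the input contains at least `k` distinct points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_rand_init(points, k, rng):
    """Random initialization: k distinct positions chosen uniformly."""
    return rng.sample(points, k)


def kmeans_pp_init(points, k, rng):
    """Standard k-means++ seeding (assumes at least k distinct points)."""
    # The first center is chosen uniformly at random.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from every point to its nearest chosen center.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        # Sample the next center with probability proportional to d2.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]  # toy data
    print(kmeans_rand_init(data, 2, rng))
    print(kmeans_pp_init(data, 2, rng))
```

Because already-chosen points have zero weight, k-means++ seeding does not pick the same point twice, and it tends to spread the initial centers across well-separated groups.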
no code implementations • 26 Oct 2014 • Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann Lecun
We consider supervised learning with random decision trees, where the tree construction is completely random.
no code implementations • NeurIPS 2015 • Anna Choromanska, John Langford
We develop top-down tree construction approaches for constructing logarithmic depth trees.
no code implementations • 22 Sep 2013 • Anna Choromanska, Tony Jebara
Recently a majorization method for optimizing partition functions of log-linear models was proposed alongside a novel quadratic variational upper-bound.
no code implementations • 5 Sep 2013 • Aleksandr Y. Aravkin, Anna Choromanska, Tony Jebara, Dimitri Kanevsky
Batch methods based on the quadratic bound were recently proposed for this class of problems, and performed favorably in comparison to state-of-the-art techniques.
no code implementations • NeurIPS 2012 • Tony Jebara, Anna Choromanska
The partition function plays a key role in probabilistic modeling including conditional random fields, graphical models, and maximum likelihood estimation.