Search Results for author: Anna Choromanska

Found 36 papers, 11 papers with code

GRAWA: Gradient-based Weighted Averaging for Distributed Training of Deep Learning Models

1 code implementation • 7 Mar 2024 • Tolga Dimlioglu, Anna Choromanska

We study distributed training of deep learning models in time-constrained environments.

TAME: Task Agnostic Continual Learning using Multiple Experts

no code implementations • 8 Oct 2022 • Haoran Zhu, Maryam Majzoubi, Arihant Jain, Anna Choromanska

Our algorithm, which we call TAME (Task-Agnostic continual learning using Multiple Experts), automatically detects the shift in data distributions and switches between task expert networks in an online manner.

Continual Learning

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape

1 code implementation • 20 Jan 2022 • Devansh Bisla, Jing Wang, Anna Choromanska

In this paper, we study the sharpness of a deep learning (DL) loss landscape around local minima in order to reveal systematic mechanisms underlying the generalization abilities of DL models.

AutoDrop: Training Deep Learning Models with Automatic Learning Rate Drop

no code implementations • 30 Nov 2021 • Yunfei Teng, Jing Wang, Anna Choromanska

Modern deep learning (DL) architectures are trained using variants of the SGD algorithm that is run with a $\textit{manually}$ defined learning rate schedule, i.e., the learning rate is dropped at pre-defined epochs, typically when the training loss is expected to saturate.

A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs

no code implementations • 5 May 2021 • Devansh Bisla, Apoorva Nandini Saridena, Anna Choromanska

It is, however, unclear how to extend these measures to DNNs and therefore the existing analyses are applicable to simple neural networks, which are not used in practice, e.g., linear or shallow ones or otherwise multi-layer perceptrons.

Autonomous Driving

Overcoming Catastrophic Forgetting via Direction-Constrained Optimization

1 code implementation • 25 Nov 2020 • Yunfei Teng, Anna Choromanska, Murray Campbell, Songtao Lu, Parikshit Ram, Lior Horesh

We study the principal directions of the trajectory of the optimizer after convergence and show that traveling along a few top principal directions can quickly bring the parameters outside the cone but this is not the case for the remaining directions.

Continual Learning

Backdoor Attacks on the DNN Interpretation System

no code implementations • 21 Nov 2020 • Shihong Fang, Anna Choromanska

In this paper we design a backdoor attack that alters the saliency map produced by the network for an input image only when an injected trigger, invisible to the naked eye, is present, while maintaining the prediction accuracy.

Backdoor Attack

SGB: Stochastic Gradient Bound Method for Optimizing Partition Functions

no code implementations • 3 Nov 2020 • Jing Wang, Anna Choromanska

The update of the proposed method, that we refer to as Stochastic Partition Function Bound (SPFB), resembles scaled stochastic gradient descent where the scaling factor relies on a second order term that is however different from the Hessian.

Multi-modal Experts Network for Autonomous Driving

no code implementations • 18 Sep 2020 • Shihong Fang, Anna Choromanska

The other is the phenomenon of the network overfitting to the simplest and most informative input.

Autonomous Driving

Behavior-Guided Reinforcement Learning

no code implementations • 25 Sep 2019 • Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael I. Jordan

We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.

reinforcement-learning, Reinforcement Learning (RL)

Learning to Score Behaviors for Guided Policy Optimization

1 code implementation • ICML 2020 • Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Anna Choromanska, Krzysztof Choromanski, Michael I. Jordan

We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space.

Efficient Exploration, Imitation Learning, +2

LdSM: Logarithm-depth Streaming Multi-label Decision Trees

no code implementations • 24 May 2019 • Maryam Majzoubi, Anna Choromanska

In this paper we develop the LdSM algorithm for the construction and training of multi-label decision trees, where in every node of the tree we optimize a novel objective function that favors balanced splits, maintains high class purity of children nodes, and allows sending examples to multiple directions but with a penalty that prevents tree over-growth.

General Classification, Multi-Label Classification

Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension

1 code implementation • NeurIPS 2019 • Yunfei Teng, Wenbo Gao, Francois Chalus, Anna Choromanska, Donald Goldfarb, Adrian Weller

Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting, where we form groups of workers, each represented by its own local leader (the best performer in a group), and update each worker with a corrective direction comprised of two attractive forces: one to the local, and one to the global leader (the best performer among all workers).

Distributed Optimization

Adversarial Learning-Based On-Line Anomaly Monitoring for Assured Autonomy

no code implementations • 12 Nov 2018 • Naman Patel, Apoorva Nandini Saridena, Anna Choromanska, Prashanth Krishnamurthy, Farshad Khorrami

The paper proposes an on-line monitoring framework for continuous real-time safety/security in learning-based control systems (specifically, with application to an unmanned ground vehicle).

Anomaly Detection, Generative Adversarial Network, +1

Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

1 code implementation • 24 Jun 2018 • Anna Choromanska, Benjamin Cowen, Sadhana Kumaravel, Ronny Luss, Mattia Rigotti, Irina Rish, Brian Kingsbury, Paolo DiAchille, Viatcheslav Gurev, Ravi Tejwani, Djallel Bouneffouf

Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function.

VisualBackProp for learning using privileged information with CNNs

no code implementations • 24 May 2018 • Devansh Bisla, Anna Choromanska

In many machine learning applications, from medical diagnostics to autonomous driving, the availability of prior knowledge can be used to improve the predictive performance of learning algorithms and incorporate `physical,' `domain knowledge,' or `common sense' concepts into training of machine learning systems as well as verify constraints/properties of the systems.

Autonomous Driving, BIG-bench Machine Learning, +4

LSALSA: Accelerated Source Separation via Learned Sparse Coding

no code implementations • 13 Feb 2018 • Benjamin Cowen, Apoorva Nandini Saridena, Anna Choromanska

We propose an efficient algorithm for the generalized sparse coding (SC) inference problem.

Invertible Autoencoder for domain adaptation

no code implementations • 10 Feb 2018 • Yunfei Teng, Anna Choromanska, Mariusz Bojarski

However, it does not explicitly enforce $F_{BA}$ to be an inverse operation to $F_{AB}$.

Autonomous Driving, Domain Adaptation, +2

A Deep Unsupervised Learning Approach Toward MTBI Identification Using Diffusion MRI

no code implementations • 8 Feb 2018 • Shervin Minaee, Yao Wang, Anna Choromanska, Sohae Chung, Xiuyuan Wang, Els Fieremans, Steven Flanagan, Joseph Rath, Yvonne W. Lui

Mild traumatic brain injury is a growing public health problem with an estimated incidence of over 1.7 million people annually in the US.

Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car

8 code implementations • 25 Apr 2017 • Mariusz Bojarski, Philip Yeres, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Lawrence Jackel, Urs Muller

This eliminates the need for human engineers to anticipate what is important in an image and foresee all the necessary rules for safe driving.

Autonomous Driving, Self-Driving Cars

VisualBackProp: efficient visualization of CNNs

4 code implementations • 16 Nov 2016 • Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Larry Jackel, Urs Muller, Karol Zieba

We furthermore justify our approach with theoretical arguments and theoretically confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction.

Self-Driving Cars

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

2 code implementations • 6 Nov 2016 • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.

Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation

no code implementations • ICML 2017 • Yacine Jernite, Anna Choromanska, David Sontag

We consider multi-class classification where the predictor has a hierarchical structure that allows for a very large number of labels both at train and test time.

Density Estimation, General Classification, +4

On the boosting ability of top-down decision tree learning algorithm for multiclass classification

no code implementations • 17 May 2016 • Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski

We analyze the performance of the top-down multiclass classification algorithm for decision tree learning called LOMtree, recently proposed in the literature (Choromanska and Langford, 2014) for efficiently solving classification problems with a very large number of classes.

General Classification

Binary embeddings with structured hashed projections

no code implementations • 16 Nov 2015 • Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann LeCun

We prove several theoretical results showing that projections via various structured matrices followed by nonlinear mappings accurately preserve the angular distance between input high-dimensional vectors.

LEMMA

Deep learning with Elastic Averaging SGD

10 code implementations • NeurIPS 2015 • Sixin Zhang, Anna Choromanska, Yann LeCun

We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to improved performance.

Image Classification, Stochastic Optimization

The Loss Surfaces of Multilayer Networks

1 code implementation • 30 Nov 2014 • Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann LeCun

We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum.

Differentially- and non-differentially-private random decision trees

no code implementations • 26 Oct 2014 • Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann LeCun

We consider supervised learning with random decision trees, where the tree construction is completely random.

Notes on using Determinantal Point Processes for Clustering with Applications to Text Clustering

no code implementations • 26 Oct 2014 • Apoorv Agarwal, Anna Choromanska, Krzysztof Choromanski

In this paper, we compare three initialization schemes for the KMEANS clustering algorithm: 1) random initialization (KMEANSRAND), 2) KMEANS++, and 3) KMEANSD++.

Clustering, Point Processes, +1

Logarithmic Time Online Multiclass prediction

no code implementations • NeurIPS 2015 • Anna Choromanska, John Langford

We develop top-down tree construction approaches for constructing logarithmic depth trees.

Stochastic Bound Majorization

no code implementations • 22 Sep 2013 • Anna Choromanska, Tony Jebara

Recently a majorization method for optimizing partition functions of log-linear models was proposed alongside a novel quadratic variational upper-bound.

Stochastic Optimization

Semistochastic Quadratic Bound Methods

no code implementations • 5 Sep 2013 • Aleksandr Y. Aravkin, Anna Choromanska, Tony Jebara, Dimitri Kanevsky

Batch methods based on the quadratic bound were recently proposed for this class of problems, and performed favorably in comparison to state-of-the-art techniques.

Majorization for CRFs and Latent Likelihoods

no code implementations • NeurIPS 2012 • Tony Jebara, Anna Choromanska

The partition function plays a key role in probabilistic modeling including conditional random fields, graphical models, and maximum likelihood estimation.
