Search Results for author: Harish G. Ramaswamy

Found 16 papers, 6 papers with code

On the Learning Dynamics of Attention Networks

1 code implementation • 25 Jul 2023 • Rahul Vashisht, Harish G. Ramaswamy

Attention models are typically learned by optimizing one of three standard loss functions, variously called soft attention, hard attention, and latent variable marginal likelihood (LVML) attention.

Hard Attention
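The abstract above contrasts three training objectives for attention models. The following is a minimal sketch of how the three losses could look for a single example in a toy setup with per-position features, attention weights, and a shared classifier; the variable names and this exact parameterization are assumptions for illustration, not the paper's code.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_losses(attn, feats, W, y):
    """attn: (T,) attention weights over T positions (non-negative, sums to 1).
    feats: (T, d) per-position features, W: (C, d) classifier, y: true class index.
    Returns the three standard objectives for a single example."""
    # Soft attention: mix the features first, then classify once.
    soft = -np.log(softmax(W @ (attn @ feats))[y] + 1e-12)
    # Hard attention: expected loss over positions; each position is classified
    # on its own (in practice estimated by sampling, e.g. with REINFORCE).
    per_pos_nll = np.array([-np.log(softmax(W @ f)[y] + 1e-12) for f in feats])
    hard = attn @ per_pos_nll
    # LVML: treat the attended position as a latent variable and maximize the
    # marginal likelihood of the label, i.e. mix the per-position probabilities.
    per_pos_prob = np.array([softmax(W @ f)[y] for f in feats])
    lvml = -np.log(attn @ per_pos_prob + 1e-12)
    return soft, hard, lvml
```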

On the Interpretability of Attention Networks

1 code implementation • 30 Dec 2022 • Lakshmi Narayan Pandey, Rahul Vashisht, Harish G. Ramaswamy

In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of input responsible for the output are often used as a way to peek into the `reasoning` of the network.

Image Captioning

Consistent Multiclass Algorithms for Complex Metrics and Constraints

1 code implementation • 18 Oct 2022 • Harikrishna Narasimhan, Harish G. Ramaswamy, Shiv Kumar Tavker, Drona Khurana, Praneeth Netrapalli, Shivani Agarwal

We present consistent algorithms for multiclass learning with complex performance metrics and constraints, where the objective and constraints are defined by arbitrary functions of the confusion matrix.

Fairness
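Since the objective here is an arbitrary function of the confusion matrix, a quick illustration of what that means, using macro-F1 as one assumed example of such a metric (a sketch, not the paper's algorithm):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """C[i, j] = fraction of examples with true class i predicted as class j."""
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    return C / len(y_true)

def macro_f1(C):
    """One example of a complex (non-decomposable) metric: a nonlinear
    function of the entries of the confusion matrix."""
    f1s = []
    for k in range(C.shape[0]):
        tp = C[k, k]
        prec = tp / max(C[:, k].sum(), 1e-12)
        rec = tp / max(C[k, :].sum(), 1e-12)
        f1s.append(2 * prec * rec / max(prec + rec, 1e-12))
    return float(np.mean(f1s))
```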

Predicting the success of Gradient Descent for a particular Dataset-Architecture-Initialization (DAI)

no code implementations • 25 Nov 2021 • Umangi Jain, Harish G. Ramaswamy

Despite the massive success of deep neural networks, training them successfully still largely relies on experimentally choosing an architecture, hyper-parameters, an initialization, and a training mechanism.

Using noise resilience for ranking generalization of deep neural networks

1 code implementation • 16 Dec 2020 • Depen Morwani, Rahul Vashisht, Harish G. Ramaswamy

Recent papers have shown that sufficiently overparameterized neural networks can perfectly fit even random labels.

Position

Inductive Bias of Gradient Descent for Weight Normalized Smooth Homogeneous Neural Nets

1 code implementation • 24 Oct 2020 • Depen Morwani, Harish G. Ramaswamy

We analyse both standard weight normalization (SWN) and exponential weight normalization (EWN), and show that the gradient flow path with EWN is equivalent to gradient flow on standard networks with an adaptive learning rate.

Inductive Bias
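The abstract refers to two ways of factoring a weight vector into a direction and a scale. A rough sketch of the two parameterizations as the names suggest them; the exact form, particularly of EWN, is an assumption and not taken from the paper's code:

```python
import numpy as np

def swn_weight(gamma, v):
    """Standard weight normalization (SWN): w = gamma * v / ||v||,
    with the scale parameterized directly."""
    return gamma * v / np.linalg.norm(v)

def ewn_weight(alpha, v):
    """Exponential weight normalization (EWN): the scale is parameterized
    through an exponential, w = exp(alpha) * v / ||v||."""
    return np.exp(alpha) * v / np.linalg.norm(v)
```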

Convex Calibrated Surrogates for the Multi-Label F-Measure

no code implementations • ICML 2020 • Mingyuan Zhang, Harish G. Ramaswamy, Shivani Agarwal

In particular, the F-measure explicitly balances recall (fraction of active labels predicted to be active) and precision (fraction of labels predicted to be active that are actually so), both of which are important in evaluating the overall performance of a multi-label classifier.

Multi-Label Classification
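To make the recall/precision trade-off in the abstract concrete, here is a small sketch of the instance-wise multi-label F-measure computed from binary label vectors; the paper's exact averaging convention is not assumed.

```python
import numpy as np

def multilabel_f1(y_true, y_pred):
    """y_true, y_pred: binary arrays of shape (n_labels,) for one example.
    Recall    = fraction of active labels that are predicted active.
    Precision = fraction of predicted-active labels that are truly active.
    F1 is their harmonic mean."""
    tp = np.sum(y_true * y_pred)
    n_active = y_true.sum()
    n_pred = y_pred.sum()
    if n_active == 0 and n_pred == 0:
        return 1.0  # convention for the all-empty case
    recall = tp / max(n_active, 1)
    precision = tp / max(n_pred, 1)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```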

Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization

3 code implementations • WACV 2020 • Saurabh Desai, Harish G. Ramaswamy

In response to recent criticism of gradient-based visualization techniques, we propose a new methodology to generate visual explanations for deep Convolutional Neural Network (CNN)-based models.
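As a rough illustration of the gradient-free idea: ablate each feature map, record the drop in the class score, and use the relative drops as weights for a class activation map. The sketch below uses assumed function names and is not the authors' implementation.

```python
import numpy as np

def ablation_cam(score_fn, activations, class_idx):
    """score_fn(acts) -> class scores for one image, given feature maps
    `acts` of shape (K, H, W) from the chosen convolutional layer.
    Returns an (H, W) localization map."""
    base = score_fn(activations)[class_idx]
    weights = np.zeros(activations.shape[0])
    for k in range(activations.shape[0]):
        ablated = activations.copy()
        ablated[k] = 0.0                      # remove feature map k
        drop = base - score_fn(ablated)[class_idx]
        weights[k] = drop / (base + 1e-12)    # relative drop in the class score
    # Weighted sum of feature maps, keeping only positive evidence.
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    return cam
```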

On Knowledge distillation from complex networks for response prediction

no code implementations • NAACL 2019 • Siddhartha Arora, Mitesh M. Khapra, Harish G. Ramaswamy

In order to overcome this, we use standard simple models which do not capture all pairwise interactions, but learn to emulate certain characteristics of a complex teacher network.

Knowledge Distillation • Question Answering
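The entry above is about distilling a complex teacher into simpler student models. For context, a minimal sketch of the standard temperature-scaled distillation objective (the Hinton-style loss; not necessarily the exact loss used in this paper):

```python
import numpy as np

def softmax(z, T=1.0):
    z = (z - z.max()) / T
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, y, T=2.0, alpha=0.5):
    """Weighted sum of (a) cross-entropy with the true label and
    (b) cross-entropy with the teacher's temperature-softened predictions."""
    p_student = softmax(student_logits)
    hard_loss = -np.log(p_student[y] + 1e-12)
    p_teacher_T = softmax(teacher_logits, T)
    p_student_T = softmax(student_logits, T)
    soft_loss = -np.sum(p_teacher_T * np.log(p_student_T + 1e-12))
    return alpha * hard_loss + (1 - alpha) * (T ** 2) * soft_loss
```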

On Controllable Sparse Alternatives to Softmax

no code implementations • NeurIPS 2018 • Anirban Laha, Saneem A. Chemmengath, Priyanka Agrawal, Mitesh M. Khapra, Karthik Sankaranarayanan, Harish G. Ramaswamy

Converting an n-dimensional vector into a probability distribution over n objects is a commonly used operation in many machine learning tasks such as multiclass classification, multilabel classification, and attention mechanisms.

Abstractive Text Summarization • Classification • +3
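For reference, a sketch of the baseline softmax alongside sparsemax (Martins and Astudillo), one of the sparse probability mappings in this line of work; the paper's own controllable variants are not reproduced here.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sparsemax(z):
    """Euclidean projection of z onto the probability simplex; unlike softmax,
    it can assign exactly zero probability to low-scoring entries."""
    z_sorted = np.sort(z)[::-1]
    cssv = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cssv     # positions kept in the support
    k_z = k[support][-1]
    tau = (cssv[support][-1] - 1) / k_z   # threshold so the output sums to 1
    return np.maximum(z - tau, 0.0)
```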

Mixture Proportion Estimation via Kernel Embedding of Distributions

no code implementations • 8 Mar 2016 • Harish G. Ramaswamy, Clayton Scott, Ambuj Tewari

Mixture proportion estimation (MPE) is the problem of estimating the weight of a component distribution in a mixture, given samples from the mixture and component.

Anomaly Detection • Weakly-supervised Learning
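The problem statement in the abstract can be written compactly. The notation below (mixture $F$, observed component $H$, weight $\kappa^*$) is an assumed convention, not copied from the paper.

```latex
% Mixture proportion estimation (MPE), schematically: we observe i.i.d. samples
% from the mixture F and from the component H; the other component G is
% unobserved, and the goal is to estimate the component's weight \kappa^*.
\[
  F \;=\; (1 - \kappa^*)\, G \;+\; \kappa^* H,
  \qquad \kappa^* \in [0, 1].
\]
```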

Consistent Algorithms for Multiclass Classification with a Reject Option

no code implementations • 15 May 2015 • Harish G. Ramaswamy, Ambuj Tewari, Shivani Agarwal

We consider the problem of $n$-class classification ($n\geq 2$), where the classifier can choose to abstain from making predictions at a given cost, say, a factor $\alpha$ of the cost of misclassification.

Classification • General Classification
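For intuition, the plug-in form of the abstain rule (the classical generalized Chow rule, with which consistent surrogates of this kind aim to agree): predict the most likely class, but abstain when even that class is not confident enough relative to the abstention cost. A sketch assuming estimated conditional class probabilities as input:

```python
import numpy as np

ABSTAIN = -1

def predict_with_reject(class_probs, alpha):
    """class_probs: estimated conditional probabilities p(y|x), shape (n,).
    alpha: cost of abstaining, as a fraction of the misclassification cost.
    Generalized Chow rule: abstain when max_y p(y|x) < 1 - alpha, since the
    expected cost of predicting is 1 - max_y p(y|x) versus alpha for abstaining."""
    y_hat = int(np.argmax(class_probs))
    if class_probs[y_hat] < 1.0 - alpha:
        return ABSTAIN
    return y_hat
```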

Consistent Classification Algorithms for Multi-class Non-Decomposable Performance Metrics

no code implementations • 1 Jan 2015 • Harish G. Ramaswamy, Harikrishna Narasimhan, Shivani Agarwal

In this paper, we provide a unified framework for analysing a multi-class non-decomposable performance metric, where the problem of finding the optimal classifier for the performance metric is viewed as an optimization problem over the space of all confusion matrices achievable under the given distribution.

Classification • General Classification • +2

Convex Calibration Dimension for Multiclass Loss Matrices

no code implementations • 12 Aug 2014 • Harish G. Ramaswamy, Shivani Agarwal

We extend the notion of classification calibration, which has been studied for binary and multiclass 0-1 classification problems (and for certain other specific learning problems), to the general multiclass setting, and derive necessary and sufficient conditions for a surrogate loss to be calibrated with respect to a loss matrix in this setting.

General Classification
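The calibration condition alluded to above can be stated compactly. The formulation below is the standard one for a multiclass loss matrix, with notation assumed rather than quoted from the paper: a surrogate is calibrated with respect to a loss matrix if near-optimal surrogate predictions, once decoded, are automatically optimal for that loss.

```latex
% A surrogate \psi : \mathcal{T} \to \mathbb{R}_+^n is calibrated w.r.t. the
% loss matrix L (rows indexed by true class y, columns by prediction t) if
% there is a decoding map  pred : \mathcal{T} \to [k]  such that for every
% class-probability vector p in the simplex \Delta_n,
\[
  \inf_{\substack{u \in \mathcal{T} \\ \mathrm{pred}(u) \,\notin\, \arg\min_{t} \, p^\top \ell_t}}
     p^\top \psi(u)
  \;>\;
  \inf_{u \in \mathcal{T}} p^\top \psi(u),
\]
% where \ell_t denotes the t-th column of L. Driving the surrogate risk to its
% minimum then forces the decoded predictions toward Bayes-optimal ones for L.
```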

Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses

no code implementations • NeurIPS 2013 • Harish G. Ramaswamy, Shivani Agarwal, Ambuj Tewari

The design of convex, calibrated surrogate losses, whose minimization entails consistency with respect to a desired target loss, is an important concept to have emerged in the theory of machine learning in recent years.

Classification Calibration Dimension for General Multiclass Losses

no code implementations • NeurIPS 2012 • Harish G. Ramaswamy, Shivani Agarwal

We extend the notion of classification calibration, which has been studied for binary and multiclass 0-1 classification problems (and for certain other specific learning problems), to the general multiclass setting, and derive necessary and sufficient conditions for a surrogate loss to be classification calibrated with respect to a loss matrix in this setting.

Classification • General Classification
