Search Results for author: Hanie Sedghi

Found 24 papers, 6 papers with code

Leveraging Unlabeled Data to Predict Out-of-Distribution Performance

no code implementations ICLR 2022 Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton, Behnam Neyshabur, Hanie Sedghi

Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions that may cause performance drops.

The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks

2 code implementations ICLR 2022 Rahim Entezari, Hanie Sedghi, Olga Saukh, Behnam Neyshabur

In this paper, we conjecture that if the permutation invariance of neural networks is taken into account, SGD solutions will likely have no barrier in the linear interpolation between them.
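The "barrier" in question can be estimated directly. Below is a minimal sketch (not the paper's released code), assuming flattened weight vectors `theta_a`, `theta_b` and a scalar-valued `loss_fn`, all of which are illustrative placeholders:

```python
# Minimal sketch (illustrative, not the paper's implementation): estimate the
# loss barrier along the linear path between two flattened weight vectors.
import numpy as np

def loss_barrier(loss_fn, theta_a, theta_b, num_points=25):
    """Max of L((1-a)*theta_a + a*theta_b) minus the linear interpolation of
    the endpoint losses over a in [0, 1]; a value near zero means 'no barrier'."""
    alphas = np.linspace(0.0, 1.0, num_points)
    l_a, l_b = loss_fn(theta_a), loss_fn(theta_b)
    excess = [
        loss_fn((1.0 - a) * theta_a + a * theta_b) - ((1.0 - a) * l_a + a * l_b)
        for a in alphas
    ]
    return max(excess)
```

Taking permutation invariance into account would mean permuting the hidden units of one endpoint to best align it with the other before measuring this quantity.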

Exploring the Limits of Large Scale Pre-training

no code implementations ICLR 2022 Samira Abnar, Mostafa Dehghani, Behnam Neyshabur, Hanie Sedghi

Recent developments in large-scale machine learning suggest that by scaling up data, model size and training time properly, one might observe that improvements in pre-training would transfer favorably to most downstream tasks.

Gradual Domain Adaptation in the Wild: When Intermediate Distributions are Absent

no code implementations 29 Sep 2021 Samira Abnar, Rianne van den Berg, Golnaz Ghiasi, Mostafa Dehghani, Nal Kalchbrenner, Hanie Sedghi

It is shown that under the following two assumptions: (a) access to samples from intermediate distributions, and (b) samples being annotated with the amount of change from the source distribution, self-training can be successfully applied on gradually shifted samples to adapt the model toward the target distribution.

Domain Adaptation

Gradual Domain Adaptation in the Wild: When Intermediate Distributions are Absent

1 code implementation 10 Jun 2021 Samira Abnar, Rianne van den Berg, Golnaz Ghiasi, Mostafa Dehghani, Nal Kalchbrenner, Hanie Sedghi

It has been shown that under the following two assumptions: (a) access to samples from intermediate distributions, and (b) samples being annotated with the amount of change from the source distribution, self-training can be successfully applied on gradually shifted samples to adapt the model toward the target distribution.

Domain Adaptation
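For context, here is a minimal sketch of the gradual self-training recipe that assumptions (a) and (b) above make possible when intermediate data are available; the paper itself targets the harder case where such intermediates are absent, and `clf` is a hypothetical classifier with a scikit-learn-style fit/predict interface:

```python
# Minimal sketch (illustrative, not the paper's code): gradual self-training
# when unlabeled batches ARE available and can be ordered by their amount of
# shift from the source distribution (assumptions (a) and (b) above).

def gradual_self_train(clf, x_source, y_source, shifted_batches):
    """clf: any classifier with scikit-learn-style fit/predict.
    shifted_batches: unlabeled arrays ordered from least to most shifted."""
    clf.fit(x_source, y_source)               # start from the source model
    for x_batch in shifted_batches:
        pseudo_labels = clf.predict(x_batch)  # self-label the next stage
        clf.fit(x_batch, pseudo_labels)       # adapt to the pseudo-labeled stage
    return clf
```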

What is being transferred in transfer learning?

1 code implementation NeurIPS 2020 Behnam Neyshabur, Hanie Sedghi, Chiyuan Zhang

One desired capability for machines is the ability to transfer their knowledge of one domain to another where data is (usually) scarce.

Transfer Learning

On the effect of the activation function on the distribution of hidden nodes in a deep network

no code implementations 7 Jan 2019 Philip M. Long, Hanie Sedghi

We analyze the joint probability distribution on the lengths of the vectors of hidden variables in different layers of a fully connected deep network, when the weights and biases are chosen randomly according to Gaussian distributions, and the input is in $\{ -1, 1\}^N$.
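The quantity under study is easy to sample empirically. The sketch below draws random fully connected networks and records the hidden-vector lengths layer by layer; the 1/sqrt(fan-in) weight scale, unit-variance biases, tanh activation, and function name are illustrative assumptions, not the paper's setup verbatim:

```python
# Minimal sketch (illustrative assumptions noted above): sample the lengths of
# hidden-layer vectors in a random fully connected network with i.i.d. Gaussian
# weights and biases, for an input drawn uniformly from {-1, 1}^N.
import numpy as np

def hidden_lengths(depth=4, width=256, n_inputs=256, n_trials=1000,
                   activation=np.tanh, seed=0):
    rng = np.random.default_rng(seed)
    lengths = np.zeros((n_trials, depth))
    for t in range(n_trials):
        h = rng.choice([-1.0, 1.0], size=n_inputs)          # input in {-1, 1}^N
        for layer in range(depth):
            fan_in = h.shape[0]
            W = rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(width, fan_in))
            b = rng.normal(0.0, 1.0, size=width)
            h = activation(W @ h + b)
            lengths[t, layer] = np.linalg.norm(h)            # length at this layer
    return lengths  # one row per trial: ||h_1||, ..., ||h_depth||
```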

The Singular Values of Convolutional Layers

1 code implementation ICLR 2019 Hanie Sedghi, Vineet Gupta, Philip M. Long

We characterize the singular values of the linear transformation associated with a standard 2D multi-channel convolutional layer, enabling their efficient computation.
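For circular (periodic) padding and stride 1, this characterization reduces to a short FFT-based computation. The sketch below illustrates that idea rather than reproducing the released implementation; the kernel layout (k, k, c_in, c_out) is an assumption:

```python
# Minimal sketch (assuming circular padding and stride 1): the singular values
# of the conv layer's linear map are the singular values of the per-frequency
# channel matrices given by the 2D DFT of the kernel.
import numpy as np

def conv_singular_values(kernel, input_hw):
    """kernel: array of shape (k, k, c_in, c_out); input_hw: (n, n) spatial size."""
    # 2D FFT of the kernel, zero-padded to the spatial input size.
    transform = np.fft.fft2(kernel, s=input_hw, axes=(0, 1))
    # SVD of the c_in x c_out matrix at every spatial frequency.
    return np.linalg.svd(transform, compute_uv=False)  # shape (n, n, min(c_in, c_out))
```

The largest value in the returned array is the operator norm of the layer's linear map.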

Knowledge Completion for Generics using Guided Tensor Factorization

no code implementations TACL 2018 Hanie Sedghi, Ashish Sabharwal

Given a knowledge base or KB containing (noisy) facts about common nouns or generics, such as "all trees produce oxygen" or "some animals live in forests", we consider the problem of inferring additional such facts at a precision similar to that of the starting KB.

Active Learning, Question Answering

Training Input-Output Recurrent Neural Networks through Spectral Methods

no code implementations 3 Mar 2016 Hanie Sedghi, Anima Anandkumar

We consider the problem of training input-output recurrent neural networks (RNN) for sequence labeling tasks.

POS

Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

no code implementations 28 Jun 2015 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks.

Tensor Decomposition

Score Function Features for Discriminative Learning

no code implementations 19 Dec 2014 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples.

Provable Tensor Methods for Learning Mixtures of Generalized Linear Models

no code implementations 9 Dec 2014 Hanie Sedghi, Majid Janzamin, Anima Anandkumar

In contrast, we present a tensor decomposition method which is guaranteed to correctly recover the parameters.

General Classification, Tensor Decomposition

Score Function Features for Discriminative Learning: Matrix and Tensor Framework

no code implementations 9 Dec 2014 Majid Janzamin, Hanie Sedghi, Anima Anandkumar

In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples.

Provable Methods for Training Neural Networks with Sparse Connectivity

no code implementations 8 Dec 2014 Hanie Sedghi, Anima Anandkumar

We provide novel guaranteed approaches for training feedforward neural networks with sparse connectivity.

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition

no code implementations NeurIPS 2014 Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

We first analyze the simple setting, where the optimization problem consists of a loss function and a single regularizer (e.g. sparse optimization), and then extend to the multi-block setting with multiple regularizers and multiple variables (e.g. matrix decomposition into sparse and low rank components).
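As a point of reference for the single-regularizer setting, the classical batch ADMM updates for the lasso look as follows; this generic sketch is not the paper's multi-step stochastic algorithm, and all names are illustrative:

```python
# Minimal sketch (generic batch ADMM for the lasso, not the paper's multi-step
# stochastic variant): minimize 0.5*||A x - b||^2 + lam*||z||_1 subject to x = z.
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_lasso(A, b, lam, rho=1.0, num_iters=200):
    n, d = A.shape
    x = np.zeros(d)
    z = np.zeros(d)
    u = np.zeros(d)                          # scaled dual variable
    AtA_rhoI = A.T @ A + rho * np.eye(d)
    Atb = A.T @ b
    for _ in range(num_iters):
        x = np.linalg.solve(AtA_rhoI, Atb + rho * (z - u))  # quadratic subproblem
        z = soft_threshold(x + u, lam / rho)                # prox of the l1 regularizer
        u = u + x - z                                       # dual update
    return z
```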

Statistical Structure Learning, Towards a Robust Smart Grid

no code implementations 7 Mar 2014 Hanie Sedghi, Edmond Jonckheere

We propose a decentralized false data injection detection scheme based on Markov graph of the bus phase angles.

Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Noisy Matrix Decomposition

2 code implementations NeurIPS 2014 Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere

For sparse optimization, we establish that the modified ADMM method has an optimal convergence rate of $\mathcal{O}(s\log d/T)$, where $s$ is the sparsity level, $d$ is the data dimension and $T$ is the number of steps.
