Search Results for author: Edward Moroshko

Found 11 papers, 3 papers with code

How catastrophic can catastrophic forgetting be in linear regression?

no code implementations • 19 May 2022 • Itay Evron, Edward Moroshko, Rachel Ward, Nati Srebro, Daniel Soudry

In specific settings, we highlight differences between forgetting and convergence to the offline solution as studied in those areas.

Continual Learning • regression
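
A minimal numpy sketch of this continual regression setting (an illustration under simplifying assumptions, with jointly realizable underdetermined tasks, not the paper's analysis): fitting each task to convergence with gradient descent projects the current iterate onto that task's solution set, and forgetting is the average loss over all previously seen tasks.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, T = 20, 5, 50          # dimension, samples per task, number of tasks

# Jointly realizable, underdetermined tasks sharing an exact solution w_star.
w_star = rng.normal(size=d)
tasks = []
for _ in range(T):
    X = rng.normal(size=(n, d))
    tasks.append((X, X @ w_star))

def fit_to_convergence(w, X, y):
    # GD on squared loss started from w converges to the projection of w
    # onto the task's solution set {v : Xv = y}.
    return w + np.linalg.pinv(X) @ (y - X @ w)

w = np.zeros(d)
for t, (X, y) in enumerate(tasks, 1):
    w = fit_to_convergence(w, X, y)
    # Forgetting: average squared error over all tasks seen so far.
    F = np.mean([np.mean((Xp @ w - yp) ** 2) for Xp, yp in tasks[:t]])
    if t % 10 == 0:
        print(f"after task {t:2d}: forgetting {F:.5f}")
```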

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent

no code implementations • 19 Feb 2021 • Shahar Azulay, Edward Moroshko, Mor Shpigel Nacson, Blake Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry

Recent work has highlighted the role of initialization scale in determining the structure of the solutions that gradient methods converge to.

Inductive Bias

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy

no code implementations • NeurIPS 2020 • Edward Moroshko, Suriya Gunasekar, Blake Woodworth, Jason D. Lee, Nathan Srebro, Daniel Soudry

We provide a detailed asymptotic study of gradient flow trajectories and their implicit optimization bias when minimizing the exponential loss over "diagonal linear networks".

General Classification
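
For concreteness, a toy sketch of the object of study (gradient descent standing in for gradient flow; the data, scale alpha, and step size are illustrative assumptions, not the paper's experiments): a depth-2 "diagonal linear network" predicts with beta = u*u - v*v and is trained on the exponential loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 5
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.1 * X[:, 1])     # separable by a sparse predictor

alpha, lr = 0.01, 0.01                   # initialization scale, step size
u = np.full(d, alpha)                    # depth-2 diagonal net: beta = u*u - v*v
v = np.full(d, alpha)
for _ in range(200_000):
    beta = u * u - v * v
    margins = y * (X @ beta)
    g = -(y * np.exp(-margins)) @ X      # grad of exponential loss w.r.t. beta
    u -= lr * g * 2 * u                  # chain rule: d(beta)/du =  2u
    v += lr * g * 2 * v                  # chain rule: d(beta)/dv = -2v
beta = u * u - v * v
print(np.round(beta / np.abs(beta).max(), 3))  # concentrates on few coordinates
```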

Kernel and Rich Regimes in Overparametrized Models

1 code implementation • 20 Feb 2020 • Blake Woodworth, Suriya Gunasekar, Jason D. Lee, Edward Moroshko, Pedro Savarese, Itay Golan, Daniel Soudry, Nathan Srebro

We provide a complete and detailed analysis for a family of simple depth-$D$ models that already exhibit an interesting and meaningful transition between the kernel and rich regimes, and we also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
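
The transition is easy to see for the depth-2 diagonal model with squared loss on an underdetermined problem: the initialization scale alpha moves the interpolator that gradient descent finds between a minimum-l2-like solution (kernel regime) and a sparse, minimum-l1-like solution (rich regime). A sketch under these illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 10, 40
X = rng.normal(size=(n, d))
y = X[:, :3] @ np.array([1.0, -2.0, 1.5])      # sparse ground truth

def train(alpha, lr=1e-3, steps=400_000):
    # beta = wp*wp - wm*wm trained with GD on squared loss from scale alpha.
    wp = np.full(d, alpha)
    wm = np.full(d, alpha)
    for _ in range(steps):
        g = X.T @ (X @ (wp * wp - wm * wm) - y) / n
        wp -= lr * g * 2 * wp
        wm += lr * g * 2 * wm
    return wp * wp - wm * wm

for alpha in (2.0, 0.01):
    beta = train(alpha)
    print(f"alpha={alpha:4.2f}  l1={np.abs(beta).sum():6.2f}  "
          f"l2={np.linalg.norm(beta):5.2f}")
# Small alpha gives a sparser, smaller-l1 interpolator ("rich" regime);
# large alpha moves toward the minimum-l2 interpolator ("kernel" regime).
```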

Finite Sample Analysis Of Dynamic Regression Parameter Learning

no code implementations • 13 Jun 2019 • Mark Kozdoba, Edward Moroshko, Shie Mannor, Koby Crammer

The proposed bounds depend on the shape of a certain spectrum related to the system operator, and thus provide the first known explicit geometric parameter of the data that can be used to bound estimation errors.

regression
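
As a rough picture of the dynamic regression setting (the regression parameter drifts and must be tracked online), here is a toy linear-Gaussian sketch with a standard Kalman-filter recursion; the paper's estimator, the spectrum appearing in its bounds, and all constants below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(6)
d, T = 3, 2000
q, r = 1e-4, 0.1                         # drift and observation noise variances
w = rng.normal(size=d)                   # true drifting parameter
w_hat, P = np.zeros(d), np.eye(d)        # online estimate and its covariance

sq_err = 0.0
for t in range(T):
    w += np.sqrt(q) * rng.normal(size=d)            # parameter random walk
    x = rng.normal(size=d)
    y = x @ w + np.sqrt(r) * rng.normal()           # y_t = <x_t, w_t> + noise
    P += q * np.eye(d)                              # Kalman predict step
    k = P @ x / (x @ P @ x + r)                     # Kalman gain
    w_hat = w_hat + k * (y - x @ w_hat)             # Kalman update step
    P -= np.outer(k, x) @ P
    sq_err += np.sum((w_hat - w) ** 2)
print(f"mean squared tracking error: {sq_err / T:.4f}")
```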

Kernel and Rich Regimes in Overparametrized Models

1 code implementation • 13 Jun 2019 • Blake Woodworth, Suriya Gunasekar, Pedro Savarese, Edward Moroshko, Itay Golan, Jason Lee, Daniel Soudry, Nathan Srebro

A recent line of work studies overparametrized neural networks in the "kernel regime," i.e., when the network behaves during training as a kernelized linear predictor, and thus training with gradient descent has the effect of finding the minimum RKHS norm solution.
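
A minimal sketch of that kernel-regime behavior, assuming a fixed linear feature map standing in for the network's tangent features: gradient descent from zero on squared loss converges to the minimum-l2-norm interpolator, which is the minimum-RKHS-norm solution for the corresponding kernel.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 10, 40                       # underdetermined: many interpolators exist
X = rng.normal(size=(n, d))         # stand-in for a fixed (tangent) feature map
y = rng.normal(size=n)

w = np.zeros(d)
for _ in range(100_000):
    w -= 1e-3 * X.T @ (X @ w - y) / n    # plain GD on squared loss from zero

# GD never leaves the row space of X, so it finds the min-norm interpolator.
print(np.allclose(w, np.linalg.pinv(X) @ y, atol=1e-3))   # True
```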

An Editorial Network for Enhanced Document Summarization

no code implementations • WS 2019 • Edward Moroshko, Guy Feigenblat, Haggai Roitman, David Konopnicki

We introduce the Editorial Network, a mixed extractive-abstractive summarization approach applied as a post-processing step over a given sequence of extracted sentences.

Abstractive Text Summarization • Document Summarization • +1
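
One schematic way to picture such a post-processing step (a sketch only; `edit_policy` and `rewriter` are hypothetical helpers, and this is not the paper's architecture): a learned policy walks over the extracted sentences and decides, per sentence, whether to keep it verbatim, rewrite it abstractively, or drop it.

```python
def editorial_pass(extracted_sentences, edit_policy, rewriter):
    """Apply an 'editorial' step over already-extracted sentences.

    edit_policy(sentence, summary_so_far) -> "keep" | "rewrite" | "drop"
    rewriter(sentence, summary_so_far)    -> abstractively rewritten sentence
    Both callables are hypothetical stand-ins for learned components.
    """
    summary = []
    for sent in extracted_sentences:
        action = edit_policy(sent, summary)
        if action == "keep":
            summary.append(sent)
        elif action == "rewrite":
            summary.append(rewriter(sent, summary))
        # "drop": the sentence is omitted from the final summary
    return " ".join(summary)
```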

Multi Instance Learning For Unbalanced Data

no code implementations • 17 Dec 2018 • Mark Kozdoba, Edward Moroshko, Lior Shani, Takuya Takagi, Takashi Katoh, Shie Mannor, Koby Crammer

In the context of Multi Instance Learning, we analyze the Single Instance (SI) learning objective.
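
A small sketch of the SI objective on a synthetic "witness instance" dataset (the data, model, and bag scoring rule are illustrative assumptions): every instance inherits its bag's label, an ordinary instance-level classifier is trained on these noisy labels, and a bag is scored by its maximum instance score. Note that most instances in positive bags are background yet get labeled positive, which is the unbalanced aspect at play.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)

def make_bag(positive, n=10, d=2):
    X = rng.normal(size=(n, d))
    if positive:
        X[0] += 3.0                      # a single "witness" instance
    return X

bags = [make_bag(i % 2 == 1) for i in range(100)]
labels = np.array([i % 2 for i in range(100)])

# SI objective: every instance is labeled with its bag's label.
X_inst = np.vstack(bags)
y_inst = np.repeat(labels, [len(b) for b in bags])
clf = LogisticRegression().fit(X_inst, y_inst)

# Score a bag by its maximum instance probability.
bag_scores = np.array([clf.predict_proba(b)[:, 1].max() for b in bags])
print("mean score, positive bags:", bag_scores[labels == 1].mean())
print("mean score, negative bags:", bag_scores[labels == 0].mean())
```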

Efficient Loss-Based Decoding on Graphs For Extreme Classification

1 code implementation • NeurIPS 2018 • Itay Evron, Edward Moroshko, Koby Crammer

We build on a recent extreme classification framework with logarithmic time and space, and on a general approach to error-correcting output coding (ECOC) with loss-based decoding, and introduce a flexible and efficient approach accompanied by theoretical bounds.

Classification • General Classification
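
Generic loss-based ECOC decoding in miniature (a Hadamard codebook and hinge loss are illustrative choices; the paper's graph-based, logarithmic-time construction is not reproduced here): each class has a +/-1 codeword, binary learners emit real-valued scores, and decoding returns the class whose codeword has the smallest total loss against those scores.

```python
import numpy as np

# Sylvester construction: 8 codewords of length 8, pairwise Hamming distance 4.
H = np.array([[1.0]])
for _ in range(3):
    H = np.block([[H, H], [H, -H]])
codebook = H                                 # row k = codeword of class k

def decode(scores):
    # Loss-based decoding: total hinge loss of each codeword vs. the scores.
    losses = np.maximum(0.0, 1.0 - codebook * scores).sum(axis=1)
    return int(np.argmin(losses))

rng = np.random.default_rng(3)
true_class = 5
scores = codebook[true_class] * rng.uniform(1.0, 2.0, size=8)
scores[0] = -scores[0]                       # corrupt one binary prediction
print(decode(scores))                        # still decodes to 5
```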

Selective Sampling with Drift

no code implementations • 17 Feb 2014 • Edward Moroshko, Koby Crammer

Simulations on synthetic and real-world datasets demonstrate the superiority of our algorithms for selective sampling in the drifting setting.

Active Learning
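
For intuition, a generic margin-based selective sampler run against a drifting target (an illustrative sketch, not the paper's algorithms): the learner queries a label with probability b/(b + |margin|), so it mostly spends its label budget where it is uncertain.

```python
import numpy as np

rng = np.random.default_rng(5)
d, T, b = 5, 5000, 1.0
w_true = rng.normal(size=d)                  # drifting target
w = np.zeros(d)                              # learner's weights
queries = mistakes = 0

for t in range(T):
    w_true += 0.01 * rng.normal(size=d)      # the target drifts each round
    x = rng.normal(size=d)
    x /= np.linalg.norm(x)
    y = np.sign(w_true @ x)
    margin = w @ x
    mistakes += np.sign(margin) != y
    if rng.uniform() < b / (b + abs(margin)):    # query rule: uncertain -> ask
        queries += 1
        if y * margin <= 1.0:
            w += y * x                       # perceptron-style update
print(f"queried {queries}/{T} labels, error rate {mistakes / T:.3f}")
```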
