no code implementations • 6 Jun 2023 • Itay Evron, Edward Moroshko, Gon Buzaglo, Maroun Khriesh, Badea Marjieh, Nathan Srebro, Daniel Soudry
We analyze continual learning on a sequence of separable linear classification tasks with binary labels.
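This setting can be made concrete with a small toy sketch (my own construction, not the paper's analysis; the helper names `make_task` and `train` and the perceptron learner are assumptions): a shared linear classifier is trained on one separable binary task, then on a second, and forgetting is read off as the accuracy drop on the first task.

```python
# Minimal sketch (not the paper's analysis): train a shared linear
# classifier sequentially on two separable binary tasks and measure
# how much task-1 accuracy degrades after fitting task 2.
import numpy as np

rng = np.random.default_rng(0)

def make_task(direction, n=200, d=10, margin=1.0):
    """Separable binary task: labels are the sign of <direction, x>."""
    X = rng.normal(size=(n, d))
    y = np.sign(X @ direction)
    X += margin * y[:, None] * direction / np.linalg.norm(direction)  # enforce a margin
    return X, y

def train(w, X, y, lr=0.1, epochs=50):
    """Plain perceptron-style updates, starting from the current weights w."""
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:
                w = w + lr * yi * xi
    return w

def accuracy(w, X, y):
    return np.mean(np.sign(X @ w) == y)

d = 10
u1, u2 = rng.normal(size=d), rng.normal(size=d)
X1, y1 = make_task(u1, d=d)
X2, y2 = make_task(u2, d=d)

w = np.zeros(d)
w = train(w, X1, y1)
acc_before = accuracy(w, X1, y1)
w = train(w, X2, y2)          # continue training on task 2 only
acc_after = accuracy(w, X1, y1)
print(f"task-1 accuracy: {acc_before:.2f} -> {acc_after:.2f} "
      f"(forgetting = {acc_before - acc_after:.2f})")
```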
no code implementations • 19 May 2022 • Itay Evron, Edward Moroshko, Rachel Ward, Nati Srebro, Daniel Soudry
In specific settings, we highlight differences between forgetting and convergence to the offline solution, as studied in related areas.
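The distinction can be illustrated with a hedged linear-regression toy (my own construction; I am assuming a regression-style setting in which fitting a task to convergence projects the weights onto that task's solution set, so cycling tasks traces an alternating-projection path). Forgetting on earlier tasks and distance to the offline joint solution need not shrink in the same way.

```python
# Toy sketch: forgetting (loss on task 1) versus distance to the
# offline (joint) least-squares solution in continual linear regression.
import numpy as np

rng = np.random.default_rng(1)
d = 20
w_star = rng.normal(size=d)                  # shared teacher => realizable tasks
tasks = []
for _ in range(2):
    X = rng.normal(size=(5, d))              # underdetermined task
    tasks.append((X, X @ w_star))

X_all = np.vstack([X for X, _ in tasks])
y_all = np.concatenate([y for _, y in tasks])
w_offline, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)  # min-norm joint solution

def project(w, X, y):
    """Minimum-norm move of w onto {v : Xv = y}, i.e., fit this task to convergence."""
    return w + np.linalg.pinv(X) @ (y - X @ w)

w = np.zeros(d)
for t in range(10):
    X, y = tasks[t % 2]
    w = project(w, X, y)
    X1, y1 = tasks[0]
    forgetting = np.mean((X1 @ w - y1) ** 2)
    dist = np.linalg.norm(w - w_offline)
    print(f"step {t}: task-1 loss {forgetting:.2e}, ||w - w_offline|| = {dist:.3f}")
```

Right after a task-2 projection the task-1 loss pops back up (forgetting) even while the distance to the offline solution keeps shrinking, which is the kind of gap the sentence above refers to.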
no code implementations • 19 Feb 2021 • Shahar Azulay, Edward Moroshko, Mor Shpigel Nacson, Blake Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry
Recent work has highlighted the role of initialization scale in determining the structure of the solutions that gradient methods converge to.
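A toy illustration of that role (my own construction, not the paper's setup: a linear predictor reparameterized as beta = u * v, with the data, learning rates, and the two scales 1e-3 and 10 all assumptions):

```python
# Gradient descent on beta = u * v from two initialization scales.
# Small scale tends to yield a sparse solution; large scale a dense,
# minimum-l2-like one.
import numpy as np

rng = np.random.default_rng(2)
n, d = 10, 40
X = rng.normal(size=(n, d))
beta_star = np.zeros(d); beta_star[:3] = [3.0, -2.0, 1.0]
y = X @ beta_star                      # sparse teacher, underdetermined system

def train(alpha, steps=200000):
    u = np.full(d, alpha)              # beta = u * v, initialized at (alpha, 0)
    v = np.zeros(d)
    lr = 1e-3 / max(1.0, alpha ** 2)   # shrink the step for large scales
    for _ in range(steps):
        g = X.T @ (X @ (u * v) - y) / n        # d(mse/2)/d(beta)
        u, v = u - lr * v * g, v - lr * u * g  # chain rule through beta = u * v
    return u * v

for alpha in [1e-3, 10.0]:
    beta = train(alpha)
    print(f"alpha={alpha:g}: ||beta||_1 = {np.linalg.norm(beta, 1):.2f}, "
          f"coords with |beta_i| > 0.01: {(np.abs(beta) > 1e-2).sum()}")
```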
no code implementations • NeurIPS 2020 • Edward Moroshko, Suriya Gunasekar, Blake Woodworth, Jason D. Lee, Nathan Srebro, Daniel Soudry
We provide a detailed asymptotic study of gradient flow trajectories and their implicit optimization bias when minimizing the exponential loss over "diagonal linear networks".
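A rough Euler discretization of the kind of dynamics studied there (a sketch under assumptions: my own toy data, step size, and diagonal parameterization beta = u*u - v*v, not the paper's derivation). With the exponential loss on separable data the loss never reaches zero, so what stabilizes is the direction of the predictor:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 20, 30
w_teacher = np.zeros(d); w_teacher[:2] = [1.0, -1.0]
X = rng.normal(size=(n, d))
y = np.sign(X @ w_teacher)                   # separable by construction

u = np.full(d, 0.1); v = np.full(d, 0.1)     # beta = u*u - v*v starts at zero
dt = 1e-2
prev_dir = np.zeros(d)
for step in range(1, 150001):
    beta = u * u - v * v
    margins = y * (X @ beta)
    g = -(X * (y * np.exp(-margins))[:, None]).sum(0) / n  # d(exp loss)/d(beta)
    u, v = u - dt * 2 * u * g, v + dt * 2 * v * g           # chain rule through beta
    if step % 50000 == 0:
        direction = beta / np.linalg.norm(beta)
        print(f"step {step}: min margin {margins.min():.2f}, "
              f"direction change {np.linalg.norm(direction - prev_dir):.3f}")
        prev_dir = direction
```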
1 code implementation • 20 Feb 2020 • Blake Woodworth, Suriya Gunasekar, Jason D. Lee, Edward Moroshko, Pedro Savarese, Itay Golan, Daniel Soudry, Nathan Srebro
We provide a complete and detailed analysis for a family of simple depth-$D$ models that already exhibit an interesting and meaningful transition between the kernel and rich regimes, and we also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
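The transition can be reproduced empirically in a few lines (a sketch under assumptions: the depth-$D$ diagonal model written as beta = u**D - v**D, with my own toy data, learning-rate schedule, and scale grid). Sweeping the initialization scale alpha at fixed depth moves the learned interpolant from the minimum-l2 ("kernel") solution toward a sparse ("rich") one; changing `D` in the call is a one-line edit.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 8, 30
X = rng.normal(size=(n, d))
beta_star = np.zeros(d); beta_star[:2] = [2.0, -1.0]
y = X @ beta_star

beta_l2 = X.T @ np.linalg.solve(X @ X.T, y)   # min-l2-norm interpolant (kernel limit)

def train(D, alpha, steps=200000):
    u = np.full(d, alpha); v = np.full(d, alpha)
    lr = 1e-3 / max(1.0, (D * alpha ** (D - 1)) ** 2)   # keep the step stable
    for _ in range(steps):
        g = X.T @ (X @ (u**D - v**D) - y) / n           # d(mse/2)/d(beta)
        u, v = u - lr * D * u**(D-1) * g, v + lr * D * v**(D-1) * g
    return u**D - v**D

for alpha in [1e-2, 1.0, 4.0]:
    beta = train(D=2, alpha=alpha)
    print(f"alpha={alpha:g}: ||beta||_1 = {np.linalg.norm(beta, 1):.2f}, "
          f"dist to min-l2 solution = {np.linalg.norm(beta - beta_l2):.2f}")
```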
no code implementations • 13 Jun 2019 • Mark Kozdoba, Edward Moroshko, Shie Mannor, Koby Crammer
The proposed bounds depend on the shape of a certain spectrum related to the system operator, and thus identify the first known explicit geometric parameter of the data that can be used to bound estimation errors.
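The bound itself is in the paper; the sketch below only constructs the object the sentence refers to, the eigenvalue spectrum of a (hypothetical, randomly generated) system operator `A` for a linear dynamical system x_{t+1} = A x_t + noise:

```python
import numpy as np

rng = np.random.default_rng(5)
d = 50
A = rng.normal(size=(d, d)) / np.sqrt(d)                 # random operator
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))          # rescale to spectral radius 0.9

moduli = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]     # the "shape" of the spectrum
print("spectral radius:", moduli[0].round(3))
print("leading eigenvalue moduli:", moduli[:10].round(3))
```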
1 code implementation • 13 Jun 2019 • Blake Woodworth, Suriya Gunasekar, Pedro Savarese, Edward Moroshko, Itay Golan, Jason Lee, Daniel Soudry, Nathan Srebro
A recent line of work studies overparametrized neural networks in the "kernel regime," i.e., when the network behaves during training as a kernelized linear predictor, and thus training with gradient descent has the effect of finding the minimum RKHS norm solution.
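That claim can be checked directly in the parameter-linear case (a generic sketch, not tied to any particular architecture: `Phi` plays the role of the tangent features of a network at initialization, and the data are assumptions):

```python
# For a model linear in its parameters, f(x) = <theta, phi(x)>, gradient
# descent from theta = 0 on the square loss converges to the minimum-norm
# interpolant, i.e., kernel regression with K(x, x') = <phi(x), phi(x')>.
import numpy as np

rng = np.random.default_rng(6)
n, p = 15, 200                       # overparametrized: p >> n
Phi = rng.normal(size=(n, p))        # rows: features phi(x_i) of the data
y = rng.normal(size=n)

theta = np.zeros(p)
for _ in range(5000):                # plain gradient descent on ||Phi theta - y||^2 / 2
    theta -= 1e-3 * Phi.T @ (Phi @ theta - y)

theta_kernel = Phi.T @ np.linalg.solve(Phi @ Phi.T, y)   # closed-form min-RKHS-norm solution
print("max |theta_gd - theta_kernel| =",
      np.abs(theta - theta_kernel).max())                # ~0: GD found the kernel solution
```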
no code implementations • WS 2019 • Edward Moroshko, Guy Feigenblat, Haggai Roitman, David Konopnicki
We propose the Editorial Network, a mixed extractive-abstractive summarization approach applied as a post-processing step over a given sequence of extracted sentences; a schematic sketch of the pipeline follows below.
Ranked #34 on Abstractive Text Summarization on CNN / Daily Mail
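The sketch below is schematic only: the `extract`, `edit_decision`, and `rewrite` callables are hypothetical placeholders for what the paper implements as learned neural components, and the per-sentence decision set is my simplification. The point is the pipeline shape: editing happens after extraction.

```python
from typing import Callable, List

def editorial_summarize(
    document: List[str],
    extract: Callable[[List[str]], List[str]],       # extractive front end
    edit_decision: Callable[[str, List[str]], str],  # "keep" | "rewrite" | "drop"
    rewrite: Callable[[str], str],                   # abstractive rewriter
) -> List[str]:
    summary: List[str] = []
    for sentence in extract(document):
        decision = edit_decision(sentence, summary)  # may depend on the summary so far
        if decision == "keep":
            summary.append(sentence)
        elif decision == "rewrite":
            summary.append(rewrite(sentence))
        # "drop": the sentence is discarded
    return summary

# trivial stand-ins, just to make the sketch runnable
doc = ["First key point.", "Filler sentence.", "Second key point, verbose."]
print(editorial_summarize(
    doc,
    extract=lambda sents: [s for s in sents if "point" in s],
    edit_decision=lambda s, _: "rewrite" if "verbose" in s else "keep",
    rewrite=lambda s: s.replace(", verbose", ""),
))
```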
no code implementations • 17 Dec 2018 • Mark Kozdoba, Edward Moroshko, Lior Shani, Takuya Takagi, Takashi Katoh, Shie Mannor, Koby Crammer
In the context of Multiple Instance Learning (MIL), we analyze the Single Instance (SI) learning objective.
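A minimal sketch of the SI objective on toy bags (the data, the "any positive instance" bag rule, and the logistic learner are my assumptions): every instance simply inherits its bag's label, and a standard classifier is trained on the resulting instance-level dataset.

```python
import numpy as np

rng = np.random.default_rng(7)
d, bags = 5, 60
w_true = rng.normal(size=d)

X_rows, y_rows = [], []
for _ in range(bags):
    inst = rng.normal(size=(rng.integers(2, 6), d))           # one bag of instances
    bag_label = 1.0 if np.any(inst @ w_true > 1.0) else -1.0  # MIL rule: any positive instance
    X_rows.append(inst)
    y_rows.append(np.full(len(inst), bag_label))              # SI: copy bag label to instances

X = np.vstack(X_rows); y = np.concatenate(y_rows)

w = np.zeros(d)                                    # plain logistic regression on SI labels
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-y * (X @ w)))
    w += 0.1 * X.T @ (y * (1.0 - p)) / len(y)      # gradient ascent on the log-likelihood
print("corr(w, w_true) =", np.corrcoef(w, w_true)[0, 1].round(2))
```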
1 code implementation • NeurIPS 2018 • Itay Evron, Edward Moroshko, Koby Crammer
Building on a recent extreme classification framework with logarithmic time and space, and on a general approach to error-correcting output coding (ECOC) with loss-based decoding, we introduce a flexible and efficient method accompanied by theoretical bounds.
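A generic ECOC-with-loss-based-decoding sketch (not the paper's logarithmic-time construction: the random code matrix, toy Gaussian-blob data, and scikit-learn logistic learners are all my assumptions): encode each class as a row of a sign matrix, train one binary classifier per column, and decode a test point to the class whose codeword incurs the smallest total loss.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
K, bits, d, n = 16, 10, 20, 2000
M = rng.choice([-1, 1], size=(K, bits))           # code matrix: one codeword per class

centers = rng.normal(size=(K, d)) * 3
labels = rng.integers(0, K, size=n)
X = centers[labels] + rng.normal(size=(n, d))     # toy Gaussian-blob data

learners = []                                     # one binary learner per code bit
for b in range(bits):
    learners.append(LogisticRegression(max_iter=1000).fit(X, M[labels, b]))

def decode(x):
    """Loss-based decoding: pick the class minimizing the summed logistic loss."""
    scores = np.array([clf.decision_function(x[None])[0] for clf in learners])
    losses = np.log1p(np.exp(-M * scores))        # logistic loss of each codeword bit
    return np.argmin(losses.sum(axis=1))

test = centers[3] + rng.normal(size=d)
print("decoded class:", decode(test), "(true: 3)")
```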
no code implementations • 17 Feb 2014 • Edward Moroshko, Koby Crammer
Simulations on synthetic and real-world datasets demonstrate the superiority of our algorithms for selective sampling in the drifting setting.
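For intuition only, here is a generic margin-based selective sampler under drift (plainly not the paper's algorithm: the query probability b/(b + |margin|), the forgetting factor `decay`, and the drift model are my assumptions). Labels are requested mostly when the current predictor is uncertain, and old updates fade so the learner can track a moving target.

```python
import numpy as np

rng = np.random.default_rng(9)
d, T = 10, 5000
b, lr, decay = 1.0, 0.5, 0.999

w_true = rng.normal(size=d)
w = np.zeros(d)
queries = mistakes = 0
for t in range(T):
    w_true += 0.01 * rng.normal(size=d)          # the target itself drifts
    x = rng.normal(size=d)
    y = np.sign(w_true @ x)
    w *= decay                                   # forgetting factor: stale updates fade
    margin = w @ x
    mistakes += np.sign(margin) != y
    if rng.random() < b / (b + abs(margin)):     # query mostly when uncertain
        queries += 1
        if y * margin <= 1:                      # hinge-style update on queried labels
            w += lr * y * x
print(f"queried {queries}/{T} labels, mistake rate {mistakes / T:.3f}")
```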