Search Results for author: Mikhail Khodak

Found 28 papers, 16 papers with code

A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

1 code implementation ACL 2018 Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora

Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features.

Document Classification Domain Adaptation +2

A Compressed Sensing View of Unsupervised Text Embeddings, Bag-of-n-Grams, and LSTMs

2 code implementations ICLR 2018 Sanjeev Arora, Mikhail Khodak, Nikunj Saunshi, Kiran Vodrahalli

We also show a surprising new property of embeddings such as GloVe and word2vec: they form a good sensing matrix for text that is more efficient than random matrices, the standard sparse recovery tool, which may explain why they lead to better representations in practice.
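The sensing-matrix claim can be illustrated with a minimal sparse-recovery sketch: if a document is represented by the sum of its word embeddings, the underlying bag-of-words vector can be recovered by treating the embedding matrix as a compressed-sensing matrix. This is only a toy reconstruction under assumptions; random Gaussian vectors stand in for GloVe/word2vec, and sklearn's Lasso stands in for the recovery solver.

```python
# Hedged sketch: recover a sparse bag-of-words vector from a summed document
# embedding via LASSO, using the embedding matrix as the sensing matrix.
# Random embeddings are placeholders for GloVe/word2vec.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
vocab_size, dim = 5000, 300
A = rng.normal(size=(dim, vocab_size)) / np.sqrt(dim)  # columns = word embeddings

# A short "document": a sparse count vector over the vocabulary.
x_true = np.zeros(vocab_size)
words = rng.choice(vocab_size, size=20, replace=False)
x_true[words] = rng.integers(1, 4, size=20)

y = A @ x_true  # document embedding = count-weighted sum of word embeddings

# Sparse recovery: argmin_x 0.5*||y - A x||^2 + alpha*||x||_1
lasso = Lasso(alpha=1e-3, positive=True, max_iter=10000)
lasso.fit(A, y)
recovered = set(np.flatnonzero(lasso.coef_ > 0.5))
print("recovered", len(recovered & set(words)), "of", len(words), "words")
```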

Cross-Modal Fine-Tuning: Align then Refine

1 code implementation 11 Feb 2023 Junhong Shen, Liam Li, Lucio M. Dery, Corey Staten, Mikhail Khodak, Graham Neubig, Ameet Talwalkar

Fine-tuning large-scale pretrained models has led to tremendous progress in well-studied modalities such as vision and NLP.

AutoML

A Large Self-Annotated Corpus for Sarcasm

6 code implementations LREC 2018 Mikhail Khodak, Nikunj Saunshi, Kiran Vodrahalli

We introduce the Self-Annotated Reddit Corpus (SARC), a large corpus for sarcasm research and for training and evaluating systems for sarcasm detection.

Sarcasm Detection

Initialization and Regularization of Factorized Neural Layers

1 code implementation ICLR 2021 Mikhail Khodak, Neil Tenenholtz, Lester Mackey, Nicolò Fusi

In model compression, we show that these initialization and regularization schemes enable low-rank methods to significantly outperform both unstructured sparsity and tensor methods on the task of training low-memory residual networks; analogs of the schemes also improve the performance of tensor decomposition techniques.

Knowledge Distillation Model Compression +2
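As a rough illustration of the factorized-layer idea, here is a low-rank linear layer whose weight is the product of two factors, initialized spectrally (truncated SVD of a conventional initialization) and regularized by a "Frobenius decay" penalty on the product rather than weight decay on the factors. This is an illustrative sketch, not the paper's released code; names and hyperparameters are placeholders.

```python
# Hedged sketch of a factorized (low-rank) linear layer in PyTorch.
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    def __init__(self, in_features, out_features, rank):
        super().__init__()
        # Spectral initialization: truncated SVD of a conventionally initialized matrix.
        W = torch.empty(out_features, in_features)
        nn.init.kaiming_uniform_(W, a=5 ** 0.5)
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        sqrt_S = S[:rank].sqrt()
        self.U = nn.Parameter(U[:, :rank] * sqrt_S)   # (out, rank)
        self.V = nn.Parameter(Vh[:rank].T * sqrt_S)   # (in, rank)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        return (x @ self.V) @ self.U.T + self.bias

    def frobenius_penalty(self):
        # Penalize ||U V^T||_F^2 instead of ||U||_F^2 + ||V||_F^2.
        return (self.U @ self.V.T).pow(2).sum()

layer = FactorizedLinear(512, 512, rank=64)
x = torch.randn(8, 512)
loss = layer(x).pow(2).mean() + 1e-4 * layer.frobenius_penalty()
loss.backward()
```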

Geometry-Aware Gradient Algorithms for Neural Architecture Search

1 code implementation ICLR 2021 Liam Li, Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

Recent state-of-the-art methods for neural architecture search (NAS) exploit gradient-based optimization by relaxing the problem into continuous optimization over architectures and shared-weights, a noisy process that remains poorly understood.

Neural Architecture Search
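A minimal sketch of the geometry-aware idea: keep the architecture weights on the probability simplex and update them with an exponentiated-gradient (mirror-descent) step rather than SGD on a softmax parameterization. The candidate operations and loss below are illustrative placeholders, not the paper's search space.

```python
# Hedged sketch: exponentiated-gradient updates on architecture mixture weights.
import torch

ops = [torch.nn.Identity(), torch.nn.Tanh(), torch.nn.ReLU()]  # toy candidate ops
theta = torch.full((len(ops),), 1.0 / len(ops), requires_grad=True)  # simplex point
eta = 0.1  # architecture learning rate

x = torch.randn(16, 32)
target = torch.randn(16, 32)

for step in range(100):
    mixed = sum(w * op(x) for w, op in zip(theta, ops))  # weighted mixture of ops
    loss = (mixed - target).pow(2).mean()
    grad, = torch.autograd.grad(loss, theta)

    with torch.no_grad():
        # Multiplicative (exponentiated-gradient) step, then renormalize to the simplex.
        theta *= torch.exp(-eta * grad)
        theta /= theta.sum()

print("architecture weights:", theta.detach().numpy().round(3))
```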

Efficient Architecture Search for Diverse Tasks

1 code implementation 15 Apr 2022 Junhong Shen, Mikhail Khodak, Ameet Talwalkar

While neural architecture search (NAS) has enabled automated machine learning (AutoML) for well-researched areas, its application to tasks beyond computer vision is still under-explored.

Neural Architecture Search Protein Folding

Adaptive Gradient-Based Meta-Learning Methods

1 code implementation NeurIPS 2019 Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

We build a theoretical framework for designing and understanding practical meta-learning methods that integrates sophisticated formalizations of task-similarity with the extensive literature on online convex optimization and sequential prediction algorithms.

Federated Learning Few-Shot Learning
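The online-learning view can be sketched as follows: each task is one round, the learner runs a few SGD steps from the current meta-initialization, and then moves the initialization toward the within-task solution (a Reptile-style step). The quadratic tasks below are toy placeholders, and this loop is a generic gradient-based meta-learner, not the paper's exact algorithm.

```python
# Hedged sketch of gradient-based meta-learning as online learning over tasks.
import numpy as np

rng = np.random.default_rng(0)
dim, inner_steps, inner_lr, meta_lr = 10, 5, 0.1, 0.2
phi = np.zeros(dim)                              # meta-initialization
task_centers = rng.normal(size=(50, dim)) + 3.0  # similar tasks: nearby optima

for c in task_centers:
    # Within-task loss: f_t(w) = 0.5 * ||w - c||^2, so grad = w - c.
    w = phi.copy()
    for _ in range(inner_steps):
        w -= inner_lr * (w - c)
    # Meta-update: move the initialization toward the within-task solution.
    phi += meta_lr * (w - phi)

print("distance from mean task optimum:", np.linalg.norm(phi - task_centers.mean(0)))
```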

Automated WordNet Construction Using Word Embeddings

1 code implementation WS 2017 Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, Sanjeev Arora

To evaluate our method we construct two 600-word test sets for word-to-synset matching in French and Russian using native speakers and evaluate the performance of our method along with several other recent approaches.

Information Retrieval Machine Translation +3

Provable Guarantees for Gradient-Based Meta-Learning

1 code implementation 27 Feb 2019 Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm bridging the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods.

Generalization Bounds Meta-Learning

On Noisy Evaluation in Federated Hyperparameter Tuning

1 code implementation 17 Dec 2022 Kevin Kuo, Pratiksha Thaker, Mikhail Khodak, John Nguyen, Daniel Jiang, Ameet Talwalkar, Virginia Smith

In this work, we perform the first systematic study on the effect of noisy evaluation in federated hyperparameter tuning.

Federated Learning

AANG: Automating Auxiliary Learning

2 code implementations 27 May 2022 Lucio M. Dery, Paul Michel, Mikhail Khodak, Graham Neubig, Ameet Talwalkar

Auxiliary objectives, supplementary learning signals introduced to aid learning on data-starved or highly complex end-tasks, are commonplace in machine learning.

Auxiliary Learning

Extending and Improving Wordnet via Unsupervised Word Embeddings

no code implementations 29 Apr 2017 Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, Sanjeev Arora

Our methods require very few linguistic resources, making them applicable to Wordnet construction in low-resource languages, and may further be applied to sense clustering and other Wordnet improvements.

Clustering Word Embeddings

A Theoretical Analysis of Contrastive Unsupervised Representation Learning

no code implementations 25 Feb 2019 Sanjeev Arora, Hrishikesh Khandeparkar, Mikhail Khodak, Orestis Plevrakis, Nikunj Saunshi

This framework allows us to show provable guarantees on the performance of the learned representations on the average classification task composed of a subset of the same set of latent classes.

Contrastive Learning General Classification +1
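For concreteness, a minimal version of the contrastive objective analyzed in this framework: for a triple (x, x_plus, x_minus) with x_plus drawn from the same latent class as x, minimize a logistic loss on the similarity gap f(x)^T(f(x_plus) - f(x_minus)). The encoder below is a placeholder, not an architecture from the paper.

```python
# Hedged sketch of the contrastive (logistic) loss over (x, x+, x-) triples.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))

def contrastive_loss(x, x_plus, x_minus):
    f, fp, fm = encoder(x), encoder(x_plus), encoder(x_minus)
    logits = (f * (fp - fm)).sum(dim=1)            # similarity gap per example
    return nn.functional.softplus(-logits).mean()  # logistic loss log(1 + e^{-z})

x, x_plus, x_minus = (torch.randn(32, 64) for _ in range(3))
loss = contrastive_loss(x, x_plus, x_minus)
loss.backward()
```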

Differentially Private Meta-Learning

no code implementations ICLR 2020 Jeffrey Li, Mikhail Khodak, Sebastian Caldas, Ameet Talwalkar

Parameter-transfer is a well-known and versatile approach for meta-learning, with applications including few-shot learning, federated learning, and reinforcement learning.

Federated Learning Few-Shot Learning +4

A Sample Complexity Separation between Non-Convex and Convex Meta-Learning

no code implementations ICML 2020 Nikunj Saunshi, Yi Zhang, Mikhail Khodak, Sanjeev Arora

In contrast, for the non-convex formulation of a two layer linear network on the same instance, we show that both Reptile and multi-task representation learning can have new task sample complexity of $\mathcal{O}(1)$, demonstrating a separation from convex meta-learning.

Meta-Learning Representation Learning

Searching for Convolutions and a More Ambitious NAS

no code implementations 1 Jan 2021 Nicholas Carl Roberts, Mikhail Khodak, Tri Dao, Liam Li, Nina Balcan, Christopher Re, Ameet Talwalkar

An important goal of neural architecture search (NAS) is to automate away the design of neural networks on new tasks in under-explored domains, thus helping to democratize machine learning.

Neural Architecture Search

Learning-to-learn non-convex piecewise-Lipschitz functions

no code implementations NeurIPS 2021 Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar

We analyze the meta-learning of the initialization and step-size of learning algorithms for piecewise-Lipschitz functions, a non-convex setting with applications to both machine learning and algorithms.

Meta-Learning

On Weight-Sharing and Bilevel Optimization in Architecture Search

no code implementations 25 Sep 2019 Mikhail Khodak, Liam Li, Maria-Florina Balcan, Ameet Talwalkar

Weight-sharing—the simultaneous optimization of multiple neural networks using the same parameters—has emerged as a key component of state-of-the-art neural architecture search.

Bilevel Optimization feature selection +1
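To make the weight-sharing definition concrete, here is a toy scheme in which several candidate architectures (kernel sizes 1, 3, and 5) are all evaluated using slices of one shared 5x5 convolution weight, so training any candidate updates the same parameters. This is only an illustrative weight-sharing construction, not the specific search space studied in the paper.

```python
# Hedged sketch of weight-sharing: candidates reuse slices of one shared weight.
import torch
import torch.nn as nn
import torch.nn.functional as F

shared_weight = nn.Parameter(torch.randn(16, 3, 5, 5) * 0.1)  # shared supernet weight

def run_candidate(x, kernel_size):
    # Center-crop the shared 5x5 kernel down to the candidate's kernel size.
    k = kernel_size
    start = (5 - k) // 2
    w = shared_weight[:, :, start:start + k, start:start + k]
    return F.conv2d(x, w, padding=k // 2)

x = torch.randn(4, 3, 32, 32)
for k in (1, 3, 5):
    out = run_candidate(x, k)
    print(f"kernel {k}: output shape {tuple(out.shape)}")  # all candidates share parameters
```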

Learning Predictions for Algorithms with Predictions

no code implementations 18 Feb 2022 Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar, Sergei Vassilvitskii

A burgeoning paradigm in algorithm design is the field of algorithms with predictions, in which algorithms can take advantage of a possibly imperfect prediction of some aspect of the problem.

Scheduling

Meta-Learning Adversarial Bandits

no code implementations 27 May 2022 Maria-Florina Balcan, Keegan Harris, Mikhail Khodak, Zhiwei Steven Wu

We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure.

Meta-Learning Multi-Armed Bandits

Provably tuning the ElasticNet across instances

no code implementations 20 Jul 2022 Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar

We consider the problem of tuning the regularization parameters of Ridge regression, LASSO, and the ElasticNet across multiple problem instances, a setting that encompasses both cross-validation and multi-task hyperparameter optimization.

Hyperparameter Optimization regression
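The multi-instance tuning setup can be sketched simply: choose a single (alpha, l1_ratio) pair for sklearn's ElasticNet by minimizing average validation error across a collection of related regression instances. The grid and synthetic data below are illustrative; the paper's contribution is doing this kind of cross-instance tuning with provable guarantees, which this naive grid search does not provide.

```python
# Hedged sketch: pick one ElasticNet configuration across multiple instances.
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

instances = [make_regression(n_samples=100, n_features=20, noise=10.0,
                             random_state=i) for i in range(5)]
grid = [(a, r) for a in (0.01, 0.1, 1.0) for r in (0.1, 0.5, 0.9)]

def avg_val_error(alpha, l1_ratio):
    errs = []
    for X, y in instances:
        X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=5000).fit(X_tr, y_tr)
        errs.append(np.mean((model.predict(X_va) - y_va) ** 2))
    return np.mean(errs)

best = min(grid, key=lambda cfg: avg_val_error(*cfg))
print("best (alpha, l1_ratio) across instances:", best)
```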

Learning-Augmented Private Algorithms for Multiple Quantile Release

1 code implementation 20 Oct 2022 Mikhail Khodak, Kareem Amin, Travis Dick, Sergei Vassilvitskii

When applying differential privacy to sensitive data, we can often improve performance using external information such as other sensitive data, public data, or human priors.

Privacy Preserving

Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances

no code implementations 3 Oct 2023 Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar

For this method, we prove that a bandit online learning algorithm -- using only the number of iterations as feedback -- can select parameters for a sequence of instances such that the overall cost approaches that of the best fixed relaxation parameter $\omega$ as the sequence length increases.
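A rough sketch of this setup: a bandit learner picks the relaxation parameter omega for each linear system in a sequence, observing only the number of successive over-relaxation (SOR) iterations needed to converge. Epsilon-greedy below stands in for the paper's bandit algorithm, and the diagonally dominant systems are synthetic placeholders.

```python
# Hedged sketch: bandit selection of the SOR relaxation parameter across instances.
import numpy as np

rng = np.random.default_rng(0)

def sor_iterations(A, b, omega, tol=1e-8, max_iter=500):
    """Run SOR until the relative residual is small; return the iteration count."""
    n = len(b)
    x = np.zeros(n)
    for it in range(1, max_iter + 1):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return it
    return max_iter

omegas = np.linspace(0.5, 1.9, 8)   # discretized arms
avg_cost = np.zeros(len(omegas))    # running mean iteration count per arm
counts = np.zeros(len(omegas))
eps = 0.2

for t in range(100):
    # A fresh (but related) symmetric diagonally dominant instance each round.
    M = rng.normal(size=(30, 30)) * 0.1
    A = M @ M.T + 30 * np.eye(30)
    b = rng.normal(size=30)

    arm = rng.integers(len(omegas)) if rng.random() < eps else int(np.argmin(avg_cost))
    cost = sor_iterations(A, b, omegas[arm])
    counts[arm] += 1
    avg_cost[arm] += (cost - avg_cost[arm]) / counts[arm]

print("selected omega:", omegas[int(np.argmin(avg_cost))])
```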
