Search Results for author: Josif Grabocka

Found 37 papers, 19 papers with code

Multi-objective Differentiable Neural Architecture Search

1 code implementation • 28 Feb 2024 • Rhea Sanjay Sukthanker, Arber Zela, Benedikt Staffler, Samuel Dooley, Josif Grabocka, Frank Hutter

Pareto front profiling in multi-objective optimization (MOO), i.e., finding a diverse set of Pareto optimal solutions, is challenging, especially with expensive objectives like neural network training.

Machine Translation · Neural Architecture Search

Hierarchical Transformers are Efficient Meta-Reinforcement Learners

no code implementations • 9 Feb 2024 • Gresa Shala, André Biedenkapp, Josif Grabocka

We introduce Hierarchical Transformers for Meta-Reinforcement Learning (HTrMRL), a powerful online meta-reinforcement learning approach.

Meta Reinforcement Learning · reinforcement-learning

Tabular Data: Is Attention All You Need?

no code implementations • 6 Feb 2024 • Guri Zabërgja, Arlind Kadra, Josif Grabocka

In this paper, we introduce a large-scale empirical study comparing not only neural networks against gradient-boosted decision trees on tabular data, but also transformer-based architectures against traditional multi-layer perceptrons (MLPs) with residual connections.

Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How

1 code implementation • 6 Jun 2023 • Sebastian Pineda Arango, Fabio Ferreira, Arlind Kadra, Frank Hutter, Josif Grabocka

With the ever-increasing number of pretrained models, machine learning practitioners are continually faced with the question of which pretrained model to use, and how to finetune it for a new dataset.

Hyperparameter Optimization · Image Classification

Deep Pipeline Embeddings for AutoML

1 code implementation • 23 May 2023 • Sebastian Pineda Arango, Josif Grabocka

As a remedy, this paper proposes a novel neural architecture that captures the deep interaction between the components of a Machine Learning pipeline.

Automatic Machine Learning · Model Selection · Bayesian Optimization +2

Breaking the Paradox of Explainable Deep Learning

1 code implementation • 22 May 2023 • Arlind Kadra, Sebastian Pineda Arango, Josif Grabocka

Through extensive experiments, we demonstrate that our explainable deep networks are as accurate as state-of-the-art classifiers on tabular data.

Phantom Embeddings: Using Embedding Space for Model Regularization in Deep Neural Networks

no code implementations • 14 Apr 2023 • Mofassir ul Islam Arif, Mohsan Jameel, Josif Grabocka, Lars Schmidt-Thieme

We create phantom embeddings from a subset of homogeneous samples and use these phantom embeddings to decrease the inter-class similarity of instances in their latent embedding space.

Image Classification
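
One plausible reading of the stated mechanism, offered only as an assumption and not as the authors' definition: average the embeddings of a homogeneous (same-class) subset into a "phantom" prototype, then pull instances toward their own class phantom and push them away from the other classes' phantoms.

    # Hedged sketch of the stated idea: class-mean "phantom" embeddings used
    # to regularize the latent space. The exact construction below is an
    # assumption for illustration, not the paper's definition.
    import torch
    import torch.nn.functional as F

    def phantom_regularizer(emb, labels):
        # emb: (batch, dim) latent embeddings; labels: (batch,) class ids
        loss = emb.new_zeros(())
        for c in labels.unique():
            phantom = emb[labels == c].mean(dim=0, keepdim=True)  # class phantom
            sims = F.cosine_similarity(emb, phantom.expand_as(emb), dim=1)
            same = (labels == c).float()
            # pull same-class embeddings toward the phantom, push others away
            loss = loss + (same * (1 - sims) + (1 - same) * sims.clamp(min=0)).mean()
        return loss / len(labels.unique())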

Deep Ranking Ensembles for Hyperparameter Optimization

1 code implementation • 27 Mar 2023 • Abdus Salam Khazi, Sebastian Pineda Arango, Josif Grabocka

Automatically optimizing the hyperparameters of Machine Learning algorithms is one of the primary open questions in AI.

Hyperparameter Optimization · Learning-To-Rank

Zero-Shot AutoML with Pretrained Models

1 code implementation • 16 Jun 2022 • Ekrem Öztürk, Fabio Ferreira, Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka, Frank Hutter

Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small?

AutoML · Meta-Learning

Supervising the Multi-Fidelity Race of Hyperparameter Configurations

1 code implementation • 20 Feb 2022 • Martin Wistuba, Arlind Kadra, Josif Grabocka

Multi-fidelity (gray-box) hyperparameter optimization techniques (HPO) have recently emerged as a promising direction for tuning Deep Learning methods.

Bayesian Optimization · Gaussian Processes +1

Transformers Can Do Bayesian Inference

1 code implementation • ICLR 2022 • Samuel Müller, Noah Hollmann, Sebastian Pineda Arango, Josif Grabocka, Frank Hutter

Our method restates the objective of posterior approximation as a supervised classification problem with a set-valued input: it repeatedly draws a task (or function) from the prior, draws a set of data points and their labels from it, masks one of the labels and learns to make probabilistic predictions for it based on the set-valued input of the rest of the data points.

AutoML · Bayesian Inference +2
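
The training procedure described above lends itself to a short sketch. Below is a minimal, illustrative reading of one prior-fitting step in PyTorch; `model` (a transformer accepting a set-valued context) and `sample_task_from_prior` are assumed placeholders, not the authors' code.

    # One prior-fitting step, following the abstract: draw a task from the
    # prior, draw points and labels, mask one label, and predict it from the
    # set-valued input of the rest. `model` and `sample_task_from_prior` are
    # illustrative assumptions.
    import torch
    import torch.nn.functional as F

    def prior_fitting_step(model, sample_task_from_prior, optimizer, n_points=64):
        x, y = sample_task_from_prior(n_points)      # points and labels from the prior
        idx = int(torch.randint(n_points, (1,)))     # mask one of the labels
        keep = torch.ones(n_points, dtype=torch.bool)
        keep[idx] = False
        logits = model(x[keep], y[keep], x[idx:idx + 1])  # set-valued context
        loss = F.cross_entropy(logits, y[idx:idx + 1])    # probabilistic prediction
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return float(loss)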

Multi-task problems are not multi-objective

1 code implementation • 14 Oct 2021 • Michael Ruchte, Josif Grabocka

These works also use Multi-Task Learning (MTL) problems to benchmark MOO algorithms, treating each task as an independent objective.

BIG-bench Machine Learning · Multi-Task Learning

Transfer Learning for Bayesian HPO with End-to-End Meta-Features

no code implementations • 29 Sep 2021 • Hadi Samer Jomaa, Sebastian Pineda Arango, Lars Schmidt-Thieme, Josif Grabocka

As a result, our novel DKLM can learn contextualized dataset-specific similarity representations for hyperparameter configurations.

Hyperparameter Optimization · Transfer Learning

Well-tuned Simple Nets Excel on Tabular Datasets

1 code implementation • NeurIPS 2021 • Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka

Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures.

HPO-B: A Large-Scale Reproducible Benchmark for Black-Box HPO based on OpenML

1 code implementation • 11 Jun 2021 • Sebastian Pineda Arango, Hadi S. Jomaa, Martin Wistuba, Josif Grabocka

Hyperparameter optimization (HPO) is a core problem for the machine learning community and remains largely unsolved due to the significant computational resources required to evaluate hyperparameter configurations.

Hyperparameter Optimization · Transfer Learning

Scalable Pareto Front Approximation for Deep Multi-Objective Learning

1 code implementation • 24 Mar 2021 • Michael Ruchte, Josif Grabocka

Prior work either demands optimizing a new network for every point on the Pareto front, or induces a large overhead in the number of trainable parameters by using hyper-networks conditioned on modifiable preferences.
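
In contrast to the hyper-network approach criticized above, a single network can be conditioned on a sampled preference vector and trained on the preference-weighted loss. The sketch below is a hedged illustration of that idea; conditioning by input concatenation and plain linear scalarization are assumptions for illustration, not necessarily the paper's exact scheme.

    # Sketch of preference-conditioned multi-objective training: one network,
    # conditioned on a sampled preference vector, optimizes the weighted loss.
    import torch

    def train_step(model, optimizer, x, targets, loss_fns):
        # sample a preference vector and append it to every input row
        lam = torch.distributions.Dirichlet(torch.ones(len(loss_fns))).sample()
        lam_in = lam.repeat(x.size(0), 1)
        out = model(torch.cat([x, lam_in], dim=1))
        # linear scalarization of the objectives under the sampled preference
        loss = sum(w * fn(out, t) for w, fn, t in zip(lam, loss_fns, targets))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return lam, float(loss)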

Hyperparameter Optimization with Differentiable Metafeatures

no code implementations • 7 Feb 2021 • Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka

In contrast to existing models, DMFBS i) integrates a differentiable metafeature extractor and ii) is optimized using a novel multi-task loss, linking manifold regularization with a dataset similarity measure learned via an auxiliary dataset identification meta-task, effectively enforcing that the response approximations for similar datasets are similar.

Hyperparameter Optimization

Few-Shot Bayesian Optimization with Deep Kernel Surrogates

1 code implementation • ICLR 2021 • Martin Wistuba, Josif Grabocka

Hyperparameter optimization (HPO) is a central pillar in the automation of machine learning solutions and is mainly performed via Bayesian optimization, where a parametric surrogate is learned to approximate the black box response function (e.g., the validation error).

Bayesian Optimization · Few-Shot Learning +2

Zero-shot Transfer Learning for Gray-box Hyper-parameter Optimization

no code implementations • 1 Jan 2021 • Hadi Samer Jomaa, Lars Schmidt-Thieme, Josif Grabocka

Zero-shot hyper-parameter optimization refers to the process of selecting hyper-parameter configurations that are expected to perform well for a given dataset upfront, without access to any observations of the losses of the target response.

Bayesian Optimization · Transfer Learning
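
For intuition, a minimal zero-shot baseline picks the configuration with the best average rank across meta-training datasets, with no evaluations on the target dataset. The paper itself learns this selection from dataset meta-features, so the sketch below shows only the generic idea.

    # Generic zero-shot HPO baseline (illustrative only): return the
    # configuration with the best average rank on meta-training datasets,
    # without any trials on the target dataset.
    import numpy as np

    def zero_shot_config(meta_losses):
        # meta_losses: (n_datasets, n_configs) losses observed on meta-train data
        ranks = meta_losses.argsort(axis=1).argsort(axis=1)   # per-dataset ranks
        return int(ranks.mean(axis=0).argmin())               # best average rank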

Regularization Cocktails

no code implementations • 1 Jan 2021 • Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka

The regularization of prediction models is arguably the most crucial ingredient that allows Machine Learning solutions to generalize well on unseen data.

Hyperparameter Optimization

NASLib: A Modular and Flexible Neural Architecture Search Library

1 code implementation • 1 Jan 2021 • Michael Ruchte, Arber Zela, Julien Niklas Siems, Josif Grabocka, Frank Hutter

Neural Architecture Search (NAS) is one of the focal points for the Deep Learning community, but reproducing NAS methods is extremely challenging due to numerous low-level implementation details.

Neural Architecture Search

HIDRA: Head Initialization across Dynamic targets for Robust Architectures

1 code implementation • 28 Oct 2019 • Rafael Rego Drumond, Lukas Brinkmeyer, Josif Grabocka, Lars Schmidt-Thieme

In this paper, we present HIDRA, a meta-learning approach that enables training and evaluating across tasks with any number of target variables.

Meta-Learning

Chameleon: Learning Model Initializations Across Tasks With Different Schemas

1 code implementation • 30 Sep 2019 • Lukas Brinkmeyer, Rafael Rego Drumond, Randolf Scholz, Josif Grabocka, Lars Schmidt-Thieme

Parametric models, and particularly neural networks, require weight initialization as a starting point for gradient-based optimization.

Meta-Learning

Atomic Compression Networks

no code implementations • 25 Sep 2019 • Jonas Falkner, Josif Grabocka, Lars Schmidt-Thieme

Compressed forms of deep neural networks are essential in deploying large-scale computational models on resource-constrained devices.

Model Compression

Hyp-RL: Hyperparameter Optimization by Reinforcement Learning

1 code implementation • 27 Jun 2019 • Hadi S. Jomaa, Josif Grabocka, Lars Schmidt-Thieme

More recently, methods have been introduced that build a so-called surrogate model, which predicts the validation loss for a specific hyperparameter setting, model, and dataset, and then sequentially select the next hyperparameter to test based on a heuristic function of the expected value and the uncertainty of the surrogate model, called the acquisition function (sequential model-based Bayesian optimization, SMBO).

Bayesian Optimization · Hyperparameter Optimization +2
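
The SMBO loop summarized above can be sketched generically: fit a surrogate to the observed losses, score candidates with an acquisition function, evaluate the winner, and repeat. The sketch below uses a Gaussian-process surrogate and a lower-confidence-bound acquisition as illustrative stand-ins; `evaluate` and the candidate grid are placeholders, and this is not the paper's Hyp-RL agent.

    # Generic SMBO loop over a finite candidate grid of hyperparameter
    # vectors. `evaluate` returns the validation loss of a configuration.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def smbo(candidates, evaluate, n_init=5, n_iters=20):
        rng = np.random.default_rng(0)
        picked = list(rng.choice(len(candidates), n_init, replace=False))
        losses = [evaluate(candidates[i]) for i in picked]
        for _ in range(n_iters):
            gp = GaussianProcessRegressor(normalize_y=True)
            gp.fit(candidates[picked], np.asarray(losses))
            mu, sigma = gp.predict(candidates, return_std=True)
            acq = mu - sigma                 # lower confidence bound
            acq[picked] = np.inf             # do not re-evaluate observed configs
            nxt = int(np.argmin(acq))
            picked.append(nxt)
            losses.append(evaluate(candidates[nxt]))
        return candidates[picked[int(np.argmin(losses))]]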

In Hindsight: A Smooth Reward for Steady Exploration

no code implementations • 24 Jun 2019 • Hadi S. Jomaa, Josif Grabocka, Lars Schmidt-Thieme

In classical Q-learning, the objective is to maximize the sum of discounted rewards by iteratively applying the Bellman equation as an update, in an attempt to estimate the action-value function of the optimal policy.

Atari Games · Q-Learning
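
The Bellman update referred to above is the standard tabular Q-learning rule:

    # Standard tabular Q-learning update; alpha is the learning rate,
    # gamma the discount factor.
    import numpy as np

    def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
        # move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a')
        td_target = r + gamma * np.max(Q[s_next])
        Q[s, a] += alpha * (td_target - Q[s, a])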

Dataset2Vec: Learning Dataset Meta-Features

1 code implementation • 27 May 2019 • Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka

As a data-driven approach, meta-learning requires meta-features that represent the primary learning tasks or datasets, and these are traditionally estimated as engineered dataset statistics that require expert domain knowledge tailored to every meta-task.

Auxiliary Learning · Few-Shot Learning +1
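
The engineered dataset statistics contrasted here are hand-crafted summaries of a dataset, for example (an illustrative selection, not the paper's list):

    # A few traditional engineered meta-features: the kind of hand-crafted
    # statistics Dataset2Vec aims to replace with learned representations.
    import numpy as np

    def engineered_meta_features(X, y):
        classes, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return {
            "n_instances": int(X.shape[0]),
            "n_features": int(X.shape[1]),
            "n_classes": int(len(classes)),
            "class_entropy": float(-(p * np.log(p)).sum()),
            "mean_feature_std": float(X.std(axis=0).mean()),
        }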

Learning Surrogate Losses

no code implementations • 24 May 2019 • Josif Grabocka, Randolf Scholz, Lars Schmidt-Thieme

Ultimately, the surrogate losses are learned jointly with the prediction model via bilevel optimization.

Bilevel Optimization · General Classification

Multi-Label Network Classification via Weighted Personalized Factorizations

no code implementations • 25 Feb 2019 • Ahmed Rashed, Josif Grabocka, Lars Schmidt-Thieme

It can be formalized as a multi-relational learning task for predicting node labels based on their relations within the network.

Classification · General Classification +2

NeuralWarp: Time-Series Similarity with Warping Networks

2 code implementations • 20 Dec 2018 • Josif Grabocka, Lars Schmidt-Thieme

Research on time-series similarity measures has emphasized the need for elastic methods which align the indices of pairs of time series, and a plethora of non-parametric measures have been proposed for the task.

Sentence · Sentence Similarity +2

Channel masking for multivariate time series shapelets

no code implementations • 2 Nov 2017 • Dripta S. Raychaudhuri, Josif Grabocka, Lars Schmidt-Thieme

Time series shapelets are discriminative sub-sequences and their similarity to time series can be used for time series classification.

General Classification · Time Series +2
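
The shapelet-to-series similarity mentioned above is conventionally the minimum Euclidean distance between the shapelet and all sliding windows of the series:

    # Minimal shapelet distance: minimum Euclidean distance between a
    # shapelet and every sliding window of the series (the standard
    # definition for univariate series).
    import numpy as np

    def shapelet_distance(series, shapelet):
        windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
        return float(np.min(np.linalg.norm(windows - shapelet, axis=1)))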

Optimal Time-Series Motifs

no code implementations • 3 May 2015 • Josif Grabocka, Nicolas Schilling, Lars Schmidt-Thieme

We demonstrate that searching is suboptimal, since the domain of motifs is restricted, and instead we propose a principled optimization approach able to find optimal motifs.

Time Series · Time Series Analysis

Ultra-Fast Shapelets for Time Series Classification

no code implementations • 17 Mar 2015 • Martin Wistuba, Josif Grabocka, Lars Schmidt-Thieme

A method for applying shapelets to multivariate time series is proposed, and Ultra-Fast Shapelets is shown to be successful in comparison to state-of-the-art multivariate time series classifiers on 15 multivariate time series datasets from various domains.

Classification · General Classification +3

Scalable Discovery of Time-Series Shapelets

no code implementations • 11 Mar 2015 • Josif Grabocka, Martin Wistuba, Lars Schmidt-Thieme

Time-series classification is an important problem for the data mining community due to the wide range of application domains involving time-series data.

Clustering · General Classification +4

Invariant Factorization Of Time-Series

no code implementations • 23 Dec 2013 • Josif Grabocka, Lars Schmidt-Thieme

Time-series classification is an important domain of machine learning and a plethora of methods have been developed for the task.

Time Series · Time Series Analysis +1

Time-Series Classification Through Histograms of Symbolic Polynomials

no code implementations • 24 Jul 2013 • Josif Grabocka, Martin Wistuba, Lars Schmidt-Thieme

The coefficients of the polynomial functions are converted to symbolic words via equivolume discretizations of the coefficients' distributions.

Classification · Econometrics +4
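
A sketch of the step described above: fit a polynomial to each sliding window, then map each coefficient to a symbol using equivolume (equal-frequency) bins over that coefficient's empirical distribution. Window length, polynomial degree, and alphabet size below are illustrative choices, not the paper's settings.

    # Fit a polynomial per sliding window, then discretize each coefficient
    # into equal-frequency (equivolume) bins to obtain symbolic words.
    import numpy as np

    def window_coefficients(series, window=32, degree=3):
        t = np.arange(window)
        wins = np.lib.stride_tricks.sliding_window_view(series, window)
        return np.array([np.polyfit(t, w, degree) for w in wins])

    def equivolume_symbols(coeffs, n_symbols=4):
        # quantile edges -> equal-frequency bins, one alphabet per coefficient
        edges = np.quantile(coeffs, np.linspace(0, 1, n_symbols + 1)[1:-1], axis=0)
        return np.stack([np.searchsorted(edges[:, j], coeffs[:, j])
                         for j in range(coeffs.shape[1])], axis=1)

A bag-of-words histogram over the resulting symbolic words then represents the series.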
