1 code implementation • 3 Oct 2024 • Małgorzata Łazuka, Andreea Anghel, Thomas Parnell
The performance of an LLM inference service is largely determined by the hardware on which it is deployed, but understanding which hardware will satisfy given performance requirements remains challenging.
1 code implementation • 29 Apr 2024 • Davis Wertheimer, Joshua Rosenkranz, Thomas Parnell, Sahil Suneja, Pavithra Ranganathan, Raghu Ganti, Mudhakar Srivatsa
This technical report describes the design and training of novel speculative decoding draft models for accelerating the inference speed of large language models in a production environment.
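At a high level, speculative decoding pairs a small draft model with the large target model: the draft cheaply proposes a short block of tokens, the target verifies them in a single forward pass, and the longest agreeing prefix is accepted. The following is a minimal sketch of that greedy propose/verify loop; the `draft.greedy_generate` and `target.greedy_predictions` interfaces are assumptions made for illustration and do not reflect the report's actual implementation.

```python
def speculative_decode(draft, target, prefix, max_new_tokens=128, block=4):
    """Greedy speculative-decoding sketch: the draft proposes, the target verifies."""
    tokens = list(prefix)
    generated = 0
    while generated < max_new_tokens:
        # 1. The small draft model cheaply proposes `block` candidate tokens.
        candidates = draft.greedy_generate(tokens, num_tokens=block)

        # 2. The large target model checks all candidates in one forward pass,
        #    returning its own greedy prediction at each candidate position.
        predictions = target.greedy_predictions(tokens, candidates)

        # 3. Accept candidates while they agree with the target; the first
        #    disagreeing position is replaced by the target's token, so at
        #    least one token is produced per iteration.
        accepted = []
        for cand, pred in zip(candidates, predictions):
            accepted.append(pred)
            if cand != pred:
                break
        tokens.extend(accepted)
        generated += len(accepted)
    return tokens
```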
no code implementations • 20 Apr 2022 • Małgorzata Łazuka, Thomas Parnell, Andreea Anghel, Haralampos Pozidis
Our experiments indicate that (a) many state-of-the-art cloud configuration solutions can be adapted to multi-cloud, with the best results obtained for adaptations that utilize the hierarchical structure of the multi-cloud configuration domain, (b) hierarchical methods from AutoML can be used for the multi-cloud configuration task and can outperform state-of-the-art cloud configuration solutions, and (c) CB achieves competitive or lower regret relative to the other tested algorithms, whilst also identifying configurations that have 65% lower median cost and 20% lower median time in production, compared to choosing a random provider and configuration.
no code implementations • 29 Sep 2021 • Małgorzata Łazuka, Thomas Parnell, Andreea Anghel, Haralampos Pozidis
Multi-cloud computing has become increasingly popular with enterprises looking to avoid vendor lock-in.
no code implementations • 16 Nov 2020 • Thomas Schmied, Diego Didona, Andreas Döring, Thomas Parnell, Nikolas Ioannou
Machine learning (ML) methods have recently emerged as an effective way to perform automated parameter tuning of databases.
2 code implementations • NeurIPS 2020 • Thomas Parnell, Andreea Anghel, Małgorzata Łazuka, Nikolas Ioannou, Sebastian Kurella, Peshal Agarwal, Nikolaos Papandreou, Haralampos Pozidis
At each boosting iteration, the goal is to find the base hypothesis, selected from some base hypothesis class, that is closest to the Newton descent direction in a Euclidean sense.
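Written out, with g_i and H_i denoting the first and second derivatives of the loss at example i under the current ensemble, this selection rule amounts to a least-squares fit to the per-example Newton steps. The formulation below is the standard way of expressing that criterion and is given as an illustration rather than quoted from the paper.

```latex
h^{(t)} \;=\; \arg\min_{h \in \mathcal{H}}\;
  \sum_{i=1}^{n} \Big( h(x_i) - \big(-\,g_i / H_i\big) \Big)^{2},
\qquad
g_i = \frac{\partial \ell(y_i, \hat{y}_i)}{\partial \hat{y}_i},\quad
H_i = \frac{\partial^2 \ell(y_i, \hat{y}_i)}{\partial \hat{y}_i^{2}} .
```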
no code implementations • 12 Jun 2020 • Georgios Damaskinos, Celestine Mendler-Dünner, Rachid Guerraoui, Nikolaos Papandreou, Thomas Parnell
In this paper we tackle the challenge of making the stochastic coordinate descent algorithm differentially private.
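A common recipe for making iterative updates differentially private is to clip each per-example contribution and add calibrated Gaussian noise before the update is applied. The sketch below applies that generic recipe to a ridge-regression coordinate update; the clipping bound, noise scale, and update form are illustrative assumptions and the privacy accounting is omitted, so it should not be read as the paper's mechanism.

```python
import numpy as np

def dp_coordinate_descent(X, y, lam=1.0, epochs=10, clip=1.0, sigma=1.0, rng=None):
    """Illustrative noisy coordinate descent for ridge regression.

    Each coordinate update is built from clipped per-example gradient
    contributions plus Gaussian noise (the standard Gaussian-mechanism recipe).
    """
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for j in rng.permutation(d):
            residual = X @ w - y
            # Per-example contributions to the j-th coordinate gradient, clipped.
            contrib = np.clip(X[:, j] * residual, -clip, clip)
            grad_j = contrib.sum() / n + lam * w[j]
            # Gaussian noise calibrated to the clipped sensitivity.
            noisy_grad = grad_j + rng.normal(0.0, sigma * clip / n)
            # Coordinate step using the (noise-free) curvature along coordinate j.
            hess_j = (X[:, j] ** 2).sum() / n + lam
            w[j] -= noisy_grad / hess_j
    return w
```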
no code implementations • NeurIPS 2019 • Nikolas Ioannou, Celestine Mendler-Dünner, Thomas Parnell
In this paper we propose a novel parallel stochastic coordinate descent (SCD) algorithm with convergence guarantees that exhibits strong scalability.
no code implementations • 15 Oct 2019 • Andreea Anghel, Nikolas Ioannou, Thomas Parnell, Nikolaos Papandreou, Celestine Mendler-Dünner, Haris Pozidis
In this paper we analyze, evaluate, and improve the performance of training Random Forest (RF) models on modern CPU architectures.
no code implementations • 16 Sep 2019 • Dimitrios Sarigiannis, Thomas Parnell, Haris Pozidis
In this work, we propose a novel sampling distribution as an alternative to uniform sampling and prove theoretically that it has a better chance of finding the best configuration in a worst-case setting.
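For context, vanilla random search draws every trial configuration from a uniform distribution over the search space; the only thing that changes here is the distribution those draws come from. The sketch below keeps the search loop fixed and makes the sampler pluggable; the particular non-uniform sampler shown is an arbitrary illustration, not the distribution proposed and analyzed in the paper.

```python
import random

def random_search(objective, sample_config, n_trials=50, rng=None):
    """Generic random search: only the sampling distribution is pluggable."""
    rng = rng or random.Random(0)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)   # uniform baseline or any alternative distribution
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Uniform baseline vs. an illustrative (assumed) non-uniform sampler, for contrast.
uniform = lambda rng: {"lr": 10 ** rng.uniform(-5, 0), "depth": rng.randint(1, 12)}
skewed  = lambda rng: {"lr": 10 ** rng.triangular(-5, 0, -2), "depth": rng.randint(1, 12)}
```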
no code implementations • 16 Sep 2019 • Johanna Sommer, Dimitrios Sarigiannis, Thomas Parnell
In this short paper we investigate whether meta-learning techniques can be used to more effectively tune the hyperparameters of machine learning models using successive halving (SH).
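Successive halving evaluates many configurations on a small budget, keeps the best-performing fraction, and repeats with an eta-times larger budget until a single configuration survives; meta-learned information from related tasks can then inform which configurations enter the first round. The sketch below shows only the plain SH loop under that standard formulation; how the initial configurations are chosen (uniformly or via meta-learned warm-starting) is left to the caller.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Plain successive halving: evaluate survivors with eta-times more
    budget each round and keep the top 1/eta fraction."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Evaluate every surviving configuration at the current budget
        # (e.g. number of epochs or training subsamples).
        scores = [(evaluate(cfg, budget), cfg) for cfg in survivors]
        scores.sort(key=lambda s: s[0])            # lower score = better
        keep = max(1, len(survivors) // eta)
        survivors = [cfg for _, cfg in scores[:keep]]
        budget *= eta
    return survivors[0]
```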
no code implementations • 11 Sep 2019 • Michael Kaufmann, Kornilios Kourtis, Celestine Mendler-Dünner, Adrian Schüpbach, Thomas Parnell
To address this, we propose Chicle, a new elastic distributed training framework which exploits the nature of machine learning algorithms to implement elasticity and load balancing without micro-tasks.
no code implementations • 8 Jun 2019 • Martino Dazzi, Abu Sebastian, Pier Andrea Francese, Thomas Parnell, Luca Benini, Evangelos Eleftheriou
We show that this communication fabric facilitates the pipelined execution of all state-of-the-art CNNs by proving the existence of a homomorphism between one graph representation of these networks and the proposed graph topology.
no code implementations • 22 Mar 2019 • Alessandro De Palma, Celestine Mendler-Dünner, Thomas Parnell, Andreea Anghel, Haralampos Pozidis
We present Acquisition Thompson Sampling (ATS), a novel technique for batch Bayesian Optimization (BO) based on the idea of sampling multiple acquisition functions from a stochastic process.
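One way to picture the idea: rather than optimizing a single fixed acquisition function, each point in the batch is obtained by maximizing an acquisition whose parameters have been drawn at random, which naturally diversifies the batch. The sketch below uses a scikit-learn Gaussian-process surrogate and a UCB acquisition with a randomly sampled exploration weight purely as stand-ins; the surrogate, the UCB form, and the sampling distribution are assumptions, not the paper's exact construction.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def sampled_acquisition_batch(X_obs, y_obs, candidates, batch_size=4, rng=None):
    """Pick a batch by optimizing randomly sampled acquisition functions."""
    rng = rng or np.random.default_rng(0)
    gp = GaussianProcessRegressor().fit(X_obs, y_obs)
    mu, std = gp.predict(candidates, return_std=True)

    batch = []
    for _ in range(batch_size):
        beta = rng.gamma(shape=2.0, scale=1.0)   # sampled acquisition parameter
        ucb = mu + beta * std                    # optimistic estimate to maximize
        idx = int(np.argmax(ucb))
        batch.append(candidates[idx])
        # Crude diversity: drop the chosen candidate from further consideration.
        mu, std = np.delete(mu, idx), np.delete(std, idx)
        candidates = np.delete(candidates, idx, axis=0)
    return np.array(batch)
```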
no code implementations • 6 Nov 2018 • Michael Kaufmann, Thomas Parnell, Kornilios Kourtis
In this paper we experimentally analyze the convergence behavior of CoCoA and show that the number of workers required to achieve the highest convergence rate at any point in time changes over the course of training.
no code implementations • 5 Nov 2018 • Nikolas Ioannou, Celestine Dünner, Kornilios Kourtis, Thomas Parnell
The combined set of optimizations results in a consistent bottom-line speedup in convergence of up to 12x compared to the initial asynchronous parallel training algorithm, and up to 42x compared to state-of-the-art implementations (scikit-learn and H2O), on a range of multi-core CPU architectures.
no code implementations • 12 Sep 2018 • Andreea Anghel, Nikolaos Papandreou, Thomas Parnell, Alessandro De Palma, Haralampos Pozidis
Gradient boosting decision trees (GBDTs) have seen widespread adoption in academia, industry and competitive data science due to their state-of-the-art performance in many machine learning tasks.
no code implementations • NeurIPS 2018 • Celestine Dünner, Thomas Parnell, Dimitrios Sarigiannis, Nikolas Ioannou, Andreea Anghel, Gummadi Ravi, Madhusudanan Kandasamy, Haralampos Pozidis
We describe a new software framework for fast training of generalized linear models.
1 code implementation • NeurIPS 2017 • Celestine Dünner, Thomas Parnell, Martin Jaggi
We propose a generic algorithmic building block to accelerate training of machine learning models on heterogeneous compute systems.
no code implementations • 22 Feb 2017 • Thomas Parnell, Celestine Dünner, Kubilay Atasu, Manolis Sifalakis, Haris Pozidis
In this work we propose an accelerated stochastic learning system for very large-scale applications.
no code implementations • 5 Dec 2016 • Celestine Dünner, Thomas Parnell, Kubilay Atasu, Manolis Sifalakis, Haralampos Pozidis
We begin by analyzing the characteristics of a state-of-the-art distributed machine learning algorithm implemented in Spark and comparing it to an equivalent reference implementation using the high-performance computing framework MPI.
1 code implementation • 7 Apr 2016 • Reinhard Heckel, Michail Vlachos, Thomas Parnell, Celestine Dünner
We consider the problem of generating interpretable recommendations by identifying overlapping co-clusters of clients and products, based only on positive or implicit feedback.
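Once overlapping co-clusters of clients and products are available, the recommendation and its explanation come directly from co-cluster membership: a client is recommended the products in its co-clusters that it has not yet interacted with, and the co-cluster itself is the interpretable justification. The sketch below shows only that final step over assumed co-cluster assignments; inferring the co-clusters from positive or implicit feedback is the actual subject of the paper and is not reproduced here.

```python
def recommend(client, coclusters, purchased):
    """Recommend unseen products from every co-cluster the client belongs to,
    with the co-cluster index serving as the interpretable explanation.

    `coclusters` is an assumed list of (client_set, product_set) pairs and
    `purchased` maps each client to the set of products with positive feedback.
    """
    recs = {}
    for idx, (clients, products) in enumerate(coclusters):
        if client in clients:
            for p in products - purchased.get(client, set()):
                recs.setdefault(p, []).append(idx)   # explanation: co-cluster ids
    return recs
```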