Search Results for author: Cedric Renggli

Found 16 papers, 8 papers with code

Co-design Hardware and Algorithm for Vector Search

1 code implementation · 19 Jun 2023 · Wenqi Jiang, Shigang Li, Yu Zhu, Johannes De Fine Licht, Zhenhao He, Runbin Shi, Cedric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso

Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents.
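The core operation described in this abstract can be illustrated with a minimal brute-force sketch: scoring a query embedding against a set of document embeddings by cosine similarity and returning the top-k matches. This is only an illustrative baseline, not the hardware-accelerated method the paper proposes; the array shapes and sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 64))                  # hypothetical document embeddings
docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # normalize rows for cosine similarity

def top_k(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k documents most similar to `query`."""
    q = query / np.linalg.norm(query)
    scores = docs @ q                # one cosine similarity per document
    return np.argsort(-scores)[:k]  # indices sorted by descending score

hits = top_k(rng.normal(size=64))
```

Real systems avoid this exhaustive scan with approximate indexes (e.g. product quantization or graph-based search), which is where hardware/algorithm co-design becomes relevant.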

Information Retrieval

Stochastic Gradient Descent without Full Data Shuffle

1 code implementation · 12 Jun 2022 · Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang

In this paper, we first conduct a systematic empirical study on existing data shuffling strategies, which reveals that all existing strategies have room for improvement -- they all suffer in terms of I/O performance or convergence rate.
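One common compromise between a full shuffle (good convergence, poor I/O) and a sequential scan (good I/O, poor convergence) is a block-level partial shuffle: read data in sequential blocks and randomize only the block order plus the contents of each block. The sketch below illustrates that general idea; it is not necessarily the exact strategy the paper proposes.

```python
import random

def partial_shuffle(n_items: int, block_size: int, seed: int = 0):
    """Yield a permutation of range(n_items) using only block-local randomness."""
    rng = random.Random(seed)
    blocks = [list(range(i, min(i + block_size, n_items)))
              for i in range(0, n_items, block_size)]
    rng.shuffle(blocks)          # cheap: randomize the order blocks are visited
    for block in blocks:
        rng.shuffle(block)       # cheap: randomize items within one in-memory block
        yield from block

order = list(partial_shuffle(10, block_size=4))
```

Each block is read sequentially from storage, so the I/O pattern stays friendly while the visiting order is still randomized at two levels.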

Computational Efficiency

SHiFT: An Efficient, Flexible Search Engine for Transfer Learning

1 code implementation · 4 Apr 2022 · Cedric Renggli, Xiaozhe Yao, Luka Kolar, Luka Rimanic, Ana Klimovic, Ce Zhang

Transfer learning can be seen as a data- and compute-efficient alternative to training models from scratch.

Transfer Learning

Dynamic Human Evaluation for Relative Model Comparisons

1 code implementation · LREC 2022 · Thórhildur Thorleiksdóttir, Cedric Renggli, Nora Hollenstein, Ce Zhang

Collecting human judgements is currently the most reliable evaluation method for natural language generation systems.

Text Generation

Evaluating Bayes Error Estimators on Real-World Datasets with FeeBee

1 code implementation · 30 Aug 2021 · Cedric Renggli, Luka Rimanic, Nora Hollenstein, Ce Zhang

The Bayes error rate (BER) is a fundamental concept in machine learning that quantifies the best possible accuracy any classifier can achieve on a fixed probability distribution.
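For a known discrete joint distribution the BER mentioned in this abstract can be computed exactly as BER = Σ_x P(x)·(1 − max_y P(y|x)), the error of the Bayes-optimal classifier. The sketch below works through that formula on a small hypothetical distribution; it does not implement the FeeBee estimators from the paper, which target the realistic case where the distribution is unknown.

```python
import numpy as np

# Hypothetical joint distribution P(x, y) over 3 feature values and 2 labels.
joint = np.array([[0.30, 0.10],
                  [0.05, 0.25],
                  [0.15, 0.15]])

p_x = joint.sum(axis=1)              # marginal P(x)
p_y_given_x = joint / p_x[:, None]   # conditional P(y | x)

# Bayes error: probability mass where even the best label choice is wrong.
ber = float(np.sum(p_x * (1.0 - p_y_given_x.max(axis=1))))
print(round(ber, 4))  # 0.3 for this distribution
```

No classifier can beat 1 − BER accuracy on this distribution, which is what makes the BER a useful yardstick for how much headroom a task leaves.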

Decoding EEG Brain Activity for Multi-Modal Natural Language Processing

no code implementations · 17 Feb 2021 · Nora Hollenstein, Cedric Renggli, Benjamin Glaus, Maria Barrett, Marius Troendle, Nicolas Langer, Ce Zhang

In this paper, we present the first large-scale study of systematically analyzing the potential of EEG brain activity data for improving natural language processing tasks, with a special focus on which features of the signal are most beneficial.

BIG-bench Machine Learning · EEG

A Data Quality-Driven View of MLOps

no code implementations · 15 Feb 2021 · Cedric Renggli, Luka Rimanic, Nezihe Merve Gürel, Bojan Karlaš, Wentao Wu, Ce Zhang

Developing machine learning models can be seen as a process similar to the one established for traditional software development.

BIG-bench Machine Learning

Automatic Feasibility Study via Data Quality Analysis for ML: A Case-Study on Label Noise

2 code implementations · 16 Oct 2020 · Cedric Renggli, Luka Rimanic, Luka Kolar, Wentao Wu, Ce Zhang

In our experience of working with domain experts who use today's AutoML systems, a common problem we encountered is what we call "unrealistic expectations": users face a very challenging task with a noisy data acquisition process, yet are expected to achieve startlingly high accuracy with machine learning (ML).

AutoML · BIG-bench Machine Learning

On Convergence of Nearest Neighbor Classifiers over Feature Transformations

no code implementations · NeurIPS 2020 · Luka Rimanic, Cedric Renggli, Bo Li, Ce Zhang

This analysis requires in-depth understanding of the properties that connect both the transformed space and the raw feature space.

Which Model to Transfer? Finding the Needle in the Growing Haystack

no code implementations · CVPR 2022 · Cedric Renggli, André Susano Pinto, Luka Rimanic, Joan Puigcerver, Carlos Riquelme, Ce Zhang, Mario Lucic

Transfer learning has been recently popularized as a data-efficient alternative to training models from scratch, in particular for computer vision tasks where it provides a remarkably solid baseline.

Transfer Learning

Observer Dependent Lossy Image Compression

1 code implementation · 8 Oct 2019 · Maurice Weber, Cedric Renggli, Helmut Grabner, Ce Zhang

To that end, we use a family of loss functions that allows us to optimize deep image compression depending on the observer and to interpolate between human-perceived visual quality and classification accuracy, enabling a more unified view of image compression.

Classification · General Classification

Continuous Integration of Machine Learning Models with ease.ml/ci: Towards a Rigorous Yet Practical Treatment

no code implementations · 1 Mar 2019 · Cedric Renggli, Bojan Karlaš, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, Ce Zhang

Continuous integration is an indispensable step of modern software engineering practices to systematically manage the life cycles of system development.

BIG-bench Machine Learning

Distributed Learning over Unreliable Networks

no code implementations · 17 Oct 2018 · Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu

Most of today's distributed machine learning systems assume reliable networks: whenever two machines exchange information (e.g., gradients or models), the network should guarantee the delivery of the message.
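One simple way to tolerate the message loss this abstract describes is to fall back on the last successfully received gradient from each worker when a fresh one is dropped. The sketch below illustrates that fallback idea on scalar gradients; the drop probability, worker count, and fallback rule are all hypothetical and are not necessarily the algorithm the paper analyzes.

```python
import random

def robust_average(new_grads, last_grads, drop_prob=0.3, seed=0):
    """Average worker gradients; a dropped message is replaced by that
    worker's last received gradient instead of stalling the step."""
    rng = random.Random(seed)
    received = [g if rng.random() > drop_prob else last_grads[i]
                for i, g in enumerate(new_grads)]
    return sum(received) / len(received)

# Three workers send scalar gradients; stale values stand in for losses.
avg = robust_average([1.0, 2.0, 3.0], last_grads=[0.0, 0.0, 0.0])
```

The averaged update then remains well-defined on every step even though individual messages may never arrive.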

BIG-bench Machine Learning
