no code implementations • 1 Nov 2024 • Jason Mohoney, Anil Pacaci, Shihabur Rahman Chowdhury, Umar Farooq Minhas, Jeffery Pound, Cedric Renggli, Nima Reyhani, Ihab F. Ilyas, Theodoros Rekatsinas, Shivaram Venkataraman
The prevalence of vector similarity search in modern machine learning applications and the continuously changing nature of data processed by these applications necessitate efficient and effective index maintenance techniques for vector search indexes.
1 code implementation • 19 Jun 2023 • Wenqi Jiang, Shigang Li, Yu Zhu, Johannes De Fine Licht, Zhenhao He, Runbin Shi, Cedric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso
Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents.
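For concreteness, the core scoring step described above can be sketched as a brute-force inner-product search; this is an illustrative sketch only (names and array shapes are ours), whereas production engines rely on approximate nearest-neighbor indexes rather than exhaustive scans.

```python
# Illustrative sketch only (not the paper's system): brute-force vector search that
# scores a query embedding against document embeddings by inner product and
# returns the indices of the top-k most similar documents.
import numpy as np

def top_k_similar(query: np.ndarray, docs: np.ndarray, k: int = 5) -> np.ndarray:
    """query: (d,) vector; docs: (n, d) matrix of document embeddings."""
    scores = docs @ query              # one similarity score per document
    return np.argsort(-scores)[:k]     # indices of the k highest-scoring documents

# Example with random embeddings
rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 128))
query = rng.normal(size=128)
print(top_k_similar(query, docs, k=3))
```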
1 code implementation • 12 Jun 2022 • Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang
In this paper, we first conduct a systematic empirical study on existing data shuffling strategies, which reveals that all existing strategies have room for improvement -- they all suffer in terms of I/O performance or convergence rate.
1 code implementation • 4 Apr 2022 • Cedric Renggli, Xiaozhe Yao, Luka Kolar, Luka Rimanic, Ana Klimovic, Ce Zhang
Transfer learning can be seen as a data- and compute-efficient alternative to training models from scratch.
1 code implementation • 24 Feb 2022 • Cedric Renggli, André Susano Pinto, Neil Houlsby, Basil Mustafa, Joan Puigcerver, Carlos Riquelme
Transformers are widely applied to solve natural language understanding and computer vision tasks.
1 code implementation • LREC 2022 • Thórhildur Thorleiksdóttir, Cedric Renggli, Nora Hollenstein, Ce Zhang
Collecting human judgements is currently the most reliable evaluation method for natural language generation systems.
1 code implementation • 30 Aug 2021 • Cedric Renggli, Luka Rimanic, Nora Hollenstein, Ce Zhang
The Bayes error rate (BER) is a fundamental concept in machine learning that quantifies the best possible accuracy any classifier can achieve on a fixed probability distribution.
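As a reminder of the standard definition (notation is ours, not necessarily the paper's), the BER is the expected error of the Bayes-optimal classifier, so the best achievable accuracy is 1 minus this quantity:

```latex
% Standard definition of the Bayes error rate for a distribution p(x, y)
% over inputs x and labels y (notation is ours, not necessarily the paper's).
\mathrm{BER}
  = \mathbb{E}_{x}\!\left[\, 1 - \max_{y} \, p(y \mid x) \,\right]
  = 1 - \mathbb{E}_{x}\!\left[\, \max_{y} \, p(y \mid x) \,\right]
```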
no code implementations • 17 Feb 2021 • Nora Hollenstein, Cedric Renggli, Benjamin Glaus, Maria Barrett, Marius Troendle, Nicolas Langer, Ce Zhang
In this paper, we present the first large-scale study of systematically analyzing the potential of EEG brain activity data for improving natural language processing tasks, with a special focus on which features of the signal are most beneficial.
no code implementations • 15 Feb 2021 • Cedric Renggli, Luka Rimanic, Nezihe Merve Gürel, Bojan Karlaš, Wentao Wu, Ce Zhang
Developing machine learning models can be seen as a process similar to the one established for traditional software development.
2 code implementations • 16 Oct 2020 • Cedric Renggli, Luka Rimanic, Luka Kolar, Wentao Wu, Ce Zhang
In our experience of working with domain experts who use today's AutoML systems, a common problem we encountered is what we call "unrealistic expectations" -- users face a very challenging task with a noisy data acquisition process, yet are expected to achieve startlingly high accuracy with machine learning (ML).
no code implementations • NeurIPS 2020 • Luka Rimanic, Cedric Renggli, Bo Li, Ce Zhang
This analysis requires an in-depth understanding of the properties that connect the transformed space and the raw feature space.
no code implementations • CVPR 2022 • Cedric Renggli, André Susano Pinto, Luka Rimanic, Joan Puigcerver, Carlos Riquelme, Ce Zhang, Mario Lucic
Transfer learning has recently been popularized as a data-efficient alternative to training models from scratch, in particular for computer vision tasks, where it provides a remarkably solid baseline.
no code implementations • ICLR 2021 • Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Cedric Renggli, André Susano Pinto, Sylvain Gelly, Daniel Keysers, Neil Houlsby
We explore the use of expert representations for transfer with a simple, yet effective, strategy.
Ranked #8 on Image Classification on VTAB-1k (using extra training data)
1 code implementation • 8 Oct 2019 • Maurice Weber, Cedric Renggli, Helmut Grabner, Ce Zhang
To that end, we use a family of loss functions that allows us to optimize deep image compression depending on the observer and to interpolate between human-perceived visual quality and classification accuracy, enabling a more unified view of image compression.
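A minimal sketch of the interpolation idea, assuming a simple convex combination with a hypothetical trade-off weight `lam` (the MSE term below is only a stand-in for a perceptual-quality proxy; this is not the paper's actual loss):

```python
# Minimal sketch (not the paper's loss): interpolate between a perceptual-quality
# proxy on the reconstructed image and the downstream classification loss via a
# hypothetical trade-off weight `lam` in [0, 1].
import torch
import torch.nn.functional as F

def interpolated_loss(reconstruction: torch.Tensor,
                      original: torch.Tensor,
                      logits: torch.Tensor,
                      labels: torch.Tensor,
                      lam: float = 0.5) -> torch.Tensor:
    perceptual = F.mse_loss(reconstruction, original)   # stand-in for perceived visual quality
    task = F.cross_entropy(logits, labels)              # proxy for classification accuracy
    return lam * perceptual + (1.0 - lam) * task        # lam=1: human observer, lam=0: classifier
```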
no code implementations • 1 Mar 2019 • Cedric Renggli, Bojan Karlaš, Bolin Ding, Feng Liu, Kevin Schawinski, Wentao Wu, Ce Zhang
Continuous integration is an indispensable step of modern software engineering practices to systematically manage the life cycles of system development.
no code implementations • 17 Oct 2018 • Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu
Most of today's distributed machine learning systems assume reliable networks: whenever two machines exchange information (e.g., gradients or models), the network should guarantee the delivery of the message.
no code implementations • 22 Feb 2018 • Cedric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh, Torsten Hoefler
This allreduce is the single communication step, and thus the scalability bottleneck, for most machine learning workloads.
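To make the operation concrete, allreduce combines one vector per worker (here by element-wise summation) and delivers the identical result to every worker; the sketch below only simulates these semantics in a single process and is not the paper's sparse allreduce implementation.

```python
# Illustrative simulation of allreduce semantics (element-wise sum across workers,
# identical result on every worker); not the paper's sparse allreduce implementation.
from typing import List

def allreduce_sum(per_worker_grads: List[List[float]]) -> List[float]:
    return [sum(component) for component in zip(*per_worker_grads)]

# Example: 3 workers, each holding a (mostly sparse) 4-dimensional gradient
grads = [[0.1, 0.0, 0.2, 0.0],
         [0.0, 0.3, 0.0, 0.0],
         [0.1, 0.0, 0.0, 0.4]]
print(allreduce_sum(grads))  # -> [0.2, 0.3, 0.2, 0.4]
```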