no code implementations • 10 Jul 2024 • Stefan Grafberger
In contrast to existing work, our key idea is to extract "logical query plans" from ML pipeline code that relies on popular libraries.
1 code implementation • 30 Apr 2024 • Stefan Grafberger, Paul Groth, Sebastian Schelter
Data scientists develop ML pipelines in an iterative manner: they repeatedly screen a pipeline for potential issues, debug it, and then revise and improve its code according to their findings.
1 code implementation • 6 Jul 2023 • Xiaozhong Lyu, Stefan Grafberger, Samantha Biegel, Shaopeng Wei, Meng Cao, Sebastian Schelter, Ce Zhang
The multilinear extension has exponentially many terms. A key contribution of this paper is a polynomial-time algorithm that, given a retrieval-augmented model with an additive utility function and a validation set, exactly computes the importance of data points in the retrieval corpus using the multilinear extension of the model's utility function.
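
To make the "exponentially many terms" concrete, here is a minimal brute-force sketch of a multilinear extension over a tiny corpus. It is not the paper's polynomial-time algorithm: it enumerates every subset, which is exactly the cost that algorithm avoids. The names `multilinear_extension`, `importance`, `toy_utility`, and `values` are hypothetical. Treating a point's importance as the partial derivative of the extension with respect to that point's weight is an assumption made for illustration, and the toy utility merely stands in for a model's validation utility.

```python
from itertools import combinations

def multilinear_extension(utility, weights):
    """Expected utility when point i is kept independently with probability weights[i].

    Enumerates all 2^n subsets, so it is only usable for very small n.
    """
    n = len(weights)
    total = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            chosen = set(subset)
            prob = 1.0
            for i, w in enumerate(weights):
                prob *= w if i in chosen else (1.0 - w)
            total += prob * utility(chosen)
    return total

def importance(utility, weights, i):
    """Partial derivative of the extension with respect to weights[i].

    The extension is linear in each weight, so the derivative equals the
    difference between fixing weights[i] to 1 and fixing it to 0.
    """
    with_i = list(weights)
    with_i[i] = 1.0
    without_i = list(weights)
    without_i[i] = 0.0
    return (multilinear_extension(utility, with_i)
            - multilinear_extension(utility, without_i))

# Hypothetical corpus: 1.0 marks a helpful point, 0.0 a harmful one.
values = [1.0, 0.0, 1.0]

def toy_utility(chosen):
    """Assumed stand-in utility: mean value of the chosen points, 0 if none."""
    if not chosen:
        return 0.0
    return sum(values[i] for i in chosen) / len(chosen)

weights = [0.5, 0.5, 0.5]
scores = [importance(toy_utility, weights, i) for i in range(len(values))]
print(scores)
```

In this toy setting the harmful point (index 1) receives a negative score and the helpful points receive positive ones. The loop visits 2^n subsets, so the sketch stops being practical after a few dozen points.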