32 papers with code • 0 benchmarks • 2 datasets
Data summarization is a central problem in machine learning, where the goal is to compute a small summary of the data.
We propose to simultaneously distill both images and their labels, thus assigning each synthetic sample a 'soft' label (a distribution of labels).
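To make the idea of a soft label concrete, here is a minimal sketch of a cross-entropy loss in which each target is a distribution over classes rather than a single class index. The function name `soft_label_cross_entropy` and the NumPy formulation are assumptions for illustration only; the snippet above does not specify the loss or training procedure the authors use.

```python
import numpy as np


def soft_label_cross_entropy(logits, soft_labels):
    """Mean cross-entropy between model logits and soft (distributional) labels.

    logits:      array of shape (n_samples, n_classes), unnormalized scores.
    soft_labels: array of the same shape; each row is a distribution over
                 classes (non-negative, sums to 1). Hypothetical helper,
                 not taken from the paper.
    """
    logits = np.asarray(logits, dtype=float)
    soft_labels = np.asarray(soft_labels, dtype=float)
    # Numerically stable log-softmax: subtract the row maximum first.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Each class contributes in proportion to its soft-label mass.
    return float(-(soft_labels * log_probs).sum(axis=1).mean())


# Two synthetic samples with soft labels over three classes.
example_logits = np.zeros((2, 3))
example_labels = np.array([[0.7, 0.2, 0.1],
                           [0.1, 0.1, 0.8]])
example_loss = soft_label_cross_entropy(example_logits, example_labels)
```

With a one-hot row as the label, this reduces to the usual cross-entropy for hard labels, so a soft label can be read as a generalization of a class index.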
Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision
In our algorithm, each iteration selects the one sample that captures the maximum information from the structure of the data, and the captured information is then excluded from subsequent iterations by projecting the data onto the null space of the previously selected samples.
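The select-then-project loop described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes that "maximum information" is measured by alignment with the first right singular vector of the current residual data, and the names `ipm_select`, `residual`, and the tolerance `1e-12` are hypothetical.

```python
import numpy as np


def ipm_select(data, k):
    """Pick up to k representative row indices from data (n_samples x n_features).

    Assumed criterion: at each step, choose the row best aligned with the
    leading right singular vector of the residual, then project every row
    onto the null space of the chosen row.
    """
    residual = np.asarray(data, dtype=float).copy()
    available = np.ones(len(residual), dtype=bool)
    selected = []
    for _ in range(k):
        # Leading direction of the structure that is still unexplained.
        _, _, vt = np.linalg.svd(residual, full_matrices=False)
        direction = vt[0]
        norms = np.linalg.norm(residual, axis=1)
        usable = available & (norms > 1e-12)
        if not usable.any():
            break  # nothing left to explain
        # Absolute cosine between each usable row and the leading direction.
        scores = np.full(len(residual), -1.0)
        scores[usable] = np.abs(residual[usable] @ direction) / norms[usable]
        best = int(np.argmax(scores))
        selected.append(best)
        available[best] = False
        # Remove the component along the chosen sample from every row.
        unit = residual[best] / norms[best]
        residual = residual - np.outer(residual @ unit, unit)
    return selected


# Rows 0 and 1 are duplicates, so only one of them should be selected.
toy_data = np.array([[1.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0]])
toy_selection = ipm_select(toy_data, 2)
```

Because the projection zeroes out everything already spanned by the chosen samples, near-duplicate rows stop competing after one of them has been picked.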
In particular, we study the problem of label distillation (creating synthetic labels for a small set of real images) and show it to be more effective than the prior image-based approach to dataset distillation.
To treat the non-stationary setting, we introduce a novel, exponentially weighted estimator for the Spearman rank correlation, which allows the local nonparametric correlation of a bivariate data stream to be tracked.
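As a rough illustration of what exponential weighting does to a rank correlation, the sketch below ranks a stored batch of observations and computes a weighted Pearson correlation of those ranks, with weights that decay geometrically with age. It is a naive stand-in and not the estimator the snippet refers to: it stores and re-ranks all observations instead of updating sequentially, and the names `ew_spearman`, `average_ranks`, and the `decay` parameter are assumptions.

```python
import numpy as np


def average_ranks(values):
    """Ranks starting at 1, with ties assigned their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_vals = values[order]
    ranks = np.empty(len(values))
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        # Positions i..j are tied; give them the mean of ranks i+1..j+1.
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0
        i = j + 1
    return ranks


def ew_spearman(x, y, decay=0.95):
    """Exponentially weighted correlation of ranks (batch, illustrative only).

    The most recent pair (last element) gets weight 1 before normalization,
    and a pair that is a steps older gets weight decay**a. With decay=1.0
    this is the ordinary Spearman coefficient. Inputs must not be constant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    weights = decay ** np.arange(n - 1, -1, -1)
    weights = weights / weights.sum()
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = weights @ rx, weights @ ry
    cov = weights @ ((rx - mx) * (ry - my))
    var_x = weights @ (rx - mx) ** 2
    var_y = weights @ (ry - my) ** 2
    return float(cov / np.sqrt(var_x * var_y))


stream_x = np.arange(10.0)
rho_increasing = ew_spearman(stream_x, stream_x ** 3)
rho_decreasing = ew_spearman(stream_x, -stream_x)
```

A smaller `decay` makes the estimate follow recent behaviour more closely, which is the property that matters in a non-stationary stream; a sequential estimator would aim for the same effect without revisiting past data.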
These algorithms go beyond existing sequential quantile estimation algorithms in that they allow arbitrary quantiles (as opposed to pre-specified quantiles) to be estimated at any point in time.
Structured data summarization involves generation of natural language summaries from structured input data.
Sampling one or more effective solutions from large search spaces is a recurring idea in machine learning, and sequential optimization has become a popular solution.