Data Summarization

33 papers with code • 0 benchmarks • 2 datasets

Data Summarization is a central problem in the area of machine learning, where we want to compute a small summary of the data.

Source: How to Solve Fair k-Center in Massive Data Models

Benchmarks

These leaderboards are used to track progress in Data Summarization

No evaluation results yet.

Libraries

Use these libraries to find Data Summarization models and implementations

MikeJaredS/hermiter • 2 papers

Most implemented papers

Soft-Label Dataset Distillation and Text Dataset Distillation

ilia10000/dataset-distillation • 6 Oct 2019

We propose to simultaneously distill both images and their labels, thus assigning each synthetic sample a 'soft' label (a distribution of labels).

Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision

zaeemzadeh/IPM • CVPR 2019

In our algorithm, at each iteration, the maximum information from the structure of the data is captured by one selected sample, and the captured information is neglected in the next iterations by projection on the null-space of previously selected samples.
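The projection-and-matching loop described above can be sketched as follows. This is a minimal illustration, not the code from zaeemzadeh/IPM: the function name `ipm_select`, the use of a full SVD to find the dominant direction, and the tolerance `1e-12` are all assumptions made for the example.

```python
import numpy as np

def ipm_select(X, k):
    """Pick up to k row indices of X by alternating matching and projection.

    Hypothetical helper for illustration; rows of X are the samples.
    """
    A = np.asarray(X, dtype=float).copy()
    selected = []
    for _ in range(k):
        # The leading right singular vector is the direction that captures
        # the most energy of the current (residual) data.
        _, _, vt = np.linalg.svd(A, full_matrices=False)
        v = vt[0]
        # Matching: choose the sample best aligned with that direction.
        norms = np.linalg.norm(A, axis=1)
        safe = np.where(norms > 1e-12, norms, np.inf)  # exhausted rows score 0
        scores = np.abs(A @ v) / safe
        if selected:
            scores[selected] = -1.0  # never pick the same sample twice
        m = int(np.argmax(scores))
        if norms[m] <= 1e-12:
            break  # nothing left to capture
        selected.append(m)
        # Projection: remove the captured information by projecting every
        # row onto the null space of the selected sample.
        u = A[m] / norms[m]
        A = A - np.outer(A @ u, u)
    return selected
```

For three mutually orthogonal samples of decreasing norm, the sketch selects them in order of norm, since each one is exactly the dominant direction of what remains after the previous projection.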

Flexible Dataset Distillation: Learn Labels Instead of Images

ondrejbohdal/label-distillation • 15 Jun 2020

In particular, we study the problem of label distillation, that is, creating synthetic labels for a small set of real images, and show it to be more effective than the prior image-based approach to dataset distillation.

Sequential estimation of Spearman rank correlation using Hermite series estimators

MikeJaredS/hermiter • 11 Dec 2020

To treat the non-stationary setting, we introduce a novel, exponentially weighted estimator for the Spearman rank correlation, which allows the local nonparametric correlation of a bivariate data stream to be tracked.

Sequential Quantiles via Hermite Series Density Estimation

MikeJaredS/hermiter • 17 Jul 2015

These algorithms go beyond existing sequential quantile estimation algorithms in that they allow arbitrary quantiles (as opposed to pre-specified quantiles) to be estimated at any point in time.

Scalable k-Means Clustering via Lightweight Coresets

webis-de/small-text • 27 Feb 2017

Coresets have been successfully used to scale up clustering models to massive data sets.
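A lightweight coreset for k-means can be built with a single importance-sampling pass: each point is drawn with probability that mixes a uniform term with its squared distance to the data mean, and is weighted by the inverse of that probability. The sketch below assumes NumPy and uses a hypothetical function name, `lightweight_coreset`; it is an illustration of the sampling scheme, not the code in the linked repository.

```python
import numpy as np

def lightweight_coreset(X, m, rng=None):
    """Sample m weighted points from X (hypothetical helper for illustration)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mu = X.mean(axis=0)
    d2 = ((X - mu) ** 2).sum(axis=1)  # squared distance to the mean
    total = d2.sum()
    if total > 0:
        # Half uniform mass, half proportional to squared distance.
        q = 0.5 / n + 0.5 * d2 / total
    else:
        q = np.full(n, 1.0 / n)  # all points identical: fall back to uniform
    idx = rng.choice(n, size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])  # inverse-probability weights
    return X[idx], weights
```

The returned points and weights can then be passed to any weighted k-means solver. Because every sampling probability is at least 1/(2n), no weight exceeds 2n/m.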

An Online Algorithm for Nonparametric Correlations

wxiao0421/onlineNPCORR • 5 Dec 2017

This paper investigates the problem of computing nonparametric correlations on the fly for streaming data.

Fair and Diverse DPP-based Data Summarization

DamianStraszak/FairDiverseDPPSampling • ICML 2018

Sampling methods that choose a subset of the data proportional to its diversity in the feature space are popular for data summarization.
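To make "proportional to its diversity" concrete, the sketch below samples a size-k subset with probability proportional to the determinant of the corresponding kernel submatrix, as in a k-DPP. It enumerates every subset, so it only works for tiny inputs, and it does not include the fairness constraints the paper studies. The RBF kernel, the `gamma` parameter, and the function name `sample_k_dpp_bruteforce` are assumptions made for the example.

```python
import itertools
import numpy as np

def sample_k_dpp_bruteforce(X, k, gamma=1.0, rng=None):
    """Brute-force k-DPP sampler (hypothetical helper; tiny inputs only)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    # Assumed similarity kernel: RBF on squared Euclidean distance.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    L = np.exp(-gamma * sq)
    subsets = list(itertools.combinations(range(len(X)), k))
    # det(L_S) is large when the points in S are mutually dissimilar and
    # zero when S contains duplicates.
    dets = np.array([np.linalg.det(L[np.ix_(s, s)]) for s in subsets])
    dets = np.clip(dets, 0.0, None)  # guard against tiny negative round-off
    probs = dets / dets.sum()        # assumes at least one diverse subset
    choice = rng.choice(len(subsets), p=probs)
    return subsets[choice], dict(zip(subsets, probs))
```

With two identical points and one distant point, the pair of duplicates receives essentially zero probability, and the two remaining pairs split the mass evenly.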

A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization

parajain/StructuredData_To_Descriptions • NAACL 2018

Structured data summarization involves generation of natural language summaries from structured input data.

Coverage-Based Designs Improve Sample Mining and Hyper-Parameter Optimization

gowthamasu/Coverage_based_sample_design • 5 Sep 2018

Sampling one or more effective solutions from large search spaces is a recurring idea in machine learning, and sequential optimization has become a popular solution.
