Data Summarization

33 papers with code • 0 benchmarks • 2 datasets

Data Summarization is a central problem in the area of machine learning, where we want to compute a small summary of the data.

Source: How to Solve Fair k-Center in Massive Data Models

Libraries

Use these libraries to find Data Summarization models and implementations

MikeJaredS/hermiter

2 papers

Most implemented papers

Semi-supervised Batch Active Learning via Bilevel Optimization

zalanborsos/bilevel_coresets • 19 Oct 2020

Active learning is an effective technique for reducing the labeling cost by improving data efficiency.

Very Fast Streaming Submodular Function Maximization

sbuschjaeger/SubmodularStreamingMaximization • 20 Oct 2020

Data summarization has become a valuable tool in understanding even terabytes of data.

Synthetic Dataset Generation of Driver Telematics

sstocksieker/dair • 30 Jan 2021

This article describes techniques employed in the production of a synthetic dataset of driver telematics emulated from a similar real insurance dataset.

Submodlib: A Submodular Optimization Library

decile-team/submodlib • 22 Feb 2022

A recent work has also leveraged submodular functions to propose submodular information measures which have been found to be very useful in solving the problems of guided subset selection and guided summarization.

Group Equality in Adaptive Submodular Maximization

j-yuan/gequality • 7 Jul 2022

In this paper, we study the classic submodular maximization problem subject to a group equality constraint under both non-adaptive and adaptive settings.

Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data

neato47/neural-numeric-to-text-generation • 11 Jul 2022

We examine recurrent, convolutional, and Transformer-based encoder-decoder models to automatically generate natural language summaries from numeric temporal personal health data.

Streaming Algorithms for Diversity Maximization with Fairness Constraints

yhwang1990/code-fdm • 30 Jul 2022

Given a set $X$ of $n$ elements, diversity maximization asks to select a subset $S$ of $k \ll n$ elements with maximum diversity, as quantified by the dissimilarities among the elements in $S$.
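
To make the problem statement concrete, here is a minimal sketch of one common greedy heuristic (farthest-first selection). It assumes diversity is measured as the minimum pairwise Euclidean distance within the subset and that k is at most the number of points; it ignores fairness constraints and streaming, and it is not the algorithm from the paper. The names `farthest_first` and `min_pairwise_distance` are illustrative only.

```python
import math


def min_pairwise_distance(subset):
    """Diversity of a subset (assumed measure): smallest distance between two members."""
    return min(
        math.dist(a, b)
        for i, a in enumerate(subset)
        for b in subset[i + 1:]
    )


def farthest_first(points, k):
    """Greedily pick k points, each time adding the point farthest from the picks so far.

    Assumes 1 <= k <= len(points).
    """
    selected = [points[0]]  # arbitrary starting element
    # dist_to_selected[i] = distance from points[i] to its nearest selected point
    dist_to_selected = [math.dist(p, selected[0]) for p in points]
    while len(selected) < k:
        idx = max(range(len(points)), key=dist_to_selected.__getitem__)
        selected.append(points[idx])
        # Refresh nearest-selected distances with the newly added point.
        dist_to_selected = [
            min(d, math.dist(p, points[idx]))
            for d, p in zip(dist_to_selected, points)
        ]
    return selected


X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
S = farthest_first(X, 3)
print(S, min_pairwise_distance(S))
```

On this toy input the selection spreads across the three clusters of points rather than picking near-duplicates.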

Balancing Utility and Fairness in Submodular Maximization (Technical Report)

yhwang1990/code-bsm-release • 2 Nov 2022

Submodular function maximization is a fundamental combinatorial optimization problem with plenty of applications -- including data summarization, influence maximization, and recommendation.
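
As an illustration of how submodular maximization yields a summary, the following sketch runs the classic greedy rule under a cardinality constraint on a set-coverage objective, a standard example of a monotone submodular function. The objective, the toy data, and the name `greedy_max_coverage` are assumptions made for this example; the sketch does not include the fairness trade-off studied in the report.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets, each time taking the one that covers the most new items."""
    selected = []    # indices of chosen sets, in pick order
    covered = set()  # union of the chosen sets
    for _ in range(min(k, len(sets))):
        best, best_gain = None, 0
        for i, s in enumerate(sets):
            if i in selected:
                continue
            gain = len(s - covered)  # marginal gain of adding set i
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no remaining set adds anything new
            break
        selected.append(best)
        covered |= sets[best]
    return selected, covered


# Toy "documents", each described by the topics it mentions.
docs = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen, covered = greedy_max_coverage(docs, 2)
print(chosen, covered)
```

Here two documents suffice to cover all seven topics, which is the sense in which the chosen subset summarizes the collection.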

Black-box Coreset Variational Inference

facebookresearch/blackbox-coresets-vi • 4 Nov 2022

Recent advances in coreset methods have shown that a selection of representative datapoints can replace massive volumes of data for Bayesian inference, preserving the relevant statistical information and significantly accelerating subsequent downstream tasks.

MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering

huggingface/transformers • 19 Dec 2022

Visual language data such as plots, charts, and infographics are ubiquitous in the human world.
