SketchGraphs is a dataset of 15 million sketches extracted from real-world CAD models intended to facilitate research in both ML-aided design and geometric program induction. Each sketch is represented as a geometric constraint graph where edges denote designer-imposed geometric relationships between primitives, the nodes of the graph.
11 PAPERS • NO BENCHMARKS YET
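As a rough illustration of the constraint-graph representation described above (the class and field names here are assumptions for exposition, not SketchGraphs' actual schema), a sketch can be stored as primitive nodes plus typed constraint edges:

```python
from dataclasses import dataclass, field

@dataclass
class ConstraintGraph:
    primitives: list = field(default_factory=list)   # node payloads, e.g. "line", "arc"
    constraints: list = field(default_factory=list)  # (kind, node_i, node_j) edges

    def add_primitive(self, kind):
        self.primitives.append(kind)
        return len(self.primitives) - 1  # node index of the new primitive

    def add_constraint(self, kind, i, j):
        # A designer-imposed relationship between two primitives.
        self.constraints.append((kind, i, j))

g = ConstraintGraph()
a = g.add_primitive("line")
b = g.add_primitive("line")
g.add_constraint("perpendicular", a, b)
print(g.constraints)  # [('perpendicular', 0, 1)]
```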
The dataset contains constructed multi-modal features (visual and textual), pseudo-labels (on heritage values and attributes), and graph structures (with temporal, social, and spatial links).
1 PAPER • NO BENCHMARKS YET
SciGraphQA is a large-scale, open-domain dataset of multi-turn conversational question-answering dialogues centered on understanding and describing scientific graphs and figures. Each sample in SciGraphQA consists of a scientific graph image sourced from papers on arXiv, accompanied by rich textual context including the paper's title, abstract, figure caption, and a paragraph. The key motivation behind SciGraphQA is to provide a large-scale resource supporting research and development of multi-modal AI systems that can engage in informative, open-ended conversations about graphs. Potential use cases include pre-training and benchmarking multi-modal conversational models for scientific graph comprehension and building AI assistants that can discuss data insights. The academic source material also provides a way to evaluate model capabilities on expert-level graphs spanning diverse topics and complex visual encodings.
3 PAPERS • 1 BENCHMARK
MMKG is a collection of three knowledge graphs for link prediction and entity matching research. Contrary to other knowledge graph datasets, these knowledge graphs contain both numerical features and images for all entities as well as entity alignments between pairs of KGs. The three knowledge graphs augmented with numerical features and images are called FB15k, YAGO15k, and DBPEDIA15k.
42 PAPERS • 5 BENCHMARKS
Chest ImaGenome is a dataset with a scene graph data structure describing 242,072 images. Through a radiologist-constructed CXR ontology, the annotations for each CXR are connected as an anatomy-centered scene graph, useful for image-level reasoning and multimodal fusion applications. Overall, the following are provided: i) 1,256 combinations of relation annotations between 29 CXR anatomical locations (objects with bounding box coordinates) and their attributes, structured as a scene graph; ii) over 670,000 localized comparison relations (improved, worsened, or no change) between the anatomical locations across sequential exams; and iii) a manually annotated gold-standard scene graph.
13 PAPERS • NO BENCHMARKS YET
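To make the anatomy-centered structure concrete, here is a minimal sketch of one such scene graph entry; the field names and bounding-box convention are illustrative assumptions, not the dataset's actual JSON schema:

```python
# Hypothetical entry: anatomical locations as objects with bounding boxes,
# attributes attached to them, and comparison relations to a prior exam.
scene_graph = {
    "image_id": "example_cxr",
    "objects": [
        {"name": "left lung", "bbox": [10, 20, 120, 200]},   # x, y, w, h (assumed)
        {"name": "right lung", "bbox": [140, 20, 120, 200]},
    ],
    "attributes": [
        {"object": "left lung", "attribute": "opacity"},
    ],
    "comparisons": [
        # Localized change relation across sequential exams.
        {"object": "left lung", "relation": "improved", "prior_image": "prior_cxr"},
    ],
}

def attributes_of(graph, obj_name):
    """Collect attribute labels attached to one anatomical location."""
    return [a["attribute"] for a in graph["attributes"] if a["object"] == obj_name]

print(attributes_of(scene_graph, "left lung"))  # ['opacity']
```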
Multi-Modal Hate Speech Detection with Graph Context. 18k+ labels, 8k+ discussions, 900k+ comments.
An update to 3DIdent that introduces six additional object classes (Hare, Dragon, Cow, Armadillo, Horse, and Head) and imposes a causal graph over the latent variables.
12 PAPERS • 1 BENCHMARK
…The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts.
38 PAPERS • 1 BENCHMARK
…for coarse prediction are provided, i.e. photographic vs. non-photographic, and smaller fine-grained prediction tasks where the non-photographic class is broken down into five classes: maps, drawings, graphs
…Along with the road network graph, it includes trip records represented as sequences of visited nodes (making the dataset suitable both for path-blind and path-aware settings).
2 PAPERS • 1 BENCHMARK
…We introduce the new task of multimodal analogical reasoning over knowledge graphs, which requires multimodal reasoning ability with the help of background knowledge.
1 PAPER • 1 BENCHMARK
…It consists of a large corpus of high-resolution satellite imagery and ground truth road network graphs covering the urban core of forty cities across six countries.
37 PAPERS • 2 BENCHMARKS
…The floorplans are annotated with room outline polygons, doors/windows as line segments, object-icons as axis-aligned bounding boxes, room-door-room connectivity graphs, and photo-room assignments. Room-door-room connectivity graphs were generated for each floorplan, and all windows, doors, and other wall openings were annotated and associated with their corresponding rooms.
…For each image-question pair in the CLEVR dataset, CLEVR-X contains multiple structured textual explanations which are derived from the original scene graphs.
4 PAPERS • 1 BENCHMARK
…Each training and validation image is also associated with scene graph annotations describing the classes and attributes of those objects in the scene, and their pairwise relations.
433 PAPERS • 5 BENCHMARKS
The Toulouse Road Network dataset describes patches of road maps from the city of Toulouse, represented both as spatial graphs G = (A, X) and as grayscale segmentation images.
…Specifically, the authors construct a dialog grammar that is grounded in the scene graphs of the images from the CLEVR dataset.
10 PAPERS • NO BENCHMARKS YET
…The CLEVR dataset consists of: a training set of 70k images and 700k questions, a validation set of 15k images and 150k questions, and a test set of 15k images and 150k questions about objects, along with answers and scene graph annotations for the training and validation sets.
577 PAPERS • 2 BENCHMARKS
…Filtering images. The first step filters for images that have meaningful scene graphs and captions. We discard all scene graphs that do not contain any edges; the remaining images pass this filter. Relationships should be verbs and should not contain nouns or pronouns, so we also discard any scene graph containing an edge whose tag is neither a verb nor in an ad-hoc list of allowed non-verb keywords.
1 PAPER • 2 BENCHMARKS
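The two-stage filter described above can be sketched as follows; the field names, allow-list contents, and POS check are assumptions standing in for the paper's actual pipeline (which would use a real part-of-speech tagger):

```python
# Ad-hoc allow-list of non-verb relationship keywords (illustrative only).
ALLOWED_NON_VERBS = {"on", "of", "with"}

def is_verb(tag):
    # Placeholder POS check; a real pipeline would consult a tagger (e.g. spaCy).
    return tag == "VERB"

def keep(scene_graph):
    """Keep a scene graph only if it has edges and every edge label is a verb
    or an allowed non-verb keyword."""
    edges = scene_graph.get("edges", [])
    if not edges:
        return False
    return all(
        is_verb(e["pos"]) or e["label"] in ALLOWED_NON_VERBS
        for e in edges
    )

graphs = [
    {"edges": []},                                    # no edges -> dropped
    {"edges": [{"label": "riding", "pos": "VERB"}]},  # verb edge -> kept
    {"edges": [{"label": "dog", "pos": "NOUN"}]},     # noun edge -> dropped
]
print([keep(g) for g in graphs])  # [False, True, False]
```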
MuMiN is a misinformation graph dataset containing rich social media data (tweets, replies, users, images, articles, hashtags), spanning 21 million tweets belonging to 26 thousand Twitter threads.
4 PAPERS • 3 BENCHMARKS
…We create the puzzles to encompass a diverse array of mathematical and algorithmic topics such as boolean logic, combinatorics, graph theory, optimization, search, etc., aiming to evaluate the gap between
…The datasets are generated by repurposing the Visual Genome scene graphs and region descriptions and applying handcrafted templates and GPT-3.