SketchGraphs is a dataset of 15 million sketches extracted from real-world CAD models intended to facilitate research in both ML-aided design and geometric program induction. Each sketch is represented as a geometric constraint graph where edges denote designer-imposed geometric relationships between primitives, the nodes of the graph.
11 PAPERS • NO BENCHMARKS YET
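As a rough illustration of the constraint-graph representation described above (the class and field names here are assumptions for exposition, not SketchGraphs' actual schema), a sketch can be stored as primitive nodes plus typed constraint edges:

```python
from dataclasses import dataclass, field

@dataclass
class ConstraintGraph:
    primitives: list = field(default_factory=list)   # node payloads, e.g. "line", "arc"
    constraints: list = field(default_factory=list)  # (kind, node_i, node_j) edges

    def add_primitive(self, kind):
        self.primitives.append(kind)
        return len(self.primitives) - 1  # node index of the new primitive

    def add_constraint(self, kind, i, j):
        # A designer-imposed relationship between two primitives.
        self.constraints.append((kind, i, j))

g = ConstraintGraph()
a = g.add_primitive("line")
b = g.add_primitive("line")
g.add_constraint("perpendicular", a, b)
print(g.constraints)  # [('perpendicular', 0, 1)]
```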
The dataset contains constructed multi-modal features (visual and textual), pseudo-labels (on heritage values and attributes), and graph structures (with temporal, social, and spatial links).
1 PAPER • NO BENCHMARKS YET
SciGraphQA is a large-scale, open-domain dataset of multi-turn conversational question-answering dialogues centered on understanding and describing scientific graphs and figures. Each sample in SciGraphQA consists of a scientific graph image sourced from papers on arXiv, accompanied by rich textual context including the paper's title, abstract, figure caption, and a paragraph. The key motivation behind SciGraphQA is to provide a large-scale resource supporting research and development of multi-modal AI systems that can engage in informative, open-ended conversations about graphs. Potential use cases include pre-training and benchmarking multi-modal conversational models for scientific graph comprehension and building AI assistants that can discuss data insights. The academic source material also provides a way to evaluate model capabilities on expert-level graphs spanning diverse topics and complex visual encodings.
3 PAPERS • 1 BENCHMARK
MMKG is a collection of three knowledge graphs for link prediction and entity matching research. Contrary to other knowledge graph datasets, these knowledge graphs contain both numerical features and images for all entities as well as entity alignments between pairs of KGs. The three knowledge graphs augmented with numerical features and images are called FB15k, YAGO15k, and DBPEDIA15k.
42 PAPERS • 5 BENCHMARKS
Chest ImaGenome is a dataset with a scene graph data structure describing 242,072 images. Through a radiologist-constructed CXR ontology, the annotations for each CXR are connected as an anatomy-centered scene graph, useful for image-level reasoning and multimodal fusion applications. Overall, the following are provided: i) 1,256 combinations of relation annotations between 29 CXR anatomical locations (objects with bounding box coordinates) and their attributes, structured as a scene graph; ii) over 670,000 localized comparison relations (improved, worsened, or no change) between the anatomical locations across sequential exams; and iii) a manually annotated gold-standard scene graph.
13 PAPERS • NO BENCHMARKS YET
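To make the anatomy-centered structure concrete, here is a minimal sketch of one such scene graph entry; the field names and bounding-box convention are illustrative assumptions, not the dataset's actual JSON schema:

```python
# Hypothetical entry: anatomical locations as objects with bounding boxes,
# attributes attached to them, and comparison relations to a prior exam.
scene_graph = {
    "image_id": "example_cxr",
    "objects": [
        {"name": "left lung", "bbox": [10, 20, 120, 200]},   # x, y, w, h (assumed)
        {"name": "right lung", "bbox": [140, 20, 120, 200]},
    ],
    "attributes": [
        {"object": "left lung", "attribute": "opacity"},
    ],
    "comparisons": [
        # Localized change relation across sequential exams.
        {"object": "left lung", "relation": "improved", "prior_image": "prior_cxr"},
    ],
}

def attributes_of(graph, obj_name):
    """Collect attribute labels attached to one anatomical location."""
    return [a["attribute"] for a in graph["attributes"] if a["object"] == obj_name]

print(attributes_of(scene_graph, "left lung"))  # ['opacity']
```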
Multi-Modal Hate Speech Detection with Graph Context. 18k+ labels, 8k+ discussions, 900k+ comments.
An update to 3DIdent that introduces six additional object classes (Hare, Dragon, Cow, Armadillo, Horse, and Head) and imposes a causal graph over the latent variables.
12 PAPERS • 1 BENCHMARK
…The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts.
38 PAPERS • 1 BENCHMARK
…for coarse prediction are provided, i.e. photographic vs. non-photographic, and smaller fine-grained prediction tasks where the non-photographic class is broken down into five classes: maps, drawings, graphs
…Along with the road network graph, it includes trip records represented as sequences of visited nodes (making the dataset suitable both for path-blind and path-aware settings).
2 PAPERS • 1 BENCHMARK
…We introduce the new task of multimodal analogical reasoning over knowledge graphs, which requires multimodal reasoning ability with the help of background knowledge.
1 PAPER • 1 BENCHMARK
…It consists of a large corpus of high-resolution satellite imagery and ground truth road network graphs covering the urban core of forty cities across six countries.
37 PAPERS • 2 BENCHMARKS
…The floorplans are annotated with room outline polygons, doors/windows as line segments, object-icons as axis-aligned bounding boxes, room-door-room connectivity graphs, and photo-room assignments. Room-door-room connectivity graphs were generated for each floorplan, and all windows, doors, and other wall openings were annotated and associated with their corresponding rooms.
…For each image-question pair in the CLEVR dataset, CLEVR-X contains multiple structured textual explanations which are derived from the original scene graphs.
4 PAPERS • 1 BENCHMARK
…Each training and validation image is also associated with scene graph annotations describing the classes and attributes of those objects in the scene, and their pairwise relations.
433 PAPERS • 5 BENCHMARKS
The Toulouse Road Network dataset describes patches of road maps from the city of Toulouse, represented both as spatial graphs G = (A, X) and as grayscale segmentation images.
…Specifically, the authors construct a dialog grammar that is grounded in the scene graphs of the images from the CLEVR dataset.
10 PAPERS • NO BENCHMARKS YET
…The CLEVR dataset consists of: a training set of 70k images and 700k questions, a validation set of 15k images and 150k questions, and a test set of 15k images and 150k questions about objects, along with answers and scene graph annotations for the training and validation sets.
577 PAPERS • 2 BENCHMARKS
…Filtering images. The first step filters for images that have meaningful scene graphs and captions. We discard all scene graphs that do not contain any edges; the remaining images pass this filter. Relationships should be verbs and should not contain nouns or pronouns, so we also discard any scene graph containing an edge whose tag is neither a verb nor in an ad-hoc list of allowed non-verb keywords.
1 PAPER • 2 BENCHMARKS
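The two-stage filter described above can be sketched as follows; the field names, allow-list contents, and POS check are assumptions standing in for the paper's actual pipeline (which would use a real part-of-speech tagger):

```python
# Ad-hoc allow-list of non-verb relationship keywords (illustrative only).
ALLOWED_NON_VERBS = {"on", "of", "with"}

def is_verb(tag):
    # Placeholder POS check; a real pipeline would consult a tagger (e.g. spaCy).
    return tag == "VERB"

def keep(scene_graph):
    """Keep a scene graph only if it has edges and every edge label is a verb
    or an allowed non-verb keyword."""
    edges = scene_graph.get("edges", [])
    if not edges:
        return False
    return all(
        is_verb(e["pos"]) or e["label"] in ALLOWED_NON_VERBS
        for e in edges
    )

graphs = [
    {"edges": []},                                    # no edges -> dropped
    {"edges": [{"label": "riding", "pos": "VERB"}]},  # verb edge -> kept
    {"edges": [{"label": "dog", "pos": "NOUN"}]},     # noun edge -> dropped
]
print([keep(g) for g in graphs])  # [False, True, False]
```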
MuMiN is a misinformation graph dataset containing rich social media data (tweets, replies, users, images, articles, hashtags), spanning 21 million tweets belonging to 26 thousand Twitter threads.
4 PAPERS • 3 BENCHMARKS
…We create the puzzles to encompass a diverse array of mathematical and algorithmic topics such as boolean logic, combinatorics, graph theory, optimization, search, etc., aiming to evaluate the gap between
…The datasets are generated by repurposing the Visual Genome scene graphs and region descriptions and applying handcrafted templates and GPT-3.