Scene Understanding
511 papers with code • 3 benchmarks • 43 datasets
Scene Understanding is the task of interpreting the contents of a scene from visual input. For instance, the iPhone has a feature that helps visually impaired users take a photo by describing what the camera sees. This is an example of Scene Understanding.
Benchmarks
These leaderboards are used to track progress in Scene Understanding.
Libraries
Use these libraries to find Scene Understanding models and implementations.
Datasets
Subtasks
Latest papers with no code
PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction
In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions.
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network
In this study, we propose PreGSU, a generalized pre-trained scene understanding model based on graph attention network to learn the universal interaction and reasoning of traffic scenes to support various downstream tasks.
Depth Estimation using Weighted-loss and Transfer Learning
The optimized loss function is a combination of weighted losses that enhance robustness and generalization: Mean Absolute Error (MAE), Edge Loss and Structural Similarity Index (SSIM).
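A weighted combination of MAE, edge, and SSIM terms can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function names, the equal default weights, the finite-difference edge term, and the single-window (global) SSIM are all assumptions made here for brevity.

```python
import numpy as np


def mae_loss(pred, target):
    """Mean absolute error between predicted and ground-truth depth maps."""
    return float(np.mean(np.abs(pred - target)))


def edge_loss(pred, target):
    """Mean absolute difference of image gradients (assumed edge term)."""
    # np.gradient returns one array per axis: rows (y) first, then columns (x).
    dy_p, dx_p = np.gradient(pred)
    dy_t, dx_t = np.gradient(target)
    return float(np.mean(np.abs(dx_p - dx_t) + np.abs(dy_p - dy_t)))


def ssim_loss(pred, target, data_range=1.0):
    """(1 - SSIM) / 2, with SSIM computed over the whole image as one window.

    Practical implementations usually average SSIM over local windows;
    the global form is used here only to keep the sketch short.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = ((pred - mu_p) * (target - mu_t)).mean()
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    )
    return float((1.0 - ssim) / 2.0)


def combined_depth_loss(pred, target, w_mae=1.0, w_edge=1.0, w_ssim=1.0):
    """Weighted sum of the three terms; the weights are hypothetical defaults."""
    return (
        w_mae * mae_loss(pred, target)
        + w_edge * edge_loss(pred, target)
        + w_ssim * ssim_loss(pred, target)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((8, 8))                      # synthetic depth map in [0, 1)
    pred = target + 0.05 * rng.standard_normal((8, 8))
    print(combined_depth_loss(pred, target))
```

In a training setup the same three terms would be written with a differentiable framework, and the weights would be tuned on validation data rather than fixed at 1.0.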
Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange
Subsequently, we introduce a context-aware feature learning strategy, which encodes object patterns without relying on their specific context by aggregating object features across various scenes.
Gaga: Group Any Gaussians via 3D-aware Memory Bank
We introduce Gaga, a framework that reconstructs and segments open-world 3D scenes by leveraging inconsistent 2D masks predicted by zero-shot segmentation models.
O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation
Online construction of open-ended language scenes is crucial for robotic applications, where open-vocabulary interactive scene understanding is required.
Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles
In this sense, explainability of real-time decisions is a crucial and natural requirement for building trust in autonomous vehicles.
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding
Understanding the structural organisation of 3D indoor scenes in terms of rooms is often accomplished via floorplan extraction.
DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning
We implement a baseline by applying cylindrical rectification on the fisheye images and using a standard LSS-based BEV segmentation model.
Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation
Experimental results on FineGrip demonstrate the feasibility of the panoptic perception task and the beneficial effect of multi-task joint optimization on individual tasks.