Scene Understanding

511 papers with code • 3 benchmarks • 43 datasets

Scene understanding is the task of interpreting the contents of a scene, such as the objects it contains and how they relate to one another. For instance, the iPhone has a feature that helps visually impaired users take a photo by describing what the camera sees, which is one example of scene understanding in practice.


Latest papers with no code

PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

no code yet • 16 Apr 2024

In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions.

PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network

no code yet • 16 Apr 2024

In this study, we propose PreGSU, a generalized pre-trained scene understanding model based on graph attention network to learn the universal interaction and reasoning of traffic scenes to support various downstream tasks.

Depth Estimation using Weighted-loss and Transfer Learning

no code yet • 11 Apr 2024

The optimized loss function is a weighted combination of three losses chosen to enhance robustness and generalization: Mean Absolute Error (MAE), Edge Loss, and Structural Similarity Index (SSIM).
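
As a rough illustration of how such a weighted combination might be assembled, the NumPy sketch below sums an MAE term, a gradient-based edge term, and an SSIM-based term. The listing does not give the paper's formulas or weights, so everything here is an assumption: the function names, the default weights, the gradient-difference form of the edge loss, and the simplified single-window (global) SSIM are all hypothetical stand-ins rather than the authors' method.

```python
import numpy as np


def mae_loss(pred, target):
    """Mean absolute error between predicted and ground-truth depth maps."""
    return float(np.mean(np.abs(pred - target)))


def edge_loss(pred, target):
    """Mean absolute difference between the spatial gradients of the two maps.

    Assumption: one common way to penalize blurred depth edges; the paper's
    exact edge term is not stated in the listing.
    """
    dy_p, dx_p = np.gradient(pred)    # gradients along rows, then columns
    dy_t, dx_t = np.gradient(target)
    return float(np.mean(np.abs(dx_p - dx_t) + np.abs(dy_p - dy_t)))


def ssim_loss(pred, target, data_range=1.0):
    """(1 - SSIM) / 2 using a single global window.

    Assumption: real SSIM is usually computed over local windows; the global
    form keeps this sketch short.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = np.mean((pred - mu_p) * (target - mu_t))
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    )
    return float((1.0 - ssim) / 2.0)


def combined_depth_loss(pred, target, w_mae=0.6, w_edge=0.2, w_ssim=1.0):
    """Weighted sum of the three terms; the default weights are hypothetical."""
    return (
        w_mae * mae_loss(pred, target)
        + w_edge * edge_loss(pred, target)
        + w_ssim * ssim_loss(pred, target)
    )
```

In a training setup the same structure would be written with a differentiable framework, and the weights would be tuned on a validation set rather than fixed as they are here.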

Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange

no code yet • 11 Apr 2024

Subsequently, we introduce a context-aware feature learning strategy, which encodes object patterns without relying on their specific context by aggregating object features across various scenes.

Gaga: Group Any Gaussians via 3D-aware Memory Bank

no code yet • 11 Apr 2024

We introduce Gaga, a framework that reconstructs and segments open-world 3D scenes by leveraging inconsistent 2D masks predicted by zero-shot segmentation models.

O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation

no code yet • 10 Apr 2024

Online construction of open-ended language scenes is crucial for robotic applications, where open-vocabulary interactive scene understanding is required.

Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles

no code yet • 10 Apr 2024

In this sense, explainability of real-time decisions is a crucial and natural requirement for building trust in autonomous vehicles.

QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding

no code yet • 9 Apr 2024

Understanding the structural organisation of 3D indoor scenes in terms of rooms is often accomplished via floorplan extraction.

DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning

no code yet • 9 Apr 2024

We implement a baseline by applying cylindrical rectification on the fisheye images and using a standard LSS-based BEV segmentation model.

Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation

no code yet • 6 Apr 2024

Experimental results on FineGrip demonstrate the feasibility of the panoptic perception task and the beneficial effect of multi-task joint optimization on individual tasks.