Scene Classification
122 papers with code • 2 benchmarks • 21 datasets
Scene Classification is the task of assigning a photograph to a scene category. Unlike object classification, which focuses on prominent objects in the foreground, Scene Classification relies on the layout of objects within the scene, together with the ambient context.
Source: Scene classification with Convolutional Neural Networks
Latest papers
AudioLog: LLMs-Powered Long Audio Logging with Hybrid Token-Semantic Contrastive Learning
This paper presents AudioLog, a large language model (LLM)-powered audio logging system with hybrid token-semantic contrastive learning.
CD-COCO: A Versatile Complex Distorted COCO Database for Scene-Context-Aware Computer Vision
These new local distortions are generated by considering the scene context of the images, which guarantees a high level of photo-realism.
Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification
The results show the feasibility of recognizing diverse acoustic scenes based on the audio event-relational graph.
Bringing the Discussion of Minima Sharpness to the Audio Domain: a Filter-Normalised Evaluation for Acoustic Scene Classification
The correlation between the sharpness of loss minima and generalisation in the context of deep neural networks has been subject to discussion for a long time.
DeCUR: decoupling common & unique representations for multimodal self-supervision
We propose Decoupling Common and Unique Representations (DeCUR), a simple yet effective method for multimodal self-supervised learning.
SOAR: Scene-debiasing Open-set Action Recognition
The former prevents the decoder from reconstructing the video background given video features, and thus helps reduce the background information in feature learning.
Efficient Multi-Task Scene Analysis with RGB-D Transformers
However, we show that the dual CNN-based encoder of EMSANet can be replaced with a single Transformer-based encoder.
Multi-level Cross-modal Feature Alignment via Contrastive Learning towards Zero-shot Classification of Remote Sensing Image Scenes
To address zero-shot image scene classification, cross-modal feature alignment methods have been proposed in recent years.
Device-Robust Acoustic Scene Classification via Impulse Response Augmentation
However, we also show that DIR augmentation and Freq-MixStyle are complementary, achieving a new state-of-the-art performance on signals recorded by devices unseen during training.
Vision-Language Models in Remote Sensing: Current Progress and Future Trends
Existing AI-related research in remote sensing primarily focuses on visual understanding tasks while neglecting the semantic understanding of the objects and their relationships.