ERA: A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos

30 Jan 2020  ·  Lichao Mou, Yuansheng Hua, Pu Jin, Xiao Xiang Zhu ·

Along with the increasing use of unmanned aerial vehicles (UAVs), large volumes of aerial videos have been produced. It is unrealistic for humans to screen such big data and understand their contents. Hence methodological research on the automatic understanding of UAV videos is of paramount importance. In this paper, we introduce a novel problem of event recognition in unconstrained aerial videos in the remote sensing community and present a large-scale, human-annotated dataset, named ERA (Event Recognition in Aerial videos), consisting of 2,864 videos each with a label from 25 different classes corresponding to an event unfolding 5 seconds. The ERA dataset is designed to have a significant intra-class variation and inter-class similarity and captures dynamic events in various circumstances and at dramatically various scales. Moreover, to offer a benchmark for this task, we extensively validate existing deep networks. We expect that the ERA dataset will facilitate further progress in automatic aerial video comprehension. The website is

PDF Abstract
No code implementations yet. Submit your code now



Introduced in the Paper:


Used in the Paper:

Okutama-Action AIDER UCLA Aerial Event Dataset

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here