NAVVS (Naturalistic audio-visual volumetric sequences)

Introduced by Stenzel et al. in Naturalistic audio-visual volumetric sequences dataset of sounding actions for six degree-of-freedom interaction

NAVVS is an open-access volumetric dataset of short, naturalistic sounding actions, capturing both their sound and their visual appearance. It is intended for immersive and interactive research in artificial 3D audio-visual environments, such as VR/AR/XR with six degree-of-freedom (6DoF) interaction, and for multimodal research and testing under realistic conditions. The dataset comprises ten different actions designed for semantic and acoustic diversity. Four 2-second takes are available for each action, giving a total of forty audio-visual clips.
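The clip count above follows directly from the action and take numbers. As a minimal sketch, the snippet below enumerates one entry per clip and derives the total footage length. The integer action and take IDs and the function name `build_clip_index` are hypothetical placeholders for illustration; the dataset's actual naming scheme and file layout are not described here.

```python
from itertools import product

# Figures taken from the dataset description.
NUM_ACTIONS = 10       # distinct sounding actions
TAKES_PER_ACTION = 4   # takes recorded per action
TAKE_SECONDS = 2       # nominal length of each take


def build_clip_index():
    """Return one (action_id, take_id) pair per clip.

    The 1-based integer IDs are hypothetical; the real dataset may
    label its actions and takes differently.
    """
    actions = range(1, NUM_ACTIONS + 1)
    takes = range(1, TAKES_PER_ACTION + 1)
    return list(product(actions, takes))


clips = build_clip_index()
total_seconds = len(clips) * TAKE_SECONDS
print(len(clips), total_seconds)  # prints "40 80"
```

An index of this kind could serve as the basis for splitting the clips by action or by take, once the real file layout is known.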

The scenes were captured at the Centre for Vision, Speech & Signal Processing (CVSSP) of the University of Surrey (UK) using multiple cameras and microphones. Along with the volumetric textured instances and the stereo audio mix of the final clips, additional data is provided: the separate audio channels of the individual microphones, raw images from the 16 UHD cameras, binary masks, camera calibration data, coarse visual hull reconstruction, and volumetric stereo refinement.

License


  • Custom (research-only)
