Replay is a collection of multi-view, multi-modal videos of humans interacting socially. Each scene is filmed at high production quality from different viewpoints, using several static cameras as well as wearable action cameras, and is recorded with a large array of microphones placed at different positions in the room. The full Replay dataset consists of 68 scenes of social interactions between people, such as playing board games, exercising, or unwrapping presents. Each scene is about 5 minutes long and is filmed with 12 cameras, both static and dynamic. Audio is captured separately by 12 binaural microphones, plus additional near-range microphones for each actor and for each egocentric video. All sensors are temporally synchronized, undistorted, geometrically calibrated, and color calibrated.