Since we want to generalize to data captured with a real head-mounted camera, we also extend the EgoPW training dataset. To this end, we first reconstruct the scene geometry from the egocentric image sequences of the EgoPW training dataset with a Structure-from-Motion (SfM) algorithm, which provides a dense reconstruction of the background scene. The global scale of the reconstruction is recovered from objects of known size present in the sequences, such as laptops and chairs. We then render depth maps of the scene in the egocentric perspective based on the reconstructed geometry. Our EgoPW-Scene dataset contains 92K frames in total, distributed over 30 sequences performed by 5 actors.
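The depth-rendering step can be sketched as a simple z-buffer projection of the reconstructed point cloud into the egocentric camera. The following is a minimal NumPy sketch, not the paper's implementation; the intrinsics, resolution, and the assumption that points are already expressed in the (scale-corrected) camera frame are illustrative.

```python
import numpy as np

def render_depth_map(points_cam, K, height, width):
    """Render a depth map from reconstructed scene points via z-buffering.

    points_cam: (N, 3) points in the egocentric camera frame, already at
                metric scale (e.g. after rescaling with a known object size).
    K:          (3, 3) pinhole camera intrinsics.
    Returns an (H, W) depth map; 0 marks pixels with no projected point.
    """
    depth = np.zeros((height, width), dtype=np.float64)
    # Keep only points in front of the camera.
    pts = points_cam[points_cam[:, 2] > 0]
    # Perspective projection: u = fx*x/z + cx, v = fy*y/z + cy.
    proj = (K @ pts.T).T
    u = np.round(proj[:, 0] / proj[:, 2]).astype(int)
    v = np.round(proj[:, 1] / proj[:, 2]).astype(int)
    z = pts[:, 2]
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    for ui, vi, zi in zip(u[inside], v[inside], z[inside]):
        # z-buffer test: keep the nearest surface at each pixel.
        if depth[vi, ui] == 0 or zi < depth[vi, ui]:
            depth[vi, ui] = zi
    return depth
```

In practice a mesh rasterizer would be used instead of a point splat, since a dense surface avoids holes between projected points, but the projection and depth-test logic are the same.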