A novel attention-aware method is proposed to fuse two image modalities, RGB and depth, for enhanced RGB-D facial recognition. The attention mechanism directs the deep network "where to look" for visual features in the RGB image by focusing its attention using depth features extracted by a Convolutional Neural Network (CNN). Moreover, face recognition experiments demonstrate that our hallucinated depth, used alongside the input RGB images, boosts performance across various architectures compared to a single RGB modality, by average margins of +1.2%, +2.6%, and +2.6% on the IIIT-D, EURECOM, and LFW datasets respectively.
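The depth-guided attention described above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual architecture: it assumes the depth features are projected to a single-channel attention map by a 1x1 convolution (here, a plain channel-wise weight vector `w` and bias `b`), squashed with a sigmoid, and used to modulate the RGB feature maps elementwise.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depth_guided_attention(rgb_feat, depth_feat, w, b):
    """Modulate RGB features with an attention map derived from depth features.

    rgb_feat, depth_feat: feature maps of shape (C, H, W)
    w: (1, C) projection weights, b: scalar bias -- a 1x1 convolution
    Returns attended RGB features of shape (C, H, W).

    Hypothetical sketch: the real method's fusion layers may differ.
    """
    # 1x1 convolution over channels -> single-channel attention logits (H, W)
    logits = np.tensordot(w, depth_feat, axes=([1], [0]))[0] + b
    attn = sigmoid(logits)              # values in (0, 1): "where to look"
    return rgb_feat * attn[None, :, :]  # broadcast the map over RGB channels
```

Because the sigmoid keeps the map in (0, 1), the depth branch can only suppress or pass through RGB activations, which matches the intuition of steering attention toward depth-salient regions.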