We present a simple deep learning framework to simultaneously predict
keypoint locations and their respective visibilities and use those to achieve
state-of-the-art performance for fine-grained classification. We show that by
conditioning the predictions on object proposals with sufficient image support,
our method can do well without complicated spatial reasoning...
inference methods with robustness to outliers yield state-of-the-art results for
keypoint localization. We demonstrate the effectiveness of our accurate
keypoint localization and visibility prediction on the fine-grained bird
recognition task with and without ground truth bird bounding boxes, and
outperform existing state-of-the-art methods by over 2%.