Finally, we propose Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on four video DA datasets (e. g. 7. 9% accuracy gain over "Source only" from 73. 9% to 81. 8% on "HMDB --> UCF", and 10. 3% gain on "Kinetics --> Gameplay").
Ranked #1 on Domain Adaptation on UCF --> HMDB (full)
However, little work has been done for game image captioning which has some unique characteristics and requirements.