Designing neural architectures requires immense manual efforts.
Finally, we propose Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on four video DA datasets (e. g. 7. 9% accuracy gain over "Source only" from 73. 9% to 81. 8% on "HMDB --> UCF", and 10. 3% gain on "Kinetics --> Gameplay").
Ranked #1 on Domain Adaptation on UCF --> HMDB (full)
However, little work has been done for game image captioning which has some unique characteristics and requirements.