no code implementations • 12 Apr 2024 • Linhuang Wang, Xin Kang, Fei Ding, Satoshi Nakagawa, Fuji Ren
Our approach takes spatial features of different scales extracted by a CNN and feeds them into a Multi-scale Embedding Layer (MELayer).
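The excerpt does not say how the MELayer works internally, so the following is only a rough sketch of one plausible reading: each CNN feature map is flattened into per-location tokens, projected to a shared width, and the tokens from all scales are concatenated. The function name `multi_scale_embed`, the feature-map shapes, the embedding width of 96, and the random projection matrices are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def multi_scale_embed(feature_maps, projections):
    """Hypothetical stand-in for an MELayer (not the authors' implementation).

    feature_maps: list of arrays shaped (channels, height, width), one per scale.
    projections:  list of arrays shaped (channels, embed_dim), one per scale.
    Returns an array shaped (total_tokens, embed_dim).
    """
    tokens = []
    for fmap, proj in zip(feature_maps, projections):
        c, h, w = fmap.shape
        # One token per spatial location: (h * w, c)
        flat = fmap.reshape(c, h * w).T
        # Linear projection to the shared embedding width: (h * w, embed_dim)
        tokens.append(flat @ proj)
    # Stack the tokens from every scale into a single sequence
    return np.concatenate(tokens, axis=0)


# Assumed shapes for three CNN stages; random values stand in for real features.
rng = np.random.default_rng(0)
shapes = [(64, 28, 28), (128, 14, 14), (256, 7, 7)]
embed_dim = 96
feature_maps = [rng.standard_normal(s) for s in shapes]
projections = [rng.standard_normal((s[0], embed_dim)) * 0.02 for s in shapes]

tokens = multi_scale_embed(feature_maps, projections)
print(tokens.shape)  # (1029, 96): 28*28 + 14*14 + 7*7 tokens
```

In this sketch the coarser scales contribute fewer tokens than the finer ones; whether the actual MELayer concatenates, sums, or otherwise fuses the scales is not stated in the excerpt.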
Action Recognition Attribute +3