no code implementations • 19 Jul 2021 • Yan Bin Ng, Basura Fernando
A temporal recurrent encoder captures the temporal information in input videos, while a self-attention model attends to the relevant feature dimensions of the input space.
no code implementations • 10 Dec 2019 • Yan Bin Ng, Basura Fernando
We extend our action sequence forecasting model to perform weakly supervised action forecasting on two challenging datasets, Breakfast and 50Salads.
no code implementations • WS 2019 • Aliaks Huminski, R, Yan Bin Ng, Kenneth Kwok, Francis Bond
Natural language communication between machines and humans is still constrained.
no code implementations • 7 Oct 2019 • Yan Bin Ng, Basura Fernando
Furthermore, we use our model, which is trained to output action sequences, to solve downstream tasks such as video captioning and action localization.