no code implementations • 31 Aug 2017 • Atousa Torabi, Leonid Sigal
Inspired by recent advances in neural machine translation that jointly align and translate using encoder-decoder networks equipped with attention, we propose an attention-based LSTM model for human activity recognition.
no code implementations • 26 Sep 2016 • Atousa Torabi, Niket Tandon, Leonid Sigal
We evaluate our models on the large-scale LSMDC16 movie dataset for two tasks: 1) Standard Ranking for video annotation and retrieval, and 2) our proposed movie multiple-choice test.
Ranked #36 on Video Retrieval on MSR-VTT
no code implementations • 12 May 2016 • Anna Rohrbach, Atousa Torabi, Marcus Rohrbach, Niket Tandon, Christopher Pal, Hugo Larochelle, Aaron Courville, Bernt Schiele
In addition, we collected and aligned movie scripts used in prior work and compare the two sources of descriptions.
1 code implementation • 3 Mar 2015 • Atousa Torabi, Christopher Pal, Hugo Larochelle, Aaron Courville
DVS is an audio narration describing the visual elements and actions in a movie for the visually impaired.
5 code implementations • ICCV 2015 • Li Yao, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, Christopher Pal, Hugo Larochelle, Aaron Courville
In this context, we propose an approach that successfully takes into account both the local and global temporal structure of videos to produce descriptions.
no code implementations • 14 May 2013 • Wassim Bouachir, Atousa Torabi, Guillaume-Alexandre Bilodeau, Pascal Blais
This paper proposes a semantic segmentation method for outdoor scenes captured by a surveillance camera.