Search Results for author: Elahe Vahdani

Found 6 papers, 0 papers with code

Deep Learning-based Action Detection in Untrimmed Videos: A Survey

no code implementations · 30 Sep 2021 · Elahe Vahdani, YingLi Tian

The task of temporal activity detection in untrimmed videos aims to localize the temporal boundary of actions and classify the action categories.

Action Detection · Action Recognition · +1
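The survey above concerns localizing the temporal boundaries of actions. The standard measure for comparing a predicted segment against a ground-truth segment is temporal Intersection-over-Union; a minimal sketch (function name and values are illustrative, not from the paper):

```python
def temporal_iou(a, b):
    """IoU of two temporal segments given as (start, end) pairs."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

# A predicted action segment vs. a ground-truth segment:
print(temporal_iou((2.0, 6.0), (4.0, 8.0)))  # overlap 2 / union 6 ≈ 0.333
```

A detection is typically counted as correct when its tIoU with a ground-truth segment exceeds a threshold such as 0.5.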

Cross-Modal Center Loss for 3D Cross-Modal Retrieval

no code implementations · CVPR 2021 · Longlong Jing, Elahe Vahdani, Jiaxing Tan, YingLi Tian

Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities.

Cross-Modal Retrieval · Retrieval

Cross-modal Center Loss

no code implementations · 8 Aug 2020 · Longlong Jing, Elahe Vahdani, Jiaxing Tan, YingLi Tian

Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities.

Cross-Modal Retrieval · Retrieval
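The two cross-modal center loss papers above aim at modal-invariant features. One common way to realize this (a hedged sketch of the general center-loss idea, not the authors' exact formulation) is to keep one learnable center per class and penalize the squared distance from features of *every* modality to the shared center of their class:

```python
def cross_modal_center_loss(features, labels, centers):
    """Mean squared distance from each feature vector (any modality) to the
    shared center of its class. Pulling all modalities of a class toward the
    same center encourages modal-invariant embeddings."""
    total = 0.0
    for feat, y in zip(features, labels):
        total += sum((f - c) ** 2 for f, c in zip(feat, centers[y]))
    return total / len(features)

# Toy example: two features of class 0 from different modalities
# (e.g. an image embedding and a point-cloud embedding) share one center.
centers = {0: [1.0, 0.0], 1: [0.0, 1.0]}
feats = [[0.9, 0.1], [1.1, -0.1]]
loss = cross_modal_center_loss(feats, [0, 0], centers)
```

In practice the centers would be trainable parameters updated jointly with the encoders; the 2-D vectors here are only for illustration.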

An Isolated-Signing RGBD Dataset of 100 American Sign Language Signs Produced by Fluent ASL Signers

no code implementations · LREC 2020 · Saad Hassan, Larwan Berke, Elahe Vahdani, Longlong Jing, YingLi Tian, Matt Huenerfauth

We have collected a new dataset of color and depth videos, recorded with a Kinect v2 sensor, of fluent American Sign Language (ASL) signers performing sequences of 100 ASL signs.

Recognizing American Sign Language Nonmanual Signal Grammar Errors in Continuous Videos

no code implementations · 1 May 2020 · Elahe Vahdani, Longlong Jing, YingLi Tian, Matt Huenerfauth

Our system recognizes grammatical elements in the ASL-HW-RGBD dataset from manual gestures, facial expressions, and head movements, and successfully detects 8 ASL grammatical mistakes.

Recognizing American Sign Language Manual Signs from RGB-D Videos

no code implementations · 7 Jun 2019 · Longlong Jing, Elahe Vahdani, Matt Huenerfauth, YingLi Tian

In this paper, we propose a 3D Convolutional Neural Network (3DCNN) based multi-stream framework to recognize American Sign Language (ASL) manual signs in real time from RGB-D videos. The framework fuses multimodal features, including hand gestures, facial expressions, and body poses, from multiple channels (RGB, depth, motion, and skeleton joints); manual signs consist of hand movements and, in some cases, non-manual face movements.
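The abstract describes fusing predictions from several input channels. A common, simple form of such fusion is late fusion, where each stream produces per-class scores that are averaged (optionally with weights) into one prediction. This is a hedged illustration of the general technique, not the authors' implementation:

```python
def late_fusion(stream_scores, weights=None):
    """Combine per-class scores from several streams
    (e.g. RGB, depth, motion, skeleton) by weighted averaging."""
    n = len(stream_scores)
    weights = weights or [1.0 / n] * n
    num_classes = len(stream_scores[0])
    fused = [0.0] * num_classes
    for w, scores in zip(weights, stream_scores):
        for k, s in enumerate(scores):
            fused[k] += w * s
    return fused

# Toy scores for 3 sign classes from two streams:
rgb   = [0.7, 0.2, 0.1]
depth = [0.5, 0.4, 0.1]
fused = late_fusion([rgb, depth])
predicted_class = fused.index(max(fused))  # both streams favor class 0
```

In a full system each stream would be a 3DCNN over its channel; the lists here stand in for the softmax outputs of those networks.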
