no code implementations • 1 Apr 2024 • Jia Gong, Lin Geng Foo, Yixuan He, Hossein Rahmani, Jun Liu
Sign Language Translation (SLT) is a challenging task that aims to translate sign videos into spoken language.
no code implementations • 1 Apr 2024 • Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Jun Liu
Action detection aims to localize the starting and ending points of action instances in untrimmed videos, and predict the classes of those instances.
1 code implementation • 27 Feb 2024 • Xinyu Yang, Hossein Rahmani, Sue Black, Bryan M. Williams
Class activation maps (CAMs) are commonly employed in weakly supervised semantic segmentation (WSSS) to produce pseudo-labels.
no code implementations • 14 Feb 2024 • Shiqi Yang, Hanlin Qin, Shuai Yuan, Xiang Yan, Hossein Rahmani
However, when applied to the infrared destriping task, it becomes challenging for the vanilla auxiliary generator to consistently produce vertical noise under unsupervised constraints.
1 code implementation • 3 Jan 2024 • Haopeng Li, Andong Deng, Qiuhong Ke, Jun Liu, Hossein Rahmani, Yulan Guo, Bernt Schiele, Chen Chen
Reasoning over sports videos for question answering is an important task with numerous applications, such as player training and information retrieval.
no code implementations • 21 Dec 2023 • Zheheng Jiang, Hossein Rahmani, Sue Black, Bryan M. Williams
This is followed by a self-adaptive deformation that deforms the hand from the canonical space to the target pose, adapting to dynamic changes in the canonical points; in contrast to the common practice of subdividing the MANO model, this offers greater flexibility and results in improved geometry fitting.
no code implementations • 27 Aug 2023 • Lin Geng Foo, Hossein Rahmani, Jun Liu
Due to its wide range of applications and the demonstrated potential of recent works, AIGC has attracted considerable attention, and AIGC methods have been developed for various data modalities, such as image, video, text, 3D shape (as voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human avatar (body and head), 3D motion, and audio -- each presenting different characteristics and challenges.
no code implementations • ICCV 2023 • Lin Geng Foo, Jia Gong, Hossein Rahmani, Jun Liu
Inspired by their capability, we explore a diffusion-based approach for human mesh recovery, and propose a Human Mesh Diffusion (HMDiff) framework which frames mesh recovery as a reverse diffusion process.
1 code implementation • CVPR 2023 • Zheheng Jiang, Hossein Rahmani, Sue Black, Bryan M. Williams
The experimental results demonstrate our probabilistic model's state-of-the-art accuracy in 3D hand and texture reconstruction from a single image in both training schemes, including in the presence of severe occlusions.
Ranked #2 on 3D Hand Pose Estimation on HO-3D
no code implementations • CVPR 2023 • Tianjiao Li, Lin Geng Foo, Ping Hu, Xindi Shang, Hossein Rahmani, Zehuan Yuan, Jun Liu
Pre-training VTs on such corrupted data can be challenging, especially when we pre-train via the masked autoencoding approach, where both the inputs and masked "ground truth" targets can potentially be unreliable in this case.
no code implementations • 1 Apr 2023 • Jianhong Pan, Siyuan Yang, Lin Geng Foo, Qiuhong Ke, Hossein Rahmani, Zhipeng Fan, Jun Liu
Salience-based channel pruning continues to make breakthroughs in network compression.
no code implementations • 1 Apr 2023 • Jianhong Pan, Lin Geng Foo, Qichen Zheng, Zhipeng Fan, Hossein Rahmani, Qiuhong Ke, Jun Liu
Dynamic neural networks can greatly reduce computation redundancy without compromising accuracy by adapting their structures based on the input.
no code implementations • CVPR 2023 • Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Qiuhong Ke, Jun Liu
We propose a Unified Pose Sequence Modeling approach to unify heterogeneous human behavior understanding tasks based on pose data, e.g., action recognition, 3D pose estimation and 3D early action prediction.
2 code implementations • CVPR 2023 • Jia Gong, Lin Geng Foo, Zhipeng Fan, Qiuhong Ke, Hossein Rahmani, Jun Liu
Monocular 3D human pose estimation is quite challenging due to the inherent ambiguity and occlusion, which often lead to high uncertainty and indeterminacy.
Ranked #11 on 3D Human Pose Estimation on MPI-INF-3DHP
no code implementations • 3 Sep 2022 • Tianjiao Li, Lin Geng Foo, Qiuhong Ke, Hossein Rahmani, Anran Wang, Jinghua Wang, Jun Liu
We design a novel Dynamic Spatio-Temporal Specialization (DSTS) module, which consists of specialized neurons that are only activated for a subset of samples that are highly similar.
no code implementations • 25 Jul 2022 • Yunsheng Pang, Qiuhong Ke, Hossein Rahmani, James Bailey, Jun Liu
Human interaction recognition is very important in many applications.
Ranked #2 on Human Interaction Recognition on SBU
no code implementations • 20 Jul 2022 • Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Qiuhong Ke, Jun Liu
Early action prediction aims to successfully predict the class label of an action before it is completely performed.
1 code implementation • CVPR 2022 • Zheheng Jiang, Hossein Rahmani, Plamen Angelov, Sue Black, Bryan M. Williams
Deep learning for graph matching has received growing interest and developed rapidly in the past decade.
Ranked #4 on Graph Matching on PASCAL VOC (matching accuracy metric)
no code implementations • CVPR 2022 • Jia Gong, Zhipeng Fan, Qiuhong Ke, Hossein Rahmani, Jun Liu
The existing pose estimation approaches often require a large number of annotated images to attain good estimation performance, which are laborious to acquire.
no code implementations • 23 Sep 2021 • Haoxuan Qu, Hossein Rahmani, Li Xu, Bryan Williams, Jun Liu
In contrast to batch learning where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously with data available in sequential order.
no code implementations • 18 Aug 2021 • Haoran Peng, He Huang, Li Xu, Tianjiao Li, Jun Liu, Hossein Rahmani, Qiuhong Ke, Zhicheng Guo, Cong Wu, Rongchang Li, Mang Ye, Jiahao Wang, Jiaxu Zhang, Yuanzhong Liu, Tao He, Fuwei Zhang, Xianbin Liu, Tao Lin
In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop in conjunction with ICCV 2021.
1 code implementation • 4 Aug 2021 • Nathanael L. Baisa, Bryan Williams, Hossein Rahmani, Plamen Angelov, Sue Black
In this paper, we propose a novel hand-based person recognition method for the purpose of criminal investigations since the hand image is often the only available information in cases of serious crime such as sexual abuse.
no code implementations • 13 Jan 2021 • Nathanael L. Baisa, Bryan Williams, Hossein Rahmani, Plamen Angelov, Sue Black
Our proposed method, Global and Part-Aware Network (GPA-Net), creates global and local branches on the conv-layer for learning robust discriminative global and part-level features.
no code implementations • ICCV 2021 • Tianjiao Li, Qiuhong Ke, Hossein Rahmani, Rui En Ho, Henghui Ding, Jun Liu
This makes online continual action recognition a challenging task.
no code implementations • 28 Dec 2020 • Maryam Dialameh, Ali Hamzeh, Hossein Rahmani, Amir Reza Radmard, Safoura Dialameh
Secondly, we propose a deep learning model for screening COVID-19 using our proposed CT dataset and report the baseline results.
no code implementations • 22 Dec 2020 • Zehua Sun, Qiuhong Ke, Hossein Rahmani, Mohammed Bennamoun, Gang Wang, Jun Liu
Human Action Recognition (HAR) aims to understand human behavior and assign a label to each action.
1 code implementation • 31 Oct 2020 • Maryam Dialameh, Ali Hamzeh, Hossein Rahmani
This linear constraint, which is further adjusted by a regularization factor, prevents the network from the risk of overfitting.
no code implementations • IEEE Transactions on Image Processing 2019 • Qiuhong Ke, Mohammed Bennamoun, Hossein Rahmani, Senjian An, Ferdous Sohel, Farid Boussaid
Human actions represented with 3D skeleton sequences are robust to cluttered backgrounds and illumination changes.
Ranked #4 on Skeleton Based Action Recognition on SYSU 3D
no code implementations • ICCV 2017 • Hossein Rahmani, Mohammed Bennamoun
Depth sensors open up possibilities of dealing with the human action recognition problem by providing 3D human skeleton data and depth images of the scene.
no code implementations • CVPR 2016 • Hossein Rahmani, Ajmal Mian
We propose a human pose representation model that transfers human poses acquired from different unknown views to a view-invariant high-level space.
no code implementations • 2 Feb 2016 • Hossein Rahmani, Ajmal Mian, Mubarak Shah
The strength of our technique is that we learn a single R-NKTM for all actions and all viewpoints for knowledge transfer of any real human action video without the need for re-training or fine-tuning the model.
no code implementations • CVPR 2015 • Hossein Rahmani, Ajmal Mian
We propose unsupervised learning of a non-linear model that transfers knowledge from multiple views to a canonical view.
no code implementations • 24 Sep 2014 • Hossein Rahmani, Arif Mahmood, Du Huynh, Ajmal Mian
We propose the Histogram of Oriented Principal Components (HOPC) descriptor that is robust to noise, viewpoint, scale and action speed variations.
no code implementations • 17 Aug 2014 • Hossein Rahmani, Arif Mahmood, Du Q. Huynh, Ajmal Mian
In contrast, we directly process the point clouds and propose a new technique for action recognition which is more robust to noise, action speed and viewpoint variations.
no code implementations • 17 Aug 2014 • Hossein Rahmani, Arif Mahmood, Du Huynh, Ajmal Mian
We use the Histogram of Oriented Gradient (HOG3D) feature to encode the information in each cell.
no code implementations • 24 Feb 2014 • Seyed Mostafa Kia, Hossein Rahmani, Reza Mortezaei, Mohsen Ebrahimi Moghaddam, Amer Namazi
To test the proposed method, the performance of the system was evaluated on 18,354 images downloaded from the internet.