no code implementations • 29 Feb 2024 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan
Changes in facial expression, head movement, body movement, and gesture are important cues in sign language recognition. However, most current continuous sign language recognition (CSLR) methods focus on static images in video sequences at the frame-level feature extraction stage, ignoring the dynamic changes between frames.
no code implementations • 13 Mar 2023 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan
It is then combined with both cross-resolution knowledge distillation and traditional knowledge distillation to form a CSLR model based on cross-resolution knowledge distillation (CRKD).
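The abstract does not spell out the CRKD objective, but the "traditional knowledge distillation" component it builds on is typically a temperature-softened KL divergence between teacher and student predictions (here, plausibly a high-resolution teacher and low-resolution student). A minimal NumPy sketch of that standard loss, not the paper's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Standard knowledge-distillation loss (Hinton et al.): KL divergence
    between temperature-softened teacher and student distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits / T)  # soft teacher targets
    q = softmax(student_logits / T)  # soft student predictions
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)
```

In a cross-resolution setting this term would be computed between the teacher network fed full-resolution frames and the student fed downsampled frames; that pairing is an assumption here, not a detail given in the excerpt.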
no code implementations • 7 Nov 2022 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan
The ultimate goal of continuous sign language recognition (CSLR) is to facilitate communication between deaf and hearing people, which requires the model to offer a certain degree of real-time performance and deployability.
no code implementations • 3 Jul 2022 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan
The sparse frame-level features are fused with the features obtained from the two designed branches to reconstruct a dense frame-level feature sequence, and connectionist temporal classification (CTC) loss is used for training and optimization after the temporal feature extraction part.
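CTC is the standard CSLR training loss: it sums the probability of every frame-level alignment (with blanks and repeats collapsed) that yields the target gloss sequence. A small reference implementation of the forward algorithm in log space, a generic sketch rather than the paper's training code:

```python
import numpy as np

def ctc_loss(log_probs, labels, blank=0):
    """Negative log-likelihood of `labels` under the CTC alignment model.
    log_probs: (T, C) per-frame log-probabilities; labels: target class indices."""
    T, C = log_probs.shape
    # Extended label sequence with blanks interleaved: [b, l1, b, l2, b, ...]
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S = len(ext)
    neg_inf = -np.inf

    def logsumexp(vals):
        vals = [v for v in vals if v > neg_inf]
        if not vals:
            return neg_inf
        m = max(vals)
        return m + np.log(sum(np.exp(v - m) for v in vals))

    # Forward variables alpha[s] = log P(prefix ending at ext[s] at time t).
    alpha = np.full(S, neg_inf)
    alpha[0] = log_probs[0, ext[0]]
    if S > 1:
        alpha[1] = log_probs[0, ext[1]]
    for t in range(1, T):
        new = np.full(S, neg_inf)
        for s in range(S):
            cands = [alpha[s]]          # stay on same symbol
            if s > 0:
                cands.append(alpha[s - 1])  # advance one step
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])  # skip a blank between distinct labels
            new[s] = logsumexp(cands) + log_probs[t, ext[s]]
        alpha = new
    final = [alpha[S - 1], alpha[S - 2]] if S > 1 else [alpha[S - 1]]
    return -logsumexp(final)
```

In practice one would use a framework's built-in CTC loss (e.g. `torch.nn.CTCLoss`) rather than this explicit recursion, but the forward algorithm above is what such implementations compute.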
no code implementations • 8 Apr 2022 • Qidan Zhu, Jing Li, Fei Yuan, Quan Gan
The time-wise feature extraction part performs temporal feature learning in two stages: it first extracts temporal receptive-field features at different scales using the proposed multi-scale temporal block (MST-block) to improve temporal modeling capability, and then further encodes these multi-scale features with a transformer module to obtain more accurate temporal features.
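The excerpt does not define the MST-block's internals, but "temporal receptive fields of different scales" usually means parallel temporal filters with different kernel sizes whose outputs are combined. A hypothetical NumPy sketch of that pattern (function names and kernel sizes are illustrative assumptions):

```python
import numpy as np

def conv1d(x, kernel):
    """'Same'-padded 1-D convolution along the time axis.
    x: (T, C) feature sequence; kernel: (k,) temporal weights shared across channels."""
    k = len(kernel)
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([(xp[t:t + k] * kernel[:, None]).sum(axis=0)
                     for t in range(x.shape[0])])

def multi_scale_temporal_block(x, kernel_sizes=(3, 5, 7)):
    """Hypothetical MST-block sketch: apply temporal filters with different
    receptive fields in parallel and concatenate along the channel axis.
    Each branch here is a simple moving average; a real block would use
    learned convolution weights per branch."""
    outs = [conv1d(x, np.ones(k) / k) for k in kernel_sizes]
    return np.concatenate(outs, axis=1)
```

The concatenated multi-scale output would then be fed to the transformer module for further temporal encoding, per the description above.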