1 code implementation • ECCV 2020 • Zhe Niu, Brian Mak
In this paper, we propose novel stochastic modeling of various components of a continuous sign language recognition (CSLR) system that is based on the transformer encoder and connectionist temporal classification (CTC).
1 code implementation • CVPR 2023 • Ronglai Zuo, Fangyun Wei, Brian Mak
Sign languages are visual languages that convey information through signers' handshapes, facial expressions, body movements, and so forth.
Ranked #1 on Sign Language Recognition on WLASL-2000
no code implementations • ICCV 2023 • Zhe Niu, Brian Mak
Most lip-to-speech (LTS) synthesis models are trained and evaluated under the assumption that the audio-video pairs in the dataset are perfectly synchronized.
no code implementations • 26 Dec 2022 • Ronglai Zuo, Brian Mak
We call the CSLR model trained with the above auxiliary tasks consistency-enhanced CSLR; it performs well on signer-dependent datasets, in which all signers appear during both training and testing.
Ranked #7 on Sign Language Recognition on CSL-Daily
1 code implementation • 2 Nov 2022 • Yutong Chen, Ronglai Zuo, Fangyun Wei, Yu Wu, Shujie Liu, Brian Mak
RGB videos, however, are raw signals with substantial visual redundancy, leading the encoder to overlook the key information for sign language understanding.
no code implementations • CVPR 2022 • Ronglai Zuo, Brian Mak
The backbone of most deep-learning-based continuous sign language recognition (CSLR) models consists of a visual module, a sequential module, and an alignment module.
no code implementations • 19 Aug 2020 • Wei Li, Brian Mak
One of the current state-of-the-art multilingual document embedding models, LASER, is based on the bidirectional LSTM neural machine translation model.
no code implementations • 26 Nov 2019 • Zhaoyu Liu, Brian Mak
Speaker similarity is good for native speech from native speakers.
no code implementations • 29 Jul 2018 • Wei Li, Brian Mak
This paper further adds a distance constraint to the training objective function of NV so that the two embeddings of a parallel document are required to be as close as possible.
Cross-Lingual Document Classification
Document Classification
no code implementations • EACL 2017 • Wei Li, Brian Mak
In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector.