1 code implementation • 11 Jan 2022 • Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Peng Xu, Cheuk Tung Shadow Yiu, Rita Frieske, Holy Lovenia, Genta Indra Winata, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung
With the rise of deep learning and intelligent vehicles, the smart assistant has become an essential in-car component that facilitates driving and provides extra functionality.
1 code implementation • LREC 2022 • Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung
We further conduct experiments with Fairseq S2T Transformer, a state-of-the-art ASR model, on the biggest existing dataset, Common Voice zh-HK, and our proposed MDCC, and the results show the effectiveness of our dataset.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
1 code implementation • LREC 2022 • Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Peng Xu, Xu Yan, Zihan Liu, Rita Frieske, Tiezheng Yu, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung
ASCEND (A Spontaneous Chinese-English Dataset) is a high-quality Mandarin Chinese-English code-switching corpus built on spontaneous multi-turn conversational dialogue sources collected in Hong Kong.
no code implementations • 21 Jul 2021 • Didan Deng, Liang Wu, Bertram E. Shi
Iterative distillation over multiple generations significantly improves performance in both emotion recognition and uncertainty estimation.
no code implementations • 29 Jan 2021 • Zhetuo Zhao, Jochen Triesch, Bertram E. Shi
The high-resolution fovea can drive precise short-range movements.
no code implementations • 27 Jan 2021 • Charles Wilmot, Bertram E. Shi, Jochen Triesch
We present a model of the self-calibration of active binocular vision comprising the simultaneous learning of visual representations, vergence, and pursuit eye movements.
no code implementations • 14 Jan 2021 • Congcong Liu, Yuying Chen, Ming Liu, Bertram E. Shi
We suggest that introducing an attention mechanism to infer the importance of different neighbors is critical for accurate trajectory prediction in scenes with varying crowd size.
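As a rough illustration of what "inferring the importance of different neighbors" can mean, the sketch below pools neighbor features with softmax attention weights. It is not the authors' model: the function name `attention_pool`, the dot-product scoring rule, and the plain feature vectors are all assumptions made for this example. It does show why attention copes with varying crowd size, since the weights always sum to one regardless of how many neighbors are present.

```python
import math


def attention_pool(ego, neighbors):
    """Pool neighbor features with softmax attention (hypothetical sketch).

    ego       -- feature vector of the pedestrian being predicted
    neighbors -- list of feature vectors, one per nearby pedestrian
    Returns (weights, pooled), where weights sum to 1 when neighbors exist.
    """
    if not neighbors:
        # No neighbors: nothing to attend to, return a zero context vector.
        return [], [0.0] * len(ego)

    # Assumed scoring rule: dot product between ego and neighbor features.
    scores = [sum(e * n for e, n in zip(ego, nb)) for nb in neighbors]

    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]

    # Weighted sum of neighbor features, one value per feature dimension.
    pooled = [
        sum(w * nb[d] for w, nb in zip(weights, neighbors))
        for d in range(len(ego))
    ]
    return weights, pooled


if __name__ == "__main__":
    ego = [1.0, 0.0]
    crowd = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]
    weights, pooled = attention_pool(ego, crowd)
    print(weights, pooled)
```

The same function accepts two neighbors or twenty, which is the property the sentence above relies on.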
no code implementations • 15 Sep 2020 • Yuying Chen, Congcong Liu, Bertram E. Shi, Ming Liu
Fully investigating the social interactions within the crowd is crucial for accurate pedestrian trajectory prediction.
3 code implementations • 10 Feb 2020 • Didan Deng, Zhaokang Chen, Bertram E. Shi
We use the soft labels and the ground truth to train the student model.
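One common way to train a student on both soft labels and the ground truth is a weighted sum of two cross-entropy terms. The sketch below illustrates that general idea only; the mixing weight `alpha`, the `temperature` parameter, and all function names are assumptions for this example and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target and a predicted distribution."""
    return -sum(
        t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs)
    )


def distillation_loss(student_logits, teacher_logits, true_label,
                      alpha=0.5, temperature=2.0):
    """Mix a soft-label loss and a ground-truth loss (hypothetical sketch).

    alpha weights the soft-label term; (1 - alpha) weights the hard term.
    """
    # Soft labels: the teacher's temperature-softened class distribution.
    soft_targets = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_loss = cross_entropy(soft_targets, student_soft)

    # Ground truth: a one-hot distribution on the true class.
    hard_targets = [
        1.0 if i == true_label else 0.0 for i in range(len(student_logits))
    ]
    hard_loss = cross_entropy(hard_targets, softmax(student_logits))

    return alpha * soft_loss + (1.0 - alpha) * hard_loss


if __name__ == "__main__":
    teacher = [2.0, 0.0, -1.0]
    print(distillation_loss([2.0, 0.0, -1.0], teacher, true_label=0))
    print(distillation_loss([-1.0, 0.0, 2.0], teacher, true_label=0))
```

Setting `alpha=0` reduces the loss to ordinary supervised training, while `alpha=1` trains on the teacher's soft labels alone.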
2 code implementations • 25 Jan 2020 • Zhaokang Chen, Bertram E. Shi
Appearance-based gaze estimation from RGB images provides relatively unconstrained gaze tracking.
no code implementations • 23 Sep 2019 • Yuying Chen, Congcong Liu, Ming Liu, Bertram E. Shi
Previous work has shown the power of deep reinforcement learning frameworks to train efficient policies.
no code implementations • 11 May 2019 • Zhaokang Chen, Bertram E. Shi
To improve estimation, we propose a novel gaze decomposition method and a single gaze point calibration method, motivated by our finding that the inter-subject squared bias exceeds the intra-subject variance for a subject-independent estimator.
no code implementations • 17 Apr 2019 • Yuying Chen, Congcong Liu, Lei Tai, Ming Liu, Bertram E. Shi
The basic idea behind behavioral cloning is to have the neural network learn from observing a human expert's behavior.
no code implementations • 18 Mar 2019 • Zhaokang Chen, Bertram E. Shi
Appearance-based gaze estimation has attracted more and more attention because of its wide range of applications.
no code implementations • 25 Dec 2018 • Lin Sun, Kui Jia, Yuejia Shen, Silvio Savarese, Dit Yan Yeung, Bertram E. Shi
To learn from these heterogeneous input sources, existing methods rely on two-stream architectural designs that contain independent, parallel streams of Recurrent Neural Networks (RNNs).
Action Recognition In Videos
Multi-Person Pose Estimation
2 code implementations • 2 May 2018 • Didan Deng, Yuqian Zhou, Jimin Pi, Bertram E. Shi
The integration of information across multiple modalities and across time is a promising way to enhance the emotion recognition performance of affective systems.
no code implementations • ICCV 2017 • Lin Sun, Kui Jia, Kevin Chen, Dit Yan Yeung, Bertram E. Shi, Silvio Savarese
This method effectively enhances the ability to model dynamics across time and addresses the non-stationary issue of long-term motion dynamics without significantly increasing the model complexity.
no code implementations • 21 Jun 2016 • Chong Zhang, Jochen Triesch, Bertram E. Shi
This framework models the joint emergence of both perception and behavior, and accounts for the importance of the development of normal vergence control and binocular vision in achieving normal monocular optokinetic nystagmus (mOKN) behaviors.
no code implementations • 15 Apr 2016 • Thusitha N. Chandrapala, Bertram E. Shi
The framework is inspired by feed-forward cortical models for visual processing.
no code implementations • ICCV 2015 • Lin Sun, Kui Jia, Dit-yan Yeung, Bertram E. Shi
Human actions in video sequences are three-dimensional (3D) spatio-temporal signals characterizing both the visual appearance and motion dynamics of the involved humans and objects.
no code implementations • 14 Feb 2014 • Chong Zhang, Yu Zhao, Jochen Triesch, Bertram E. Shi
We extend the framework of efficient coding, which has been used to model the development of sensory processing in isolation, to model the development of the perception/action cycle.
no code implementations • NeurIPS 2009 • Yicong Meng, Bertram E. Shi
We extend the concept of phase tuning, a ubiquitous mechanism in sensory neurons including motion and disparity detection neurons, to motion contrast detection.
no code implementations • NeurIPS 2007 • Eric K. Tsang, Bertram E. Shi
When optimized for natural images, it yields a feature that can be explained by normalization, a common model of V1 neurons.