Search Results for author: Bertram E. Shi

Found 24 papers, 6 papers with code

Estimating disparity with confidence from energy neurons

no code implementations • NeurIPS 2007 • Eric K. Tsang, Bertram E. Shi

When optimized for natural images, it yields a feature that can be explained by normalization, a common model of V1 neurons.

Disparity Estimation

Extending Phase Mechanism to Differential Motion Opponency for Motion Pop-out

no code implementations • NeurIPS 2009 • Yicong Meng, Bertram E. Shi

We extend the concept of phase tuning, a ubiquitous mechanism in sensory neurons including motion and disparity detection neurons, to motion contrast detection.

Intrinsically Motivated Learning of Visual Motion Perception and Smooth Pursuit

no code implementations • 14 Feb 2014 • Chong Zhang, Yu Zhao, Jochen Triesch, Bertram E. Shi

We extend the framework of efficient coding, which has been used to model the development of sensory processing in isolation, to model the development of the perception/action cycle.

Reinforcement Learning (RL)

Human Action Recognition using Factorized Spatio-Temporal Convolutional Networks

no code implementations • ICCV 2015 • Lin Sun, Kui Jia, Dit-Yan Yeung, Bertram E. Shi

Human actions in video sequences are three-dimensional (3D) spatio-temporal signals characterizing both the visual appearance and motion dynamics of the involved humans and objects.

Action Recognition • Image Classification • +1

Invariant feature extraction from event based stimuli

no code implementations • 15 Apr 2016 • Thusitha N. Chandrapala, Bertram E. Shi

The framework is inspired by feed-forward cortical models for visual processing.

Object Recognition

An active efficient coding model of the optokinetic nystagmus

no code implementations • 21 Jun 2016 • Chong Zhang, Jochen Triesch, Bertram E. Shi

This framework models the joint emergence of both perception and behavior, and accounts for the importance of the development of normal vergence control and binocular vision in achieving normal monocular OKN (mOKN) behaviors.

Lattice Long Short-Term Memory for Human Action Recognition

no code implementations • ICCV 2017 • Lin Sun, Kui Jia, Kevin Chen, Dit Yan Yeung, Bertram E. Shi, Silvio Savarese

This method effectively enhances the ability to model dynamics across time and addresses the non-stationary issue of long-term motion dynamics without significantly increasing the model complexity.

Action Recognition • Optical Flow Estimation • +1

Multimodal Utterance-level Affect Analysis using Visual, Audio and Text Features

2 code implementations • 2 May 2018 • Didan Deng, Yuqian Zhou, Jimin Pi, Bertram E. Shi

The integration of information across multiple modalities and across time is a promising way to enhance the emotion recognition performance of affective systems.

Emotion Recognition

Coupled Recurrent Network (CRN)

no code implementations • 25 Dec 2018 • Lin Sun, Kui Jia, Yuejia Shen, Silvio Savarese, Dit Yan Yeung, Bertram E. Shi

To learn from these heterogeneous input sources, existing methods rely on two-stream architectural designs that contain independent, parallel streams of Recurrent Neural Networks (RNNs).

Action Recognition In Videos • Multi-Person Pose Estimation • +2

Appearance-Based Gaze Estimation Using Dilated-Convolutions

no code implementations • 18 Mar 2019 • Zhaokang Chen, Bertram E. Shi

Appearance-based gaze estimation has attracted more and more attention because of its wide range of applications.

Contact Detection • Gaze Estimation

Gaze Training by Modulated Dropout Improves Imitation Learning

no code implementations • 17 Apr 2019 • Yuying Chen, Congcong Liu, Lei Tai, Ming Liu, Bertram E. Shi

The basic idea behind behavioral cloning is to have the neural network learn from observing a human expert's behavior.

Autonomous Driving • Imitation Learning

Offset Calibration for Appearance-Based Gaze Estimation via Gaze Decomposition

no code implementations • 11 May 2019 • Zhaokang Chen, Bertram E. Shi

To improve estimation, we propose a novel gaze decomposition method and a single gaze point calibration method, motivated by our finding that the inter-subject squared bias exceeds the intra-subject variance for a subject-independent estimator.

Gaze Estimation

Robot Navigation in Crowds by Graph Convolutional Networks with Attention Learned from Human Gaze

no code implementations • 23 Sep 2019 • Yuying Chen, Congcong Liu, Ming Liu, Bertram E. Shi

Previous work has shown the power of deep reinforcement learning frameworks to train efficient policies.

reinforcement-learning • Reinforcement Learning (RL) • +1

Towards High Performance Low Complexity Calibration in Appearance Based Gaze Estimation

2 code implementations • 25 Jan 2020 • Zhaokang Chen, Bertram E. Shi

Appearance-based gaze estimation from RGB images provides relatively unconstrained gaze tracking.

Gaze Estimation • Vocal Bursts Intensity Prediction

Multitask Emotion Recognition with Incomplete Labels

3 code implementations • 10 Feb 2020 • Didan Deng, Zhaokang Chen, Bertram E. Shi

We use the soft labels and the ground truth to train the student model.

Action Unit Detection • Arousal Estimation • +2

HGCN-GJS: Hierarchical Graph Convolutional Network with Groupwise Joint Sampling for Trajectory Prediction

no code implementations • 15 Sep 2020 • Yuying Chen, Congcong Liu, Xiaodong Mei, Bertram E. Shi, Ming Liu

Fully investigating the social interactions within the crowd is crucial for accurate pedestrian trajectory prediction.

Autonomous Driving • Pedestrian Trajectory Prediction • +2

AVGCN: Trajectory Prediction using Graph Convolutional Networks Guided by Human Attention

no code implementations • 14 Jan 2021 • Congcong Liu, Yuying Chen, Ming Liu, Bertram E. Shi

We suggest that introducing an attention mechanism to infer the importance of different neighbors is critical for accurate trajectory prediction in scenes with varying crowd size.

Pedestrian Trajectory Prediction • Trajectory Prediction

Self-Calibrating Active Binocular Vision via Active Efficient Coding with Deep Autoencoders

no code implementations • 27 Jan 2021 • Charles Wilmot, Bertram E. Shi, Jochen Triesch

We present a model of the self-calibration of active binocular vision comprising the simultaneous learning of visual representations, vergence, and pursuit eye movements.

Learning Hierarchical Integration of Foveal and Peripheral Vision for Vergence Control by Active Efficient Coding

no code implementations • 29 Jan 2021 • Zhetuo Zhao, Jochen Triesch, Bertram E. Shi

The high resolution fovea can drive precise short range movements.

Iterative Distillation for Better Uncertainty Estimates in Multitask Emotion Recognition

no code implementations • 21 Jul 2021 • Didan Deng, Liang Wu, Bertram E. Shi

Iterative distillation over multiple generations significantly improves performance in both emotion recognition and uncertainty estimation.

Emotion Recognition

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation

2 code implementations • LREC 2022 • Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Peng Xu, Xu Yan, Zihan Liu, Rita Frieske, Tiezheng Yu, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

ASCEND (A Spontaneous Chinese-English Dataset) is a high-quality Mandarin Chinese-English code-switching corpus built on spontaneous multi-turn conversational dialogue sources collected in Hong Kong.

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset

1 code implementation • LREC 2022 • Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

We further conduct experiments with Fairseq S2T Transformer, a state-of-the-art ASR model, on the biggest existing dataset, Common Voice zh-HK, and our proposed MDCC, and the results show the effectiveness of our dataset.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +3

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition

1 code implementation • 11 Jan 2022 • Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J. Barezi, Peng Xu, Cheuk Tung Shadow Yiu, Rita Frieske, Holy Lovenia, Genta Indra Winata, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

With the rise of deep learning and intelligent vehicles, the smart assistant has become an essential in-car component to facilitate driving and provide extra functionalities.

Audio-Visual Speech Recognition • speech-recognition • +1

Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models

no code implementations • 3 Jan 2024 • Rita Frieske, Bertram E. Shi

To address this, we propose a perturbation-based method for assessing the susceptibility of an automatic speech recognition (ASR) model to hallucination at test time, which does not require access to the training dataset.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2
