Search Results for author: Bertram E. Shi

Found 24 papers, 6 papers with code

Estimating disparity with confidence from energy neurons

no code implementations NeurIPS 2007 Eric K. Tsang, Bertram E. Shi

When optimized for natural images, it yields a feature that can be explained by normalization, which is a common model in V1 neurons.

Disparity Estimation

Extending Phase Mechanism to Differential Motion Opponency for Motion Pop-out

no code implementations NeurIPS 2009 Yicong Meng, Bertram E. Shi

We extend the concept of phase tuning, a ubiquitous mechanism in sensory neurons including motion and disparity detection neurons, to motion contrast detection.

Intrinsically Motivated Learning of Visual Motion Perception and Smooth Pursuit

no code implementations 14 Feb 2014 Chong Zhang, Yu Zhao, Jochen Triesch, Bertram E. Shi

We extend the framework of efficient coding, which has been used to model the development of sensory processing in isolation, to model the development of the perception/action cycle.

Reinforcement Learning (RL)

Human Action Recognition using Factorized Spatio-Temporal Convolutional Networks

no code implementations ICCV 2015 Lin Sun, Kui Jia, Dit-yan Yeung, Bertram E. Shi

Human actions in video sequences are three-dimensional (3D) spatio-temporal signals characterizing both the visual appearance and motion dynamics of the involved humans and objects.

Action Recognition Image Classification +1

Invariant feature extraction from event based stimuli

no code implementations 15 Apr 2016 Thusitha N. Chandrapala, Bertram E. Shi

The framework is inspired by feed-forward cortical models for visual processing.

Object Recognition

An active efficient coding model of the optokinetic nystagmus

no code implementations 21 Jun 2016 Chong Zhang, Jochen Triesch, Bertram E. Shi

This framework models the joint emergence of both perception and behavior, and accounts for the importance of the development of normal vergence control and binocular vision in achieving normal monocular OKN (mOKN) behaviors.

Lattice Long Short-Term Memory for Human Action Recognition

no code implementations ICCV 2017 Lin Sun, Kui Jia, Kevin Chen, Dit Yan Yeung, Bertram E. Shi, Silvio Savarese

This method effectively enhances the ability to model dynamics across time and addresses the non-stationary issue of long-term motion dynamics without significantly increasing the model complexity.

Action Recognition Optical Flow Estimation +1

Multimodal Utterance-level Affect Analysis using Visual, Audio and Text Features

2 code implementations 2 May 2018 Didan Deng, Yuqian Zhou, Jimin Pi, Bertram E. Shi

The integration of information across multiple modalities and across time is a promising way to enhance the emotion recognition performance of affective systems.

Emotion Recognition

Coupled Recurrent Network (CRN)

no code implementations 25 Dec 2018 Lin Sun, Kui Jia, Yuejia Shen, Silvio Savarese, Dit Yan Yeung, Bertram E. Shi

To learn from these heterogeneous input sources, existing methods rely on two-stream architectural designs that contain independent, parallel streams of Recurrent Neural Networks (RNNs).

Action Recognition In Videos Multi-Person Pose Estimation +2

Appearance-Based Gaze Estimation Using Dilated-Convolutions

no code implementations 18 Mar 2019 Zhaokang Chen, Bertram E. Shi

Appearance-based gaze estimation has attracted more and more attention because of its wide range of applications.

Contact Detection Gaze Estimation

Gaze Training by Modulated Dropout Improves Imitation Learning

no code implementations 17 Apr 2019 Yuying Chen, Congcong Liu, Lei Tai, Ming Liu, Bertram E. Shi

The basic idea behind behavioral cloning is to have the neural network learn from observing a human expert's behavior.

Autonomous Driving Imitation Learning

Offset Calibration for Appearance-Based Gaze Estimation via Gaze Decomposition

no code implementations 11 May 2019 Zhaokang Chen, Bertram E. Shi

To improve estimation, we propose a novel gaze decomposition method and a single gaze point calibration method, motivated by our finding that the inter-subject squared bias exceeds the intra-subject variance for a subject-independent estimator.

Gaze Estimation

AVGCN: Trajectory Prediction using Graph Convolutional Networks Guided by Human Attention

no code implementations 14 Jan 2021 Congcong Liu, Yuying Chen, Ming Liu, Bertram E. Shi

We suggest that introducing an attention mechanism to infer the importance of different neighbors is critical for accurate trajectory prediction in scenes with varying crowd size.

Pedestrian Trajectory Prediction Trajectory Prediction

Self-Calibrating Active Binocular Vision via Active Efficient Coding with Deep Autoencoders

no code implementations 27 Jan 2021 Charles Wilmot, Bertram E. Shi, Jochen Triesch

We present a model of the self-calibration of active binocular vision comprising the simultaneous learning of visual representations, vergence, and pursuit eye movements.

Iterative Distillation for Better Uncertainty Estimates in Multitask Emotion Recognition

no code implementations 21 Jul 2021 Didan Deng, Liang Wu, Bertram E. Shi

Iterative distillation over multiple generations significantly improves performance in both emotion recognition and uncertainty estimation.

Emotion Recognition

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation

2 code implementations LREC 2022 Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Peng Xu, Xu Yan, Zihan Liu, Rita Frieske, Tiezheng Yu, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

ASCEND (A Spontaneous Chinese-English Dataset) is a high-quality Mandarin Chinese-English code-switching corpus built on spontaneous multi-turn conversational dialogue sources collected in Hong Kong.

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset

1 code implementation LREC 2022 Tiezheng Yu, Rita Frieske, Peng Xu, Samuel Cahyawijaya, Cheuk Tung Shadow Yiu, Holy Lovenia, Wenliang Dai, Elham J. Barezi, Qifeng Chen, Xiaojuan Ma, Bertram E. Shi, Pascale Fung

We further conduct experiments with Fairseq S2T Transformer, a state-of-the-art ASR model, on the biggest existing dataset, Common Voice zh-HK, and our proposed MDCC, and the results show the effectiveness of our dataset.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Hallucinations in Neural Automatic Speech Recognition: Identifying Errors and Hallucinatory Models

no code implementations 3 Jan 2024 Rita Frieske, Bertram E. Shi

To address this, we propose a perturbation-based method for assessing the susceptibility of an automatic speech recognition (ASR) model to hallucination at test time, which does not require access to the training dataset.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
