Search Results for author: Feng-Ju Chang

Found 17 papers, 3 papers with code

Dialog act guided contextual adapter for personalized speech recognition

no code implementations • 31 Mar 2023 • Feng-Ju Chang, Thejaswi Muniyappa, Kanthashree Mysore Sathyendra, Kai Wei, Grant P. Strimel, Ross McGowan

Specifically, it leverages dialog acts to select the most relevant user catalogs and creates queries based on both the audio and the semantic relationship between the carrier phrase and user catalogs to better guide the contextual biasing.

Automatic Speech Recognition, Speech Recognition

ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition

no code implementations • 29 Sep 2022 • Martin Radfar, Rohit Barnwal, Rupak Vignesh Swaminathan, Feng-Ju Chang, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

Very recently, the Conformer architecture was introduced as an alternative to LSTM layers: the encoder of RNN-T is replaced with a modified Transformer encoder composed of convolutional layers at the frontend and between attention layers.

Speech Recognition

Compute Cost Amortized Transformer for Streaming ASR

no code implementations • 5 Jul 2022 • Yi Xie, Jonathan Macoskey, Martin Radfar, Feng-Ju Chang, Brian King, Ariya Rastrow, Athanasios Mouchtaris, Grant P. Strimel

We present a streaming, Transformer-based end-to-end automatic speech recognition (ASR) architecture which achieves efficient neural inference through compute cost amortization.

Automatic Speech Recognition (ASR)

Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding

no code implementations • 1 Apr 2022 • Xuandi Fu, Feng-Ju Chang, Martin Radfar, Kai Wei, Jing Liu, Grant P. Strimel, Kanthashree Mysore Sathyendra

In addition, the NLU model in the two-stage system is not streamable, as it must wait for the audio segments to complete processing, which ultimately impacts the latency of the SLU system.

Automatic Speech Recognition (ASR)

Multi-Channel Transformer Transducer for Speech Recognition

no code implementations • 30 Aug 2021 • Feng-Ju Chang, Martin Radfar, Athanasios Mouchtaris, Maurizio Omologo

In this paper, we present a novel speech recognition model, Multi-Channel Transformer Transducer (MCTT), which features end-to-end multi-channel training, low computation cost, and low latency so that it is suitable for streaming decoding in on-device speech recognition.

Speech Recognition

End-to-End Multi-Channel Transformer for Speech Recognition

no code implementations • 8 Feb 2021 • Feng-Ju Chang, Martin Radfar, Athanasios Mouchtaris, Brian King, Siegfried Kunzmann

Transformers are powerful neural architectures that allow integrating different modalities using attention mechanisms.

Decoder, Speech Recognition

Pose-variant 3D Facial Attribute Generation

no code implementations • 24 Jul 2019 • Feng-Ju Chang, Xiang Yu, Ram Nevatia, Manmohan Chandraker

We address the challenging problem of generating facial attributes using a single image in an unconstrained pose.

3D Reconstruction, Attribute

ExpNet: Landmark-Free, Deep, 3D Facial Expressions

1 code implementation • 2 Feb 2018 • Feng-Ju Chang, Anh Tuan Tran, Tal Hassner, Iacopo Masi, Ram Nevatia, Gerard Medioni

Our ExpNet CNN is applied directly to the intensities of a face image and regresses a 29D vector of 3D expression coefficients.

Ranked #1 on 3D Facial Expression Recognition on 2017_test set (using extra training data)

3D Face Reconstruction, 3D Facial Expression Recognition

FacePoseNet: Making a Case for Landmark-Free Face Alignment

6 code implementations • 24 Aug 2017 • Feng-Ju Chang, Anh Tuan Tran, Tal Hassner, Iacopo Masi, Ram Nevatia, Gerard Medioni

Instead, we compare our FPN with existing methods by evaluating how they affect face recognition accuracy on the IJB-A and IJB-B benchmarks: using the same recognition pipeline, but varying the face alignment method.

Ranked #1 on Facial Landmark Detection on 300W (Mean Error Rate metric)

3D Face Alignment, Face Alignment

Multiple Structured-Instance Learning for Semantic Segmentation with Uncertain Training Data

no code implementations • CVPR 2014 • Feng-Ju Chang, Yen-Yu Lin, Kuang-Jui Hsu

By treating a bounding box as a bag with its segment hypotheses as structured instances, MSIL-CRF selects the most likely segment hypotheses by leveraging the knowledge derived from both the labeled and uncertain training data.

Multiple Instance Learning, Segmentation
