Highway Layer

Introduced by Srivastava et al. in Highway Networks

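A Highway Layer provides an information highway that lets part of a layer's input pass unchanged to later layers, which eases information flow through deep networks. It is characterised by gating units that regulate how much of the input is transformed and how much is carried through.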

A plain feedforward neural network typically consists of $L$ layers where the $l$th layer ($l \in \{1, 2, \dots, L\}$) applies a nonlinear transform $H$ (parameterized by $\mathbf{W_{H,l}}$) on its input $\mathbf{x_{l}}$ to produce its output $\mathbf{y_{l}}$. Thus, $\mathbf{x_{1}}$ is the input to the network and $\mathbf{y_{L}}$ is the network’s output. Omitting the layer index and biases for clarity,

$$ \mathbf{y} = H\left(\mathbf{x},\mathbf{W_{H}}\right) $$

$H$ is usually an affine transform followed by a non-linear activation function, but in general it may take other forms.
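As a minimal sketch of such a plain layer, the snippet below implements $H$ as an affine transform followed by a nonlinearity. The choice of `tanh`, the explicit bias `b_H`, the helper names, and the hand-picked weights are assumptions made for illustration; plain Python lists are used only to keep the example dependency-free.

```python
import math

def affine(W, b, x):
    """Return W x + b, where W is a list of rows, b a bias vector, x the input."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def plain_layer(x, W_H, b_H):
    """y = H(x, W_H): affine transform followed by tanh (an assumed activation)."""
    return [math.tanh(z) for z in affine(W_H, b_H, x)]

# Hypothetical 2-unit layer with hand-picked weights.
W_H = [[1.0, 0.0],
       [0.0, -1.0]]
b_H = [0.0, 0.0]
y = plain_layer([0.5, 0.5], W_H, b_H)  # second unit negates its input before tanh
```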

For a highway network, we additionally define two nonlinear transforms $T\left(\mathbf{x},\mathbf{W_{T}}\right)$ and $C\left(\mathbf{x},\mathbf{W_{C}}\right)$ such that:

$$ \mathbf{y} = H\left(\mathbf{x},\mathbf{W_{H}}\right) \cdot T\left(\mathbf{x},\mathbf{W_{T}}\right) + \mathbf{x} \cdot C\left(\mathbf{x},\mathbf{W_{C}}\right)$$
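The combination step can be sketched as follows, reading the products as elementwise (as in the original paper), which requires $\mathbf{x}$, $H$, $T$ and $C$ to share the same dimensionality. The function name `highway_combine` and the example values are hypothetical; the gate and transform outputs are passed in precomputed.

```python
def highway_combine(h, t, c, x):
    """Elementwise y = h * t + x * c for equal-length vectors."""
    if not (len(h) == len(t) == len(c) == len(x)):
        raise ValueError("h, t, c and x must have the same dimensionality")
    return [h_i * t_i + x_i * c_i for h_i, t_i, c_i, x_i in zip(h, t, c, x)]

x = [1.0, 2.0]
h = [0.5, -0.5]  # stand-in for H(x, W_H)
t = [1.0, 0.0]   # transform gate open for unit 0, closed for unit 1
c = [0.0, 1.0]   # carry gate closed for unit 0, open for unit 1
y = highway_combine(h, t, c, x)  # unit 0 takes the transformed value, unit 1 carries x
```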

We refer to $T$ as the transform gate and $C$ as the carry gate, since they express how much of the output is produced by transforming the input and by carrying it, respectively. In the original paper, the authors set $C = 1 - T$, giving:

$$ \mathbf{y} = H\left(\mathbf{x},\mathbf{W_{H}}\right) \cdot T\left(\mathbf{x},\mathbf{W_{T}}\right) + \mathbf{x} \cdot \left(1-T\left(\mathbf{x},\mathbf{W_{T}}\right)\right)$$
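Substituting the two extreme gate values into this coupled form gives a short worked check of what the gate interpolates between. Here $\mathbf{0}$ and $\mathbf{1}$ denote the all-zeros and all-ones vectors, a notation introduced only for this check:

$$ \mathbf{y} = \begin{cases} \mathbf{x}, & \text{if } T\left(\mathbf{x},\mathbf{W_{T}}\right) = \mathbf{0} \\ H\left(\mathbf{x},\mathbf{W_{H}}\right), & \text{if } T\left(\mathbf{x},\mathbf{W_{T}}\right) = \mathbf{1} \end{cases} $$

For gate values strictly between these extremes, each output unit is a convex combination of the transformed input and the carried input.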

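The authors define the transform gate as:

$$ T\left(\mathbf{x}\right) = \sigma\left(\mathbf{W_{T}}^{T}\mathbf{x} + \mathbf{b_{T}}\right) $$

where $\sigma$ is the sigmoid function, $\mathbf{W_{T}}$ is the gate's weight matrix, and $\mathbf{b_{T}}$ is its bias vector.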
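Putting the pieces together, a single highway layer with the coupled carry gate $C = 1 - T$ might be sketched as below. The `tanh` activation for $H$, the helper names, and the example weights are assumptions rather than the authors' implementation; each weight argument stores the rows of the transposed matrix, so `affine(W_T, b_T, x)` plays the role of $\mathbf{W_{T}}^{T}\mathbf{x} + \mathbf{b_{T}}$.

```python
import math

def sigmoid(z):
    """Logistic sigmoid for a scalar z."""
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, x):
    """Return W x + b, where W is a list of rows."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def highway_layer(x, W_H, b_H, W_T, b_T):
    """y = H(x) * T(x) + x * (1 - T(x)), with elementwise products."""
    h = [math.tanh(z) for z in affine(W_H, b_H, x)]  # transform H (tanh assumed)
    t = [sigmoid(z) for z in affine(W_T, b_T, x)]    # transform gate T
    return [h_i * t_i + x_i * (1.0 - t_i) for h_i, t_i, x_i in zip(h, t, x)]

# Hypothetical parameters: identity transform weights, zero gate weights.
x = [0.3, -0.7]
W_H, b_H = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
W_T = [[0.0, 0.0], [0.0, 0.0]]

y_carry = highway_layer(x, W_H, b_H, W_T, [-20.0, -20.0])   # gate nearly closed
y_transform = highway_layer(x, W_H, b_H, W_T, [20.0, 20.0])  # gate nearly open
```

With a strongly negative gate bias the layer approximately copies its input, and with a strongly positive one it behaves approximately like the plain layer. The original paper suggests initializing $\mathbf{b_{T}}$ to a negative value so that the network initially favors carrying.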


Source: Highway Networks

Latest Papers

| Paper | Authors | Date |
| --- | --- | --- |
| Learning Speaker Embedding from Text-to-Speech | Jaejin Cho, Piotr Zelasko, Jesus Villalba, Shinji Watanabe, Najim Dehak | 2020-10-21 |
| Grapheme or phoneme? An Analysis of Tacotron's Embedded Representations | Antoine Perquin, Erica Cooper, Junichi Yamagishi | 2020-10-21 |
| Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling | Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu | 2020-10-08 |
| Controllable neural text-to-speech synthesis using intuitive prosodic features | Tuomo Raitio, Ramya Rasipuram, Dan Castellani | 2020-09-14 |
| Corrective feedback, emphatic speech synthesis, visual-speech exaggeration, pronunciation learning | Yaohua Bu, Weijun Li, Tianyi Ma, Shengqi Chen, Jia Jia, Kun Li, Xiaobo Lu | 2020-09-12 |
| Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion | Dipjyoti Paul, Muhammed PV Shifas, Yannis Pantazis, Yannis Stylianou | 2020-08-13 |
| Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS | Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li | 2020-08-11 |
| SpeedySpeech: Efficient Neural Speech Synthesis | Jan Vainer, Ondřej Dušek | 2020-08-09 |
| One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech | Tomáš Nekvinda, Ondřej Dušek | 2020-08-03 |
| Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis | Yusuke Yasuda, Xin Wang, Junichi Yamagishi | 2020-05-20 |
| End-To-End Speech Synthesis Applied to Brazilian Portuguese | Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, João Paulo Teixeira, Moacir Antonelli Ponti, Sandra Maria Aluisio | 2020-05-11 |
| ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders | Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma | 2020-04-23 |
| Recurrent Highway Networks with Grouped Auxiliary Memory | Wei Luo, Feng Yu | 2019-12-13 |
| A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis | Junjie Pan, Xiang Yin, Zhiling Zhang, Shichao Liu, Yang Zhang, Zejun Ma, Yuxuan Wang | 2019-11-11 |
| Speech Recognition with Augmented Synthesized Speech | Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro Moreno, Yonghui Wu, Zelin Wu | 2019-09-25 |
| Graph-Partitioning-Based Diffusion Convolutional Recurrent Neural Network for Large-Scale Traffic Forecasting | Tanwi Mallick, Prasanna Balaprakash, Eric Rask, Jane Macfarlane | 2019-09-24 |
| Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning | Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran | 2019-07-09 |
| A New GAN-based End-to-End TTS Training Algorithm | Haohan Guo, Frank K. Soong, Lei He, Lei Xie | 2019-04-09 |
| Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data | Roee Levy Leshem, Raja Giryes | 2019-04-06 |
| Multi-reference Tacotron by Intercross Training for Style Disentangling,Transfer and Control in Speech Synthesis | Yanyao Bian, Changbin Chen, Yongguo Kang, Zhenglin Pan | 2019-04-04 |
| Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet | Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi | 2019-03-29 |
| Analysing Dropout and Compounding Errors in Neural Language Models | James O' Neill, Danushka Bollegala | 2018-11-02 |
| Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language | Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi | 2018-10-29 |
| Learning User Preferences and Understanding Calendar Contexts for Event Scheduling | Donghyeon Kim, Jinhyuk Lee, Donghee Choi, Jaehoon Choi, Jaewoo Kang | 2018-09-05 |
| Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis | Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, RJ Skerry-Ryan | 2018-08-30 |
| Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis | Daisy Stanton, Yuxuan Wang, RJ Skerry-Ryan | 2018-08-04 |
| A Multi-Attention based Neural Network with External Knowledge for Story Ending Predicting Task | Qian Li, Ziwei Li, Jin-Mao Wei, Yanhui Gu, Adam Jatowt, Zhenglu Yang | 2018-08-01 |
| Learning to Generate Word Representations using Subword Information | Yeachan Kim, Kang-Min Kim, Ji-Min Lee, SangKeun Lee | 2018-08-01 |
| Voice Imitating Text-to-Speech Neural Networks | Younggun Lee, Taesu Kim, Soo-Young Lee | 2018-06-04 |
| Hierarchical Attention-Based Recurrent Highway Networks for Time Series Prediction | Yunzhe Tao, Lin Ma, Weizhong Zhang, Jian Liu, Wei Liu, Qiang Du | 2018-06-02 |
| Semi-supervised User Geolocation via Graph Convolutional Networks | Afshin Rahimi, Trevor Cohn, Timothy Baldwin | 2018-04-22 |
| Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron | RJ Skerry-Ryan, Eric Battenberg, Ying Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, Rob Clark, Rif A. Saurous | 2018-03-24 |
| Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous | 2018-03-23 |
| LCANet: End-to-End Lipreading with Cascaded Attention-CTC | Kai Xu, Dawei Li, Nick Cassimatis, Xiaolong Wang | 2018-03-13 |
| Emotional End-to-End Neural Speech Synthesizer | Younggun Lee, Azam Rabiee, Soo-Young Lee | 2017-11-15 |
| Uncovering Latent Style Factors for Expressive Speech Synthesis | Yuxuan Wang, RJ Skerry-Ryan, Ying Xiao, Daisy Stanton, Joel Shor, Eric Battenberg, Rob Clark, Rif A. Saurous | 2017-11-01 |
| Deep Voice 2: Multi-Speaker Neural Text-to-Speech | Sercan Arik, Gregory Diamos, Andrew Gibiansky, John Miller, Kainan Peng, Wei Ping, Jonathan Raiman, Yanqi Zhou | 2017-05-24 |
| Tacotron: Towards End-to-End Speech Synthesis | Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous | 2017-03-29 |
| Representations of language in a model of visually grounded speech signal | Grzegorz Chrupała, Lieke Gelderloos, Afra Alishahi | 2017-02-07 |
| Learning text representation using recurrent convolutional neural network with highway layers | Ying Wen, Weinan Zhang, Rui Luo, Jun Wang | 2016-06-22 |
| Small-footprint Deep Neural Networks with Highway Connections for Speech Recognition | Liang Lu, Steve Renals | 2015-12-14 |
| Character-Aware Neural Language Models | Yoon Kim, Yacine Jernite, David Sontag, Alexander M. Rush | 2015-08-26 |
| Highway Networks | Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber | 2015-05-03 |
