Search Results for author: Huijia Wu

Found 5 papers, 1 paper with code

HyperMoE: Paying Attention to Unselected Experts in Mixture of Experts via Dynamic Transfer

1 code implementation • 20 Feb 2024 • Hao Zhao, Zihan Qiu, Huijia Wu, Zili Wang, Zhaofeng He, Jie Fu

The Mixture of Experts (MoE) for language models has been proven effective in augmenting the capacity of models by dynamically routing each input token to a specific subset of experts for processing.
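The routing described above can be sketched minimally: a router scores every expert for a token, only the top-k experts run, and their outputs are mixed by softmax-normalized scores. This is an illustrative numpy sketch of generic top-k MoE routing, not the paper's HyperMoE implementation; all names (`topk_moe`, `router_w`) are hypothetical.

```python
import numpy as np

def topk_moe(token, experts, router_w, k=2):
    """Route one token to its top-k experts and mix their outputs
    by softmax-normalized router scores (illustrative sketch only)."""
    logits = router_w @ token                      # one router score per expert
    topk = np.argsort(logits)[-k:]                 # indices of the k best-scoring experts
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()                       # softmax over the selected experts only
    return sum(w * experts[i](token) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 4, 8
# toy linear "experts"; a real MoE layer would use feed-forward sub-networks
experts = [lambda x, W=rng.standard_normal((d, d)): W @ x for _ in range(n_experts)]
router_w = rng.standard_normal((n_experts, d))
out = topk_moe(rng.standard_normal(d), experts, router_w, k=2)
print(out.shape)  # (4,)
```

The unselected experts contribute nothing here, which is exactly the gap the paper's dynamic-transfer idea targets.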

Multi-Task Learning

Shortcut Sequence Tagging

no code implementations • 3 Jan 2017 • Huijia Wu, Jiajun Zhang, Cheng-qing Zong

To simplify the stacked architecture, we propose a framework called the shortcut block, which is a marriage of the gating mechanism and shortcuts, while discarding the self-connected part in the LSTM cell.
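A gated shortcut of this kind can be sketched as a learned gate interpolating between a transformed input and the raw shortcut path, with no self-connected recurrent state. This is a minimal numpy sketch of the general gating-plus-shortcut pattern (similar to a highway connection), not the paper's exact shortcut block; the names and parameter shapes are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def shortcut_block(h_below, W_h, W_g, b_g):
    """Gated shortcut between stacked layers (illustrative sketch):
    a gate g decides how much of the input is transformed, while the
    rest flows through the shortcut untouched."""
    transformed = np.tanh(W_h @ h_below)
    g = sigmoid(W_g @ h_below + b_g)             # learned gate in (0, 1)
    return g * transformed + (1.0 - g) * h_below  # gated mix of transform and shortcut

rng = np.random.default_rng(1)
d = 6
h = rng.standard_normal(d)
out = shortcut_block(h, rng.standard_normal((d, d)),
                     rng.standard_normal((d, d)), np.zeros(d))
print(out.shape)  # (6,)
```

When the gate saturates at 0 the layer becomes an identity shortcut, which is what makes very deep stacks trainable.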

POS • POS Tagging

An Empirical Exploration of Skip Connections for Sequential Tagging

no code implementations • COLING 2016 • Huijia Wu, Jiajun Zhang, Cheng-qing Zong

In this paper, we empirically explore the effects of various kinds of skip connections in stacked bidirectional LSTMs for sequential tagging.

CCG Supertagging • POS • +1

A Dynamic Window Neural Network for CCG Supertagging

no code implementations • 10 Oct 2016 • Huijia Wu, Jiajun Zhang, Cheng-qing Zong

These observations motivate us to build a supertagger with a dynamic window approach, which can be treated as an attention mechanism over local contexts.
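Attention over a local window can be sketched as scoring each token inside a bounded neighborhood of the target position and taking the softmax-weighted sum, so the learned weights effectively decide how much of the window matters. This is a generic illustrative numpy sketch, not the paper's supertagger; `dynamic_window_context` and its parameters are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_window_context(embs, t, max_window, w_attn):
    """Attend over tokens within max_window of position t; the attention
    weights determine the effective width of the context actually used."""
    lo, hi = max(0, t - max_window), min(len(embs), t + max_window + 1)
    window = embs[lo:hi]                  # local context around token t
    scores = window @ w_attn              # one relevance score per neighbor
    alphas = softmax(scores)
    return alphas @ window                # weighted sum of local embeddings

rng = np.random.default_rng(2)
sent = rng.standard_normal((10, 5))       # 10 tokens, 5-dim embeddings
ctx = dynamic_window_context(sent, t=4, max_window=3,
                             w_attn=rng.standard_normal(5))
print(ctx.shape)  # (5,)
```

A fixed window would hard-code `max_window`; letting attention down-weight distant neighbors is what makes the window effectively dynamic.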

CCG Supertagging • Sentence • +1
