Baidu Dependency Parser

Introduced by Zhang et al. in A Practical Chinese Dependency Parser Based on A Large-scale Dataset

DDParser, or Baidu Dependency Parser, is a Chinese dependency parser trained on a large-scale manually labeled dataset called Baidu Chinese Treebank (DuCTB).

For the input, the input vector $e_{i}$ of the $i$-th word is the concatenation of its word embedding and a character-level representation:

$$ e_{i} = e_{i}^{\mathrm{word}} \oplus \mathrm{CharLSTM}\left(w_{i}\right) $$

where $\mathrm{CharLSTM}\left(w_{i}\right)$ is the output vector obtained by feeding the character sequence of $w_{i}$ into a BiLSTM layer. Experimental results on the DuCTB dataset show that replacing POS tag embeddings with $\mathrm{CharLSTM}\left(w_{i}\right)$ improves performance.
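To make the input construction concrete, here is a minimal PyTorch sketch of a CharLSTM module (DDParser itself is built on PaddlePaddle, so this is an illustrative reimplementation; module names and dimensions are assumptions, not the paper's hyperparameters):

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """Character-level word representation: embed each character of a
    word, run a BiLSTM over the character sequence, and concatenate the
    final hidden states of the two directions."""

    def __init__(self, n_chars, char_dim=50, out_dim=100):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim)
        # out_dim is split across the forward and backward directions
        self.lstm = nn.LSTM(char_dim, out_dim // 2,
                            batch_first=True, bidirectional=True)

    def forward(self, char_ids):          # char_ids: (n_words, max_chars)
        x = self.embed(char_ids)          # (n_words, max_chars, char_dim)
        _, (h, _) = self.lstm(x)          # h: (2, n_words, out_dim // 2)
        return torch.cat([h[0], h[1]], dim=-1)  # (n_words, out_dim)

char_lstm = CharLSTM(n_chars=5000, char_dim=50, out_dim=100)
char_ids = torch.randint(0, 5000, (10, 6))   # 10 words, up to 6 chars each
word_emb = torch.randn(10, 100)              # word embeddings (assumed dim)
# e_i = word embedding ⊕ CharLSTM(w_i), shape (10, 200)
e = torch.cat([word_emb, char_lstm(char_ids)], dim=-1)
```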

For the encoder, three BiLSTM layers are applied over the input vectors for context encoding. Let $r_{i}$ denote the output vector of the top-layer BiLSTM for $w_{i}$.
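Continuing the sketch above, the encoder amounts to a three-layer bidirectional LSTM (sizes again illustrative assumptions):

```python
import torch
import torch.nn as nn

# Three stacked BiLSTM layers over the input vectors e_i.
encoder = nn.LSTM(input_size=200, hidden_size=400, num_layers=3,
                  bidirectional=True, batch_first=True)

e = torch.randn(1, 10, 200)   # a batch of 10 input vectors e_i
r, _ = encoder(e)             # r: (1, 10, 800); r[0, i] is r_i for w_i
```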

The biaffine parser of Dozat and Manning is used. Dimension-reducing MLPs are applied to each recurrent output vector $r_{i}$ before the biaffine transformation; applying smaller MLPs to the recurrent states before the biaffine classifier strips away information that is not relevant to the current decision. Biaffine attention is then used in both the dependency arc classifier and the relation classifier. The computations are as follows:

$$ h_{i}^{d\text{-}arc} = \mathrm{MLP}^{d\text{-}arc}\left(r_{i}\right) $$

$$ h_{i}^{h\text{-}arc} = \mathrm{MLP}^{h\text{-}arc}\left(r_{i}\right) $$

$$ h_{i}^{d\text{-}rel} = \mathrm{MLP}^{d\text{-}rel}\left(r_{i}\right) $$

$$ h_{i}^{h\text{-}rel} = \mathrm{MLP}^{h\text{-}rel}\left(r_{i}\right) $$

$$ S^{arc} = \left(H^{d\text{-}arc} \oplus I\right) U^{arc} H^{h\text{-}arc} $$

$$ S^{rel} = \left(H^{d\text{-}rel} \oplus I\right) U^{rel} \left(\left(H^{h\text{-}rel}\right)^{T} \oplus I\right)^{T} $$
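A minimal PyTorch sketch of the biaffine scorer follows, where the `bias_x` / `bias_y` flags implement the $\oplus I$ concatenation in the equations (a constant-one column that turns the bilinear form into a full biaffine one). MLP sizes and names are assumptions:

```python
import torch
import torch.nn as nn

class Biaffine(nn.Module):
    """Biaffine attention in the style of Dozat & Manning."""

    def __init__(self, in_dim, out_dim=1, bias_x=True, bias_y=False):
        super().__init__()
        self.bias_x, self.bias_y = bias_x, bias_y
        self.U = nn.Parameter(torch.zeros(out_dim,
                                          in_dim + bias_x,
                                          in_dim + bias_y))

    def forward(self, x, y):  # x, y: (batch, seq_len, in_dim)
        if self.bias_x:       # append the constant-one column (⊕ I)
            x = torch.cat([x, torch.ones_like(x[..., :1])], dim=-1)
        if self.bias_y:
            y = torch.cat([y, torch.ones_like(y[..., :1])], dim=-1)
        # s[b, o, i, j]: score of the arc with dependent i and head j
        return torch.einsum('bxi,oij,byj->boxy', x, self.U, y)

# Dimension-reducing MLPs, then arc scoring:
mlp_d_arc = nn.Sequential(nn.Linear(800, 500), nn.LeakyReLU())
mlp_h_arc = nn.Sequential(nn.Linear(800, 500), nn.LeakyReLU())
arc_scorer = Biaffine(500, out_dim=1, bias_x=True, bias_y=False)

r = torch.randn(1, 10, 800)                               # encoder output
s_arc = arc_scorer(mlp_d_arc(r), mlp_h_arc(r)).squeeze(1) # (1, 10, 10)
```

The relation classifier uses the same module with `out_dim` set to the number of relation labels and both bias flags enabled, mirroring the two $\oplus I$ terms in the $S^{rel}$ equation.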

For the decoder, the first-order Eisner algorithm is used to ensure that the output is a projective tree. Given the dependency tree built by the biaffine parser, a word sequence is obtained through an in-order traversal of the tree; the tree is projective only if this sequence matches the original word order.
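This traversal-based projectivity test is easy to state in code; below is a minimal sketch (the function name and the head-array convention are assumptions):

```python
def is_projective(heads):
    """Check projectivity via in-order traversal.

    heads[i] is the head of word i (1-based); heads[0] is an unused
    placeholder, and a head of 0 marks the root. The tree is projective
    iff visiting left dependents, then the node, then right dependents
    (by position, recursively) reproduces the order 1..n.
    """
    n = len(heads) - 1
    children = [[] for _ in range(n + 1)]
    for dep in range(1, n + 1):
        children[heads[dep]].append(dep)

    order = []

    def visit(node):
        for c in (d for d in children[node] if d < node):
            visit(c)
        if node != 0:
            order.append(node)
        for c in (d for d in children[node] if d > node):
            visit(c)

    visit(0)
    return order == list(range(1, n + 1))

print(is_projective([0, 2, 0, 4, 2]))  # True: traversal yields 1,2,3,4
print(is_projective([0, 3, 0, 2, 2]))  # False: crossing arcs break order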

Source: A Practical Chinese Dependency Parser Based on A Large-scale Dataset
