Tree Transformer: Integrating Tree Structures into Self-Attention

IJCNLP 2019 · Yau-Shian Wang, Hung-Yi Lee, Yun-Nung Chen

Pre-training Transformers on large-scale raw text and fine-tuning on the desired task has achieved state-of-the-art results on diverse NLP tasks. However, it is unclear what the learned attention captures…
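The paper's central idea is to constrain self-attention with tree (constituent) structure. As an illustrative sketch only, not the authors' implementation, the code below shows one way such a constraint can be expressed: a boolean mask that limits each token to attend within a hypothesized constituent span before the softmax. The span boundaries and function names here are assumptions made for illustration.

```python
import numpy as np

def constituent_mask(spans, seq_len):
    # spans: list of (start, end) constituent boundaries (end exclusive).
    # Tokens may only attend to other tokens inside their own span.
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for start, end in spans:
        mask[start:end, start:end] = True
    return mask

def masked_attention(q, k, v, mask):
    # Scaled dot-product attention; positions outside the mask are
    # set to -inf so they receive zero weight after the softmax.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

In this toy setting every token must belong to some span, otherwise its softmax row is undefined; the actual paper learns soft constituent boundaries rather than fixing them in advance.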

