Most graph-to-text works are built on the encoder-decoder framework with a cross-attention mechanism.
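To make the mechanism concrete, here is a minimal, generic sketch of scaled dot-product cross-attention (one decoder query attending over encoder states), written in plain Python; the function name and toy dimensions are illustrative, not taken from any specific system.

```python
import math

def cross_attention(query, keys, values):
    # query: list[float] (one decoder state)
    # keys, values: lists of list[float] (encoder states)
    d = len(query)
    # scaled dot-product scores between the query and each encoder key
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # softmax over scores to get attention weights
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # weighted sum of encoder values
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]
```

In a full encoder-decoder model this computation is batched and multi-headed, but the core operation is the same weighted aggregation of encoder states.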
The core objective of modelling recommender systems from implicit feedback is to maximize the positive sample score $s_p$ and minimize the negative sample score $s_n$, which is usually formulated under one of two paradigms: pointwise or pairwise.
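The two paradigms can be sketched as loss functions over the scores $s_p$ and $s_n$. The sketch below is generic (not any specific paper's objective): the pointwise loss here is logistic/binary cross-entropy on each score independently, and the pairwise loss is a BPR-style term on the score difference.

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pointwise_loss(s_p, s_n):
    # Pointwise paradigm: treat each sample independently,
    # pushing s_p toward 1 and s_n toward 0 (logistic scores assumed).
    return -math.log(_sigmoid(s_p)) - math.log(1.0 - _sigmoid(s_n))

def pairwise_loss(s_p, s_n):
    # Pairwise (BPR-style) paradigm: only the margin s_p - s_n matters;
    # the loss is small whenever the positive outscores the negative.
    return -math.log(_sigmoid(s_p - s_n))
```

Note the design difference: the pairwise loss is invariant to adding a constant to both scores, while the pointwise loss anchors each score to an absolute target.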
However, it is hard for a vanilla encoder to capture these.
Secondly, the target texts in the training dataset may contain redundant information or facts that do not exist in the input tables.
Accurate detection of multi-oriented text with large variations of scales, orientations, and aspect ratios is of great significance.