1 code implementation • 5 May 2022 • Shaojie Jiang, Ruqing Zhang, Svitlana Vakulenko, Maarten de Rijke
The cross-entropy objective has proved to be an all-purpose training objective for autoregressive language models (LMs).
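The token-level cross-entropy objective mentioned here can be sketched as the mean negative log-probability the model assigns to each ground-truth next token. This is a minimal illustrative implementation; the vocabulary size and logits below are made up, not taken from the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, targets):
    """logits: (seq_len, vocab) scores; targets: (seq_len,) int token ids.
    Returns mean negative log-probability of the target tokens."""
    probs = softmax(logits)
    return float(-np.log(probs[np.arange(len(targets)), targets]).mean())

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))   # 4 positions, vocabulary of 10 tokens
targets = np.array([3, 1, 7, 2])
loss = cross_entropy(logits, targets)
```

With all-zero logits the predictive distribution is uniform, so the loss reduces to log(vocab_size), a common sanity check when implementing this objective.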
1 code implementation • 26 Mar 2020 • Shaojie Jiang, Thomas Wolf, Christof Monz, Maarten de Rijke
We hypothesize that the deeper reason is that the training corpora contain hard tokens that are more difficult for a generative model to learn than others; once training has finished, these hard tokens remain under-learned, so repetitive generations are more likely to happen.
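One way to operationalize the idea of under-learned tokens is to track per-token training loss and flag tokens whose average loss stays high. This is a hypothetical sketch, not the paper's method; the token names, loss values, and the 1.0 threshold are all invented for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical per-token losses recorded over several training epochs.
token_losses = {
    "the": [0.20, 0.12, 0.08],          # easy token: loss drops quickly
    "serendipity": [3.10, 2.85, 2.90],  # hard token: loss stays high
}

# Flag tokens still above an (assumed) loss threshold as under-learned.
LOSS_THRESHOLD = 1.0
hard_tokens = {tok for tok, losses in token_losses.items()
               if mean(losses) > LOSS_THRESHOLD}
```

In a real training loop the per-token losses would come from the model's cross-entropy at each position, aggregated by token id.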
2 code implementations • 25 Feb 2019 • Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke
Specifically, we first analyze the influence of the commonly used Cross-Entropy (CE) loss function, and find that the CE loss function prefers high-frequency tokens, which results in low-diversity responses.
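The frequency bias described here can be illustrated with a context-free (unigram) model: the CE-optimal distribution is simply the empirical token frequency, so frequent generic tokens absorb most of the probability mass. The toy corpus below is invented for illustration and is not from the paper.

```python
from collections import Counter
import math

# A tiny invented corpus dominated by a few high-frequency tokens.
corpus = ["i", "do", "not", "know", "i", "do", "not", "i", "do", "i"]
counts = Counter(corpus)
total = len(corpus)

# For a unigram model, the CE-minimizing probabilities are the
# empirical frequencies, so "i" and "do" get the most mass.
p_opt = {tok: c / total for tok, c in counts.items()}

# Average CE under this optimum equals the empirical entropy of the corpus.
ce = -sum((counts[t] / total) * math.log(p_opt[t]) for t in counts)
```

This is why decoding from a CE-trained model tends toward high-frequency, low-diversity responses unless the objective or decoding procedure is modified.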
no code implementations • WS 2018 • Shaojie Jiang, Maarten de Rijke
Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation.
no code implementations • CVPR 2016 • Jifeng Ning, Jimei Yang, Shaojie Jiang, Lei Zhang, Ming-Hsuan Yang
Structured support vector machine (SSVM) based methods have demonstrated encouraging performance in recent object tracking benchmarks.