Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities.
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.
Ranked #2 on Speech Emotion Recognition on LSSED
Community Question-Answering websites, such as StackOverflow and Quora, expect users to follow specific guidelines in order to maintain content quality.
Ranked #1 on Question Quality Assessment on CrowdSource QA
We propose an approximate strategy to efficiently train neural network-based language models over very large vocabularies.
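A minimal sketch of one approximation of this kind, an adaptive softmax that clusters the vocabulary by frequency so that most updates only touch the small head of frequent words; it uses PyTorch's `nn.AdaptiveLogSoftmaxWithLoss`, and the vocabulary size, cutoffs, and hidden dimension below are illustrative assumptions, not values from the excerpt.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not values from the paper).
vocab_size = 250_000   # very large output vocabulary
hidden_dim = 512       # LM hidden state size
batch, seq = 8, 32

# Adaptive softmax groups rare words into tail clusters with reduced
# projection sizes, so most training steps only pay for the frequent-word head.
criterion = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden_dim,
    n_classes=vocab_size,
    cutoffs=[2_000, 20_000, 100_000],  # frequency-based cluster boundaries
    div_value=4.0,                     # shrink factor for tail projections
)

hidden = torch.randn(batch * seq, hidden_dim)           # LM hidden states, flattened
targets = torch.randint(0, vocab_size, (batch * seq,))  # next-token ids

output = criterion(hidden, targets)
loss = output.loss   # mean negative log-likelihood over the batch
loss.backward()
```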
Compared with a SOTA model finetuned on more than 28k data points, DePlot+LLM with just one-shot prompting achieves a 24.0% improvement over the finetuned SOTA on human-written queries from the chart QA task; the one-shot prompting stage is sketched below.
Ranked #1 on Chart Question Answering on ChartQA
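The DePlot+LLM pipeline first translates the chart into a linearized table and then hands that table, together with one worked example, to an LLM. Below is a hedged sketch of that one-shot prompting stage only; the `linearized_table` text, the worked example, and the `query_llm` helper are hypothetical placeholders for whatever plot-to-table output and LLM client are actually used.

```python
# Sketch of the one-shot prompting stage of a DePlot-style pipeline.
# `linearized_table` stands in for the output of a plot-to-table model;
# `query_llm` is a hypothetical stand-in for any LLM completion client.

one_shot_example = (
    "Table:\n"
    "Year | Revenue\n2020 | 10\n2021 | 15\n"
    "Question: By how much did revenue grow from 2020 to 2021?\n"
    "Answer: 15 - 10 = 5\n\n"
)

def build_prompt(linearized_table: str, question: str) -> str:
    """Compose a one-shot prompt: worked example, then the real table and question."""
    return (
        one_shot_example
        + "Table:\n" + linearized_table + "\n"
        + "Question: " + question + "\n"
        + "Answer:"
    )

def answer_chart_question(linearized_table: str, question: str, query_llm) -> str:
    # The LLM performs the numerical/compositional reasoning over the table text.
    return query_llm(build_prompt(linearized_table, question))
```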
Transformer architectures have facilitated building higher-capacity models, and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks.
Our proposed framework allows (i) verifying whether automatic metrics are faithful to human preferences, regardless of their correlation with human judgments, and (ii) inspecting the strengths and limitations of NLG systems via pairwise evaluation.
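One way to read point (ii): systems are compared pairwise on each example, and a metric is "faithful" to the extent that its pairwise preferences agree with human preferences. A minimal sketch of that agreement computation follows; the data layout and function name are assumptions for illustration, not the framework's actual API.

```python
from typing import Sequence

def pairwise_agreement(metric_scores_a: Sequence[float],
                       metric_scores_b: Sequence[float],
                       human_prefs: Sequence[str]) -> float:
    """Fraction of example pairs where the metric's preferred system
    (A vs. B) matches the human-preferred system ('A' or 'B')."""
    agree = 0
    for score_a, score_b, human in zip(metric_scores_a, metric_scores_b, human_prefs):
        metric_pick = "A" if score_a > score_b else "B"
        agree += metric_pick == human
    return agree / len(human_prefs)

# Toy usage: a metric can correlate with humans overall yet still disagree
# on which of two systems wins on individual examples.
print(pairwise_agreement([0.8, 0.4, 0.9], [0.7, 0.6, 0.5], ["A", "A", "B"]))  # 0.33...
```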
On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.
Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing.
Ranked #1 on Split and Rephrase on WikiSplit
Here we present a general method, which we call information gain filtration, for improving the overall training efficiency and final performance of language model fine-tuning.
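Taking the name at face value, the idea is to score candidate fine-tuning segments by how much they are expected to improve the model and to skip low-value ones. A rough sketch under that assumption follows; the `estimate_information_gain` scorer and the threshold are hypothetical placeholders, not the paper's actual estimator or settings.

```python
# Rough sketch of filtering fine-tuning batches by an information-gain score.
# `estimate_information_gain` is a hypothetical scorer (e.g. a cheap proxy that
# predicts how much a segment would reduce the LM's loss); the threshold is an
# illustrative assumption.

def filtered_finetune(model, optimizer, loss_fn, batches,
                      estimate_information_gain, threshold=0.1):
    """Fine-tune only on batches whose estimated information gain clears the bar."""
    for batch in batches:
        if estimate_information_gain(batch) < threshold:
            continue  # skip segments expected to contribute little
        optimizer.zero_grad()
        loss = loss_fn(model, batch)
        loss.backward()
        optimizer.step()
```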