Since the advent of Federated Learning (FL), researchers have applied FL methods to natural language processing (NLP) tasks.
We find a simple heuristic for choosing between these two techniques: pairwise MTL outperforms STILTs when the target task has fewer training instances than the supporting task, and vice versa.
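To make the heuristic concrete, here is a minimal sketch in Python; the function and its argument names are illustrative and not part of the original work, which only states the size-based rule.

```python
# A minimal sketch of the size-based selection heuristic described above.
# The function name and example sizes are hypothetical, for illustration only.

def choose_transfer_strategy(num_target_instances: int,
                             num_supporting_instances: int) -> str:
    """Pick a transfer-learning setup from relative dataset sizes.

    Heuristic: pairwise multi-task learning (MTL) tends to win when the
    target task has fewer instances than the supporting task; intermediate
    fine-tuning (STILTs) tends to win otherwise.
    """
    if num_target_instances < num_supporting_instances:
        return "pairwise-MTL"
    return "STILTs"

# Example: a 5k-instance target task paired with a 100k-instance
# supporting task would favor pairwise MTL.
print(choose_transfer_strategy(5_000, 100_000))   # -> pairwise-MTL
print(choose_transfer_strategy(100_000, 5_000))   # -> STILTs
```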
Code switching (CS) refers to the phenomenon of alternating between words and phrases from different languages within a single conversation or utterance; for example, a bilingual speaker might switch between Spanish and English mid-sentence.
Text classification is a significant branch of natural language processing, with many applications including document classification and sentiment analysis.
However, all previous work has examined this problem only from the consecutive perspective, leaving it uncertain whether these approaches remain effective in the more challenging streaming setting.
Understanding and identifying humor has become increasingly popular, as evidenced by the number of datasets created to study it.
Predicting reading time has been the subject of much previous work, which uses reading time as a measure of how different words affect human processing.
These experiments show that this method outperforms all previous work on these tasks, achieving an F-measure of 93.1% on the Puns dataset and 98.6% on the Short Jokes dataset.
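For reference, the F-measure here is presumably the standard balanced F1 score (the harmonic mean of precision and recall); the original text does not define it. A minimal sketch under that assumption, with purely illustrative precision and recall values:

```python
# A minimal sketch of the standard (balanced) F-measure assumed above;
# the precision/recall values below are illustrative, not reported results.

def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta=1)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# E.g., precision 0.93 and recall 0.93 yield an F1 of 0.93 (93%).
print(round(f_measure(0.93, 0.93), 3))  # -> 0.93
```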