Recent pretrained language models range from millions to billions of parameters.
Vision-language pre-training (VLP) on large-scale image-text pairs has recently witnessed rapid progress for learning cross-modal representations.
Existing work in multilingual pretraining has demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages.
Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing.
Conventional Knowledge Graph Completion (KGC) assumes that all test entities appear during training.
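This closed-world assumption can be stated concretely: every entity appearing in a test triple must also appear in some training triple. A minimal sketch of that check, with illustrative function names and toy triples (none taken from an actual KGC benchmark):

```python
def entities(triples):
    """Collect head and tail entities from (head, relation, tail) triples."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}

def satisfies_closed_world(train_triples, test_triples):
    """True iff every test entity was seen during training."""
    return entities(test_triples) <= entities(train_triples)

# Toy data for illustration only.
train = [("paris", "capital_of", "france"), ("berlin", "capital_of", "germany")]
test_seen = [("france", "contains", "paris")]       # entities all seen in training
test_unseen = [("rome", "capital_of", "italy")]     # introduces unseen entities
```

Inductive KGC settings drop this assumption, which is exactly when `satisfies_closed_world` would return `False`.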
It consists of a generator to produce pun sentences, and a discriminator to distinguish between the generated pun sentences and the real sentences with specific word senses.
Therefore, we propose a generic and novel framework which consists of a sentiment analyzer and a sentimental generator, respectively addressing the two challenges.
To relieve these problems, we first propose a force attention (FA) method that encourages the generator to pay more attention to uncovered attributes, avoiding the omission of key attributes.
In this way, we can reduce the dependence of the model on the label order, as well as capture high-order correlations between labels.
The task of unsupervised bilingual lexicon induction (UBLI) aims to induce word translations from monolingual corpora in two languages.
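Once the two monolingual embedding spaces have been mapped into a shared space, the induction step itself is typically a nearest-neighbor lookup under cosine similarity. A minimal sketch under that assumption (the toy words and vectors are invented for illustration):

```python
import numpy as np

def induce_lexicon(src_vecs, tgt_vecs, src_words, tgt_words):
    """Translate each source word to its nearest target word by cosine similarity."""
    # Row-normalize so plain dot products are cosine similarities.
    s = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    t = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = s @ t.T  # (n_src, n_tgt) similarity matrix
    return {src_words[i]: tgt_words[j] for i, j in enumerate(sims.argmax(axis=1))}

# Toy aligned spaces for illustration only.
src_words = ["cat", "dog"]
tgt_words = ["chat", "chien"]
src_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt_vecs = np.array([[0.9, 0.1], [0.1, 0.9]])
```

Real systems refine this lookup (e.g. with hubness-aware criteria such as CSLS), but nearest neighbor is the core retrieval step.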
Automatic commenting of online articles can provide additional opinions and facts to the reader, which improves user experience and engagement on social media platforms.
Experiments show that with external commonsense knowledge and adversarial training, the generated essays are more novel, diverse, and topic-consistent than those of existing methods under both automatic and human evaluation.
Unsupervised text style transfer aims to alter text styles while preserving the content, without aligned data for supervision.
Therefore, in this paper, we propose a dual reinforcement learning framework to directly transfer the style of the text via a one-step mapping model, without any separation of content and style.
The visual storytelling (VST) task aims at generating a reasonable and coherent paragraph-level story with the image stream as input.
In order to avoid such sophisticated alternate optimization, we propose to learn unsupervised word mapping by directly maximizing the mean discrepancy between the distributions of the transferred embeddings and the target embeddings.
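The objective above is built on the maximum mean discrepancy (MMD) statistic between two samples of embeddings. A minimal numpy sketch of the (biased) squared-MMD estimator with a Gaussian kernel, using made-up sample data; this shows only the statistic, not the full mapping-learning procedure:

```python
import numpy as np

def mmd2(x, y, sigma=1.0):
    """Biased estimator of squared MMD between samples x and y (Gaussian kernel)."""
    def k(a, b):
        # Pairwise squared Euclidean distances, then the RBF kernel.
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Toy samples for illustration only.
rng = np.random.default_rng(0)
near = rng.normal(0.0, 1.0, (50, 2))   # samples from one distribution
far = rng.normal(10.0, 1.0, (50, 2))   # samples from a distant distribution
```

Identical samples give a statistic of zero, while well-separated distributions give a clearly positive value, which is what makes MMD usable as a differentiable training signal.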
The goal of Word Sense Disambiguation (WSD) is to identify the correct meaning of a word in a particular context.
GAS models the semantic relationship between the context and the gloss in an improved memory network framework, which breaks the barriers of the previous supervised methods and knowledge-based methods.
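The knowledge-based side of this comparison is typified by Lesk-style methods, which also score glosses against the context but by simple word overlap rather than a learned memory network. A minimal sketch of that baseline (the sense inventory below is invented for illustration, and this is not the GAS model):

```python
def lesk(context_words, sense_glosses):
    """Pick the sense whose gloss shares the most words with the context."""
    ctx = set(context_words)
    return max(sense_glosses, key=lambda s: len(ctx & set(sense_glosses[s].split())))

# Toy sense inventory for illustration only.
glosses = {
    "bank#1": "a financial institution that accepts deposits",
    "bank#2": "sloping land beside a body of water",
}
```

Neural gloss models like GAS replace this hard lexical overlap with learned semantic matching between context and gloss representations.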