In this paper, we investigate the unified ABSA task from the perspective of Machine Reading Comprehension (MRC) by observing that the aspect and the opinion terms can serve as the query and answer in MRC interchangeably.
In practice, a set of formulaic alphas is often used together for better modeling precision, so we need to find synergistic formulaic alpha sets that work well together.
To represent facts that hold at a specific time, temporal knowledge graph (TKG) embedding approaches have been proposed.
Ranked #1 on Link Prediction on GDELT
However, both strategies face some immediate problems: raw features cannot represent various properties of nodes (e.g., structure information), and representations learned by a supervised GNN may suffer from the poor performance of the classifier on the poisoned graph.
In this work, we define the selective fairness task, where users can flexibly choose the sensitive attributes with respect to which the recommendation model should be bias-free.
In this work, we highlight that the user's historical dialogue sessions and look-alike users are essential sources of user preferences besides the current dialogue session in CRS.
Ranked #3 on Recommendation Systems on ReDial
We argue that MBR models should: (1) model the coarse-grained commonalities between different behaviors of a user, (2) consider both individual sequence view and global graph view in multi-behavior modeling, and (3) capture the fine-grained differences between multiple behaviors of a user.
The proposed model encodes the textual information in queries, documents and the KG with multilingual BERT, and incorporates the KG information in the query-document matching process with a hierarchical information fusion mechanism.
Constrained Reinforcement Learning (CRL), which pursues the dual goals of maximizing long-term returns and constraining costs, has attracted broad interest in recent years.
A long-standing issue with paraphrase generation is how to obtain reliable supervision signals.
The proposed pruning strategy offers merits over weight-based pruning techniques: (1) it avoids irregular memory access since representations and matrices can be squeezed into their smaller but dense counterparts, leading to greater speedup; (2) by pruning top-down, the proposed method operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of global signals through layers, leading to better performance at the same sparsity level.
The delayed feedback problem, one of the pressing challenges in online advertising, is caused by the highly varied feedback delay of a conversion, which ranges from a few minutes to several days.
To build a benchmark for this problem, we release a large-scale dataset named PENS (PErsonalized News headlineS).
The proposed method is efficient as it can make decisions on-the-fly by utilizing only one randomly chosen model, but is also effective as we show that it can be viewed as a non-Bayesian approximation of Thompson sampling.
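The decision rule described above can be sketched as follows. This is only an illustration under stated assumptions: the candidate models are assumed to form a small ensemble, each model is assumed to be a callable mapping a (context, action) pair to an estimated reward, and `choose_action`, `optimist`, and `pessimist` are hypothetical names that do not come from the paper.

```python
import random

def choose_action(models, context, actions, rng=random):
    """Pick one ensemble member uniformly at random and act greedily on it."""
    model = rng.choice(models)  # only a single model is consulted per decision
    return max(actions, key=lambda a: model(context, a))

# Hypothetical ensemble members with fixed reward estimates (assumption).
def optimist(context, action):
    return {"left": 0.9, "right": 0.1}[action]

def pessimist(context, action):
    return {"left": 0.2, "right": 0.6}[action]

rng = random.Random(0)  # seeded only to make the sketch repeatable
picks = [
    choose_action([optimist, pessimist], None, ["left", "right"], rng)
    for _ in range(100)
]
```

Because the model is redrawn at every step, the chosen action varies with the disagreement among ensemble members, which loosely mirrors how Thompson sampling acts on a single posterior draw.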
Recent pretraining models in Chinese neglect two important aspects specific to the Chinese language: glyph and pinyin, which carry significant syntactic and semantic information for language understanding.
In this paper, we propose a novel method for lifelong sentiment classification, iterative network pruning with uncertainty regularization (IPRLS), which leverages the principles of network pruning and weight regularization.
The frustratingly fragile nature of neural network models makes current natural language generation (NLG) systems prone to backdoor attacks, which can cause them to generate malicious sequences that could be sexist or offensive.
Finally, we integrate the imaginary concepts and relational knowledge to generate a human-like story based on the original semantics of the images.
Ranked #2 on Visual Storytelling on VIST
The proposed framework is based on the core idea that the meaning of a sentence should be defined by its contexts, and that sentence similarity can be measured by comparing the probabilities of generating two sentences given the same context.
Graph-based fraud detection approaches have attracted much attention recently due to the abundant relational information in graph-structured data, which may be beneficial for the detection of fraudsters.
Ranked #3 on Node Classification on Amazon-Fraud
Multi-source neural machine translation aims to translate from parallel sources of information (e.g., languages, images, etc.)
In this paper, we propose an Interactive key-value Memory-augmented Attention model for image Paragraph captioning (IMAP) to keep track of the attention history (salient objects coverage information) along with the update-chain of the decoder state and therefore avoid generating repetitive or incomplete image descriptions.
Named entity recognition (NER) is highly sensitive to sentential syntactic and semantic properties where entities may be extracted according to how they are used and placed in the running text.
Ranked #3 on Named Entity Recognition (NER) on WNUT 2016
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
Chinese word segmentation (CWS) and part-of-speech (POS) tagging are important fundamental tasks for Chinese language processing, and learning them jointly is an effective one-step solution for both tasks.
In this paper, we aim to improve ATSA by discovering the potential aspect terms of the predicted sentiment polarity when the aspect terms of a test sentence are unknown.
Aspect-based sentiment analysis (ABSA) has attracted increasing attention recently due to its broad applications.
Ranked #3 on Aspect-Based Sentiment Analysis (ABSA) on MAMS
In this work, we re-examine the problem of extractive text summarization for long documents.
Ranked #8 on Extractive Text Summarization on CNN / Daily Mail
Aspect extraction relies on identifying aspects by discovering coherence among words, which is challenging when word meanings are diverse and the texts being processed are short.
It is often observed that the probabilistic predictions given by a machine learning model can disagree with averaged actual outcomes on specific subsets of data, which is also known as the issue of miscalibration.
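To make the notion concrete, the sketch below measures, for each subset, the gap between the mean predicted probability and the mean observed outcome. It is a toy illustration rather than any paper's method; the function name `subset_calibration_gaps` and the data are hypothetical.

```python
from collections import defaultdict

def subset_calibration_gaps(preds, outcomes, groups):
    """Return {group: mean(prediction) - mean(outcome)} for each subset."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # [sum of preds, sum of outcomes, count]
    for p, y, g in zip(preds, outcomes, groups):
        acc = sums[g]
        acc[0] += p
        acc[1] += y
        acc[2] += 1
    return {g: (sp - sy) / n for g, (sp, sy, n) in sums.items()}

# Hypothetical data: binary outcomes and predicted probabilities for two subsets.
preds = [0.2, 0.2, 0.8, 0.8]
outcomes = [0, 0, 1, 1]
groups = ["a", "a", "b", "b"]

gaps = subset_calibration_gaps(preds, outcomes, groups)
overall_gap = sum(preds) / len(preds) - sum(outcomes) / len(outcomes)
```

In this toy data the overall gap is zero up to rounding, yet subset "a" is over-predicted by about 0.2 and subset "b" is under-predicted by about 0.2, which is the kind of subset-level disagreement the sentence above describes.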
We propose Meta-Embedding, a meta-learning-based approach that learns to generate desirable initial embeddings for new ad IDs.
Additionally, a "low-level sharing, high-level splitting" CNN structure is designed to handle documents from different content domains.
An effective technique for filtering free-rider episodes is to use a partition model that divides an episode into two consecutive subepisodes and compares the observed support of the episode with its expected support under the assumption that the two subepisodes occur independently.
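A minimal sketch of this comparison follows. It assumes, purely for illustration, that support is the number of sequences containing the episode as an ordered subsequence and that independence means multiplying the relative supports of the two subepisodes; the paper may define support differently, and all names here are hypothetical.

```python
def contains(sequence, episode):
    """True if the episode's events occur in order (not necessarily adjacent)."""
    it = iter(sequence)
    return all(event in it for event in episode)  # `in` consumes the iterator

def support(sequences, episode):
    """Number of sequences that contain the episode (assumed definition)."""
    return sum(contains(s, episode) for s in sequences)

def expected_support(sequences, episode, k):
    """Expected support if episode[:k] and episode[k:] occurred independently."""
    n = len(sequences)
    return support(sequences, episode[:k]) * support(sequences, episode[k:]) / n

# Hypothetical event sequences.
sequences = [list("abc"), list("abd"), list("xcy"), list("xyz")]
episode = ("a", "b", "c")

observed = support(sequences, episode)
expected_ab_c = expected_support(sequences, episode, 2)  # split as (a, b) | (c)
expected_a_bc = expected_support(sequences, episode, 1)  # split as (a) | (b, c)
```

An episode whose observed support is close to the expected support at some split point is explained by its two parts alone, so it is a candidate free-rider; one whose observed support clearly exceeds the expectation at every split is more likely to be genuinely informative.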