Metadata attributes (e.g., user and product IDs from reviews) can be incorporated as additional inputs to neural NLP models by expanding the model architecture, improving performance.
Cross-lingual summarization consists of generating a summary in one language given an input document in a different language, allowing for the dissemination of relevant content across speakers of other languages.
While conditional generation models can now produce fluent text, the generation process remains difficult to control, leading to irrelevant, repetitive, and hallucinated content.
Large language models (LLMs) have been shown to perform well in answering questions and in producing long-form texts, both in few-shot closed-book settings.
Specifically, we treat sentences, rather than tokens, as the basic units of matching, and use a sentence matching function to soft-match candidate and reference sentences.
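The sentence-level soft matching described above can be sketched as follows; this is a minimal illustration assuming a toy bag-of-words embedding and cosine similarity, whereas the actual system uses a learned sentence matching function.

```python
from collections import Counter
import math

def embed(sentence):
    # Toy bag-of-words vector; a stand-in for a learned sentence encoder.
    return Counter(sentence.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def soft_match_score(candidate_sents, reference_sents):
    # For each candidate sentence, take its best soft match among the
    # reference sentences, then average: a sentence-level analogue of
    # token-overlap metrics.
    scores = [max(cosine(embed(c), embed(r)) for r in reference_sents)
              for c in candidate_sents]
    return sum(scores) / len(scores)
```

A candidate sentence identical to a reference sentence scores 1.0, while a paraphrase receives partial credit rather than the zero a hard match would give.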
The ability to convey relevant and faithful information is critical for many tasks in conditional generation, and yet remains elusive for neural sequence-to-sequence models, whose outputs often contain hallucinations and fail to correctly cover important details.
Towards this goal, we propose to mitigate the loss of knowledge caused by interference among different knowledge sources by developing a modular variant of knowledge aggregation as a new zero-shot commonsense reasoning framework.
In this tutorial, we present various aspects of opinion summarization that are useful for researchers and practitioners.
The recent success of deep learning techniques for abstractive summarization is predicated on the availability of large-scale datasets.
We present the Quantized Transformer (QT), an unsupervised system for extractive opinion summarization.
We thus propose to additionally leverage references, which are selected from a large pool of texts labeled with one of the attributes, as textual information that enriches inductive biases of given attributes.
Here, we propose a novel fully unsupervised parsing approach that extracts constituency trees from PLM attention heads.
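One common way to extract a constituency tree without supervision is to derive a "syntactic distance" for each gap between adjacent words and split spans at the largest gap; the sketch below illustrates that recursion with hand-made distance values, whereas the proposed approach derives such signals from PLM attention heads.

```python
def build_tree(words, distances):
    # distances[i] scores the gap between words[i] and words[i+1];
    # recursively split the span at the largest gap to form a binary tree.
    if len(words) == 1:
        return words[0]
    k = max(range(len(distances)), key=lambda i: distances[i])
    left = build_tree(words[:k + 1], distances[:k])
    right = build_tree(words[k + 1:], distances[k + 1:])
    return (left, right)
```

With a large distance between "cat" and "sat", the noun phrase "the cat" is grouped before attaching the verb.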
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof which we treat as pseudo-review input.
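The noising step can be sketched as below; the function name and the token-dropout noise are illustrative assumptions, as the actual system may use other corruption operations to turn the sampled pseudo-summary into pseudo-review inputs.

```python
import random

def make_noisy_versions(review, n=3, drop_prob=0.3, seed=0):
    # Sample n noisy copies of a review by randomly dropping tokens;
    # the clean review then serves as the pseudo-summary target and the
    # noisy copies as the pseudo-review inputs.
    rng = random.Random(seed)
    tokens = review.split()
    noisy = []
    for _ in range(n):
        kept = [t for t in tokens if rng.random() > drop_prob] or tokens[:1]
        noisy.append(" ".join(kept))
    return noisy
```

Training a model to reconstruct the clean review from the noisy versions mimics the summarization setting without gold summaries.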
We find that novelty is not a singular concept and thus inherently lacks ground-truth annotations with cross-annotator agreement, which is a major obstacle in evaluating these models.
We propose a state-of-the-art CLT model called Length Transfer Networks (LeTraNets) that introduces a two-way encoding scheme for short and long texts using multiple training mechanisms.
In this paper, we show that the above method is the least effective way to represent and inject attributes.
Ranked #2 on Sentiment Analysis on User and product information
This paper describes our system, Joint Encoders for Stable Suggestion Inference (JESSI), for the SemEval 2019 Task 9: Suggestion Mining from Online Reviews and Forums.
The performance of text classification has improved tremendously using intelligently engineered neural-based models, especially those injecting categorical metadata as additional information, e.g., using user/product information for sentiment classification.
Ranked #4 on Sentiment Analysis on User and product information (Yelp 2013 (Acc) metric)
Thus, we aim to eliminate these requirements and solve the sense granularity problem by proposing AutoSense, a latent variable model based on two observations: (1) senses are represented as a distribution over topics, and (2) senses generate pairings between the target word and its neighboring word.
Ranked #2 on Word Sense Induction on SemEval 2010 WSI
The same question has not been asked in the table question answering (TableQA) task, where we are tasked to answer a query given a table.
The use of user/product information in sentiment analysis is important, especially for cold-start users/products, who have very few reviews.
Ranked #4 on Sentiment Analysis on User and product information
We are the first to use translations as domain-free contexts for sentence classification.
Ranked #6 on Text Classification on TREC-6
To this end, we leverage an off-the-shelf entity linking system (ELS) to extract linked entities, and propose Entity2Topic (E2T), a module easily attachable to a sequence-to-sequence model that transforms a list of entities into a vector representation of the topic of the summary.
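The core idea of turning linked entities into a single topic vector can be sketched as simple mean pooling over entity embeddings; this is an illustrative simplification (the function name and dictionary-based lookup are assumptions), as E2T itself learns how to weight and combine the entities.

```python
def entities_to_topic(entities, entity_vecs):
    # Pool the embeddings of the linked entities into one topic vector.
    # Mean pooling stands in for the learned combination used by E2T.
    dim = len(next(iter(entity_vecs.values())))
    vecs = [entity_vecs[e] for e in entities if e in entity_vecs]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
```

The resulting vector can then be fed to the sequence-to-sequence decoder as an extra conditioning signal on the summary's topic.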
Ranked #22 on Text Summarization on GigaWord
This paper proposes an aspect sentiment model for aspect-based sentiment analysis (ABSA) focused on micro reviews.
The results show that the co-occurrence and citation networks constructed using the proposed method outperform networks constructed using traditional approaches.