Opinion summarization focuses on generating summaries that reflect popular subjective information expressed in multiple online reviews.
Inferring meta information about tables, such as column headers or relationships between columns, is an active research topic in data management as we find many tables are missing some of this information.
Ranked #1 on Column Type Annotation on VizNet-Sato-MultiColumn
We found that text autoencoders tend to generate overly generic summaries from simply averaged latent vectors due to an unexpected $L_2$-norm shrinkage in the aggregated latent vectors, which we refer to as summary vector degeneration.
Ranked #1 on Unsupervised Opinion Summarization on Amazon
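The $L_2$-norm shrinkage behind summary vector degeneration can be illustrated with a toy example. The sketch below is not the paper's model: it assumes random unit vectors as stand-ins for review latent vectors, and the helper names (`l2_norm`, `mean_vector`, `random_unit_vector`) are hypothetical. It only shows that a simple average of vectors pointing in different directions has a smaller norm than the vectors being averaged.

```python
import math
import random


def l2_norm(v):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def random_unit_vector(dim, rng):
    """Random direction on the unit sphere (stand-in for a latent vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = l2_norm(v)
    return [x / norm for x in v]


# Assumption: 8 "reviews", each encoded as a 64-dimensional unit vector.
rng = random.Random(0)
latents = [random_unit_vector(64, rng) for _ in range(8)]

avg = mean_vector(latents)
mean_input_norm = sum(l2_norm(v) for v in latents) / len(latents)

# The averaged vector is shorter than the inputs unless all inputs
# point in exactly the same direction (triangle inequality).
print("mean norm of inputs:", mean_input_norm)
print("norm of averaged vector:", l2_norm(avg))
```

In this toy setting the inputs all have norm 1, while the averaged vector is noticeably shorter. Whether and how a decoder maps such shortened vectors to generic text is the paper's empirical finding and is not modeled here.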
We present the Quantized Transformer (QT), an unsupervised system for extractive opinion summarization.
The framework uses an Aspect-based Sentiment Analysis model to extract opinion phrases from reviews, and trains a Transformer model to reconstruct the original reviews from these extractions.
In this paper, we introduce xSense, an effective system for review comprehension using domain-specific commonsense knowledge bases (xSense KBs).
Our experiments show that a straightforward application of language models such as BERT, DistilBERT, or RoBERTa pre-trained on large text corpora already significantly improves the matching quality and outperforms the previous state of the art (SOTA) by up to 29% in F1 score on benchmark datasets.
Ranked #2 on Entity Resolution on WDC Watches-xlarge
We also analyzed how the expert and non-expert machine algorithms differ in their neural representations in order to evaluate their performance, providing insight into the cognitive abilities of human experts and non-experts.
Detecting the semantic types of data columns in relational tables is important for various data preparation and information retrieval tasks such as data cleaning, schema matching, data discovery, and semantic search.
Ranked #2 on Column Type Annotation on VizNet-Sato-MultiColumn
We present Emu, a system that semantically enhances multilingual sentence embeddings.
We prototype one necessary component of such a system, the Happiness Entailment Recognition (HER) module, which takes as input a short text describing an event and a candidate suggestion, and outputs a determination of whether the suggestion is more likely to be good for this user based on the event described.
The science of happiness is an area of positive psychology concerned with understanding what behaviors make people happy in a sustainable fashion.
Entity population, the task of collecting entities that belong to a particular category, has attracted attention from vertical domains.