Search Results for author: Wang-Chiew Tan

Found 24 papers, 11 papers with code

Unstructured and structured data: Can we have the best of both worlds with large language models?

no code implementations • 25 Apr 2023 • Wang-Chiew Tan

This paper presents an opinion on the potential of using large language models to query on both unstructured and structured data.

Question Answering

Annotating Columns with Pre-trained Language Models

1 code implementation • 5 Apr 2021 • Yoshihiko Suhara, Jinfeng Li, Yuliang Li, Dan Zhang, Çağatay Demiralp, Chen Chen, Wang-Chiew Tan

Inferring meta information about tables, such as column headers or relationships between columns, is an active research topic in data management as we find many tables are missing some of this information.

Columns Property Annotation, Column Type Annotation, +3

Convex Aggregation for Opinion Summarization

1 code implementation • Findings (EMNLP) 2021 • Hayate Iso, Xiaolan Wang, Yoshihiko Suhara, Stefanos Angelidis, Wang-Chiew Tan

We found that text autoencoders tend to generate overly generic summaries from simply averaged latent vectors due to an unexpected $L_2$-norm shrinkage in the aggregated latent vectors, which we refer to as summary vector degeneration.

Unsupervised Opinion Summarization

Deep or Simple Models for Semantic Tagging? It Depends on your Data [Experiments]

no code implementations • 11 Jul 2020 • Jinfeng Li, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan

We embark on a systematic study to investigate the following question: Are deep models the best performing model for all semantic tagging tasks?


Adaptive Rule Discovery for Labeling Text Data

no code implementations • 13 May 2020 • Sainyam Galhotra, Behzad Golshan, Wang-Chiew Tan

At the same time, creating a labeled subset of the data can be costly and even infeasible in imbalanced settings.

OpinionDigest: A Simple Framework for Opinion Summarization

1 code implementation • ACL 2020 • Yoshihiko Suhara, Xiaolan Wang, Stefanos Angelidis, Wang-Chiew Tan

The framework uses an Aspect-based Sentiment Analysis model to extract opinion phrases from reviews, and trains a Transformer model to reconstruct the original reviews from these extractions.

Aspect-Based Sentiment Analysis (ABSA)

SubjQA: A Dataset for Subjectivity and Review Comprehension

1 code implementation • EMNLP 2020 • Johannes Bjerva, Nikita Bhutani, Behzad Golshan, Wang-Chiew Tan, Isabelle Augenstein

We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance.

Question Answering, Sentiment Analysis, +1

Enhancing Review Comprehension with Domain-Specific Commonsense

no code implementations • 6 Apr 2020 • Aaron Traylor, Chen Chen, Behzad Golshan, Xiaolan Wang, Yuliang Li, Yoshihiko Suhara, Jinfeng Li, Çağatay Demiralp, Wang-Chiew Tan

In this paper, we introduce xSense, an effective system for review comprehension using domain-specific commonsense knowledge bases (xSense KBs).

Aspect Extraction, Knowledge Distillation, +3

Deep Entity Matching with Pre-Trained Language Models

1 code implementation • 1 Apr 2020 • Yuliang Li, Jinfeng Li, Yoshihiko Suhara, AnHai Doan, Wang-Chiew Tan

Our experiments show that a straightforward application of language models such as BERT, DistilBERT, or RoBERTa pre-trained on large text corpora already significantly improves the matching quality and outperforms previous state-of-the-art (SOTA), by up to 29% of F1 score on benchmark datasets.

Data Augmentation, Entity Resolution

Towards Productionizing Subjective Search Systems

no code implementations • 31 Mar 2020 • Aaron Feng, Shuwei Chen, Yuliang Li, Hiroshi Matsuda, Hidekazu Tamaki, Wang-Chiew Tan

Also, we found that the existing search algorithms do not meet the search quality standard required by production systems.

Benchmarking, Language Modelling, +1

Sampo: Unsupervised Knowledge Base Construction for Opinions and Implications

1 code implementation • AKBC 2020 • Nikita Bhutani, Aaron Traylor, Chen Chen, Xiaolan Wang, Behzad Golshan, Wang-Chiew Tan

Since it can be expensive to obtain training data to learn to extract implications for each new domain of reviews, we propose an unsupervised KBC system, Sampo. Specifically, Sampo is tailored to build KBs for domains where many reviews on the same domain are available.

Snippext: Semi-supervised Opinion Mining with Augmented Data

1 code implementation • 7 Feb 2020 • Zhengjie Miao, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan

A novelty of Snippext is its clever use of a two-prong approach to achieve state-of-the-art (SOTA) performance with little labeled training data through: (1) data augmentation to automatically generate more labeled training data from existing ones, and (2) a semi-supervised learning technique to leverage the massive amount of unlabeled data in addition to the (limited amount of) labeled data.

Data Augmentation, Language Modelling, +1

Teddy: A System for Interactive Review Analysis

1 code implementation • 15 Jan 2020 • Xiong Zhang, Jonathan Engel, Sara Evensen, Yuliang Li, Çağatay Demiralp, Wang-Chiew Tan

They contain a wealth of information about the opinions and experiences of users, which can help better understand consumer decisions and improve user experience with products and services.

Sato: Contextual Semantic Type Detection in Tables

1 code implementation • 14 Nov 2019 • Dan Zhang, Yoshihiko Suhara, Jinfeng Li, Madelon Hulsebos, Çağatay Demiralp, Wang-Chiew Tan

Detecting the semantic types of data columns in relational tables is important for various data preparation and information retrieval tasks such as data cleaning, schema matching, data discovery, and semantic search.

Column Type Annotation, Information Retrieval, +3

Happiness Entailment: Automating Suggestions for Well-Being

no code implementations • 23 Jul 2019 • Sara Evensen, Yoshihiko Suhara, Alon Halevy, Vivian Li, Wang-Chiew Tan, Saran Mumick

We prototype one necessary component of such a system, the Happiness Entailment Recognition (HER) module, which takes as input a short text describing an event and a candidate suggestion, and outputs a determination about whether the suggestion is more likely to be good for this user based on the event described.

Subjective Databases

no code implementations • 25 Feb 2019 • Yuliang Li, Aaron Xixuan Feng, Jinfeng Li, Saran Mumick, Alon Halevy, Vivian Li, Wang-Chiew Tan

In order to support experiential queries, a database system needs to model subjective data and also be able to process queries where the user can express varied subjective experiences in words chosen by the user, in addition to specifying predicates involving objective attributes.


FrameIt: Ontology Discovery for Noisy User-Generated Text

no code implementations • WS 2018 • Dan Iter, Alon Halevy, Wang-Chiew Tan

A common need of NLP applications is to extract structured data from text corpora in order to perform analytics or trigger an appropriate action.

Active Learning, Semantic Role Labeling

Scalable Semantic Querying of Text

no code implementations • 3 May 2018 • Xiaolan Wang, Aaron Feng, Behzad Golshan, Alon Halevy, George Mihaila, Hidekazu Oiwa, Wang-Chiew Tan

KOKO is novel in that its extraction language simultaneously supports conditions on the surface of the text and on the structure of the dependency parse tree of sentences, thereby allowing for more refined extractions.

HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments

2 code implementations • LREC 2018 • Akari Asai, Sara Evensen, Behzad Golshan, Alon Halevy, Vivian Li, Andrei Lopatenko, Daniela Stepanov, Yoshihiko Suhara, Wang-Chiew Tan, Yinzhan Xu

The science of happiness is an area of positive psychology concerned with understanding what behaviors make people happy in a sustainable fashion.

Art Analysis
