Search Results for author: Jingbo Shang

Found 61 papers, 37 papers with code

Phrase-aware Unsupervised Constituency Parsing

no code implementations ACL 2022 Xiaotao Gu, Yikang Shen, Jiaming Shen, Jingbo Shang, Jiawei Han

Recent studies have achieved inspiring success in unsupervised grammar induction using masked language modeling (MLM) as the proxy task.

Constituency Parsing Language Modelling +1

Towards Adaptive Residual Network Training: A Neural-ODE Perspective

1 code implementation ICML 2020 Chengyu Dong, Liyuan Liu, Zichao Li, Jingbo Shang

Serving as a crucial factor, the depth of residual networks balances model capacity, performance, and training efficiency.

META: Metadata-Empowered Weak Supervision for Text Classification

1 code implementation EMNLP 2020 Dheeraj Mekala, Xinyang Zhang, Jingbo Shang

Based on seed words, we rank and filter motif instances to distill highly label-indicative ones as "seed motifs", which provide additional weak supervision.

Classification General Classification +2

“Average” Approximates “First Principal Component”? An Empirical Analysis on Representations from Neural Language Models

no code implementations EMNLP 2021 Zihan Wang, Chengyu Dong, Jingbo Shang

In this paper, we present an empirical property of these representations: “average” approximates “first principal component”.

Towards Collaborative Neural-Symbolic Graph Semantic Parsing via Uncertainty

no code implementations Findings (ACL) 2022 Zi Lin, Jeremiah Zhe Liu, Jingbo Shang

Recent work in task-independent graph semantic parsing has shifted from grammar-based symbolic approaches to neural models, showing strong performance on different types of meaning representations.

Semantic Parsing

Federated Learning with Client-Exclusive Classes

no code implementations 1 Jan 2023 Jiayun Zhang, Xiyuan Zhang, Xinyang Zhang, Dezhi Hong, Rajesh K. Gupta, Jingbo Shang

In this paper, we aim to lift such an assumption and focus on a more general yet practical non-IID setting where every client can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), and the clients have a common goal which is to build a global classification model to identify the union of these classes.

Federated Learning

Modeling Label Semantics Improves Activity Recognition

no code implementations 1 Jan 2023 Xiyuan Zhang, Ranak Roy Chowdhury, Dezhi Hong, Rajesh K. Gupta, Jingbo Shang

We find that many activities in the current HAR datasets have shared label names, e.g., "open door" and "open fridge", "walk upstairs" and "walk downstairs".

Human Activity Recognition Time Series +1

MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding

no code implementations 27 Nov 2022 Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu

In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features.

Progressive Sentiment Analysis for Code-Switched Text Data

1 code implementation 25 Oct 2022 Sudhanshu Ranjan, Dheeraj Mekala, Jingbo Shang

Instead of training on the entire code-switched corpus at once, we create buckets based on the fraction of words in the resource-rich language and progressively train from resource-rich language dominated samples to low-resource language dominated samples.

Cross-Lingual Transfer named-entity-recognition +5

Waveformer: Linear-Time Attention with Forward and Backward Wavelet Transform

no code implementations 5 Oct 2022 Yufan Zhuang, Zihan Wang, Fangbo Tao, Jingbo Shang

We propose Waveformer, which learns the attention mechanism in the wavelet coefficient space, requires only linear time complexity, and enjoys universal approximating power.

UCEpic: Unifying Aspect Planning and Lexical Constraints for Explainable Recommendation

no code implementations 28 Sep 2022 Jiacheng Li, Zhankui He, Jingbo Shang, Julian McAuley

In this paper, we propose UCEpic, an explanation generation model that unifies aspect planning and lexical constraints for controllable personalized generation.

Explainable Recommendation Explanation Generation +2

SoTeacher: A Student-oriented Teacher Network Training Framework for Knowledge Distillation

no code implementations 14 Jun 2022 Chengyu Dong, Liyuan Liu, Jingbo Shang

To fill this gap, we propose a novel student-oriented teacher network training framework, SoTeacher, inspired by recent findings that student performance hinges on the teacher's capability to approximate the true label distribution of training samples.

Data Augmentation Knowledge Distillation

LOPS: Learning Order Inspired Pseudo-Label Selection for Weakly Supervised Text Classification

1 code implementation 25 May 2022 Dheeraj Mekala, Chengyu Dong, Jingbo Shang

Weakly supervised text classification methods typically train a deep neural classifier based on pseudo-labels.

Memorization Pseudo Label +2

Leveraging QA Datasets to Improve Generative Data Augmentation

2 code implementations 25 May 2022 Dheeraj Mekala, Tu Vu, Timo Schick, Jingbo Shang

The ability of generative language models (GLMs) to generate text has improved considerably in the last few years, enabling their use for generative data augmentation.

Common Sense Reasoning Data Augmentation +3

Fine-grained Contrastive Learning for Relation Extraction

no code implementations 25 May 2022 William Hogan, Jiacheng Li, Jingbo Shang

Recent relation extraction (RE) works have shown encouraging improvements by conducting contrastive learning on silver labels generated by distant supervision before fine-tuning on gold labels.

Contrastive Learning Denoising +2

WeDef: Weakly Supervised Backdoor Defense for Text Classification

no code implementations 24 May 2022 Lesheng Jin, Zihan Wang, Jingbo Shang

Inspired by this observation, in WeDef, we define the reliability of samples based on whether the predictions of the weak classifier agree with their labels in the poisoned training set.

Classification text-classification +1

Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition

1 code implementation 24 May 2022 Zihan Wang, Kewen Zhao, Zilong Wang, Jingbo Shang

Fine-tuning pre-trained language models has recently become a common practice in building NLP models for various tasks, especially few-shot tasks.

Few-shot NER Language Modelling +1

OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision

1 code implementation 29 Apr 2022 Xinyang Zhang, Chenwei Zhang, Xian Li, Xin Luna Dong, Jingbo Shang, Christos Faloutsos, Jiawei Han

Most prior works on this matter mine new values for a set of known attributes but cannot handle new attributes that arose from constantly changing data.

Language Modelling

Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework

1 code implementation Findings (ACL) 2022 Zilong Wang, Jingbo Shang

To overcome the data limitation, we propose to leverage the label surface names to better inform the model of the target entity type semantics and also embed the labels into the spatial embedding space to capture the spatial correspondence between regions and labels.

Perturbation Deterioration: The Other Side of Catastrophic Overfitting

no code implementations 29 Sep 2021 Zichao Li, Liyuan Liu, Chengyu Dong, Jingbo Shang

While this phenomenon is commonly explained as overfitting, we observe that it is a twin process: not only does the model catastrophically overfit to one type of perturbation, but the perturbation also deteriorates into random noise.

Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data

no code implementations EMNLP 2021 Dheeraj Mekala, Varun Gangal, Jingbo Shang

Existing text classification methods mainly focus on a fixed label set, whereas many real-world applications require extending to new fine-grained classes as the number of samples per label increases.

Classification text-classification +2

BFClass: A Backdoor-free Text Classification Framework

no code implementations Findings (EMNLP) 2021 Zichao Li, Dheeraj Mekala, Chengyu Dong, Jingbo Shang

To recognize the poisoned subset, we examine the training samples with these identified triggers as the most suspicious token, and check if removing the trigger will change the poisoned model's prediction.

Association Backdoor Attack +4

UCPhrase: Unsupervised Context-aware Quality Phrase Tagging

2 code implementations 28 May 2021 Xiaotao Gu, Zihan Wang, Zhenyu Bi, Yu Meng, Liyuan Liu, Jiawei Han, Jingbo Shang

Training a conventional neural tagger based on silver labels usually faces the risk of overfitting phrase surface names.

Keyphrase Extraction Language Modelling +2

"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models

1 code implementation 18 Apr 2021 Zihan Wang, Chengyu Dong, Jingbo Shang

In this paper, we present an empirical property of these representations: "average" approximates "first principal component".

News Meets Microblog: Hashtag Annotation via Retriever-Generator

1 code implementation 18 Apr 2021 Xiuwen Zheng, Dheeraj Mekala, Amarnath Gupta, Jingbo Shang

Hashtag annotation for microblog posts has been recently formulated as a sequence generation problem to handle emerging hashtags that are unseen in the training set.

Unsupervised Deep Keyphrase Generation

1 code implementation 18 Apr 2021 Xianjie Shen, Yinghan Wang, Rui Meng, Jingbo Shang

Keyphrase generation aims to summarize long documents with a collection of salient phrases.

Keyphrase Generation

Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks

no code implementations 23 Feb 2021 Xinyang Zhang, Chenwei Zhang, Luna Xin Dong, Jingbo Shang, Jiawei Han

Specifically, we jointly train two modules with different inductive biases: a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.

Product Categorization Text Categorization

Data Quality Matters For Adversarial Training: An Empirical Study

1 code implementation 15 Feb 2021 Chengyu Dong, Liyuan Liu, Jingbo Shang

Specifically, we first propose a strategy to measure the data quality based on the learning behaviors of the data during adversarial training and find that low-quality data may not be useful and even detrimental to the adversarial robustness.

Adversarial Robustness

Sensei: Self-Supervised Sensor Name Segmentation

1 code implementation Findings (ACL) 2021 Jiaman Wu, Dezhi Hong, Rajesh Gupta, Jingbo Shang

A sensor name, typically an alphanumeric string, encodes the key context (e.g., function and location) of a sensor needed for deploying smart building applications.

Language Modelling

SeNsER: Learning Cross-Building Sensor Metadata Tagger

1 code implementation Findings of the Association for Computational Linguistics 2020 Yang Jiao, Jiacheng Li, Jiaman Wu, Dezhi Hong, Rajesh Gupta, Jingbo Shang

Sensor metadata tagging, akin to the named entity recognition task, provides key contextual information (e.g., measurement type and location) about sensors for running smart building applications.

named-entity-recognition Named Entity Recognition

Overfitting or Underfitting? Understand Robustness Drop in Adversarial Training

2 code implementations 15 Oct 2020 Zichao Li, Liyuan Liu, Chengyu Dong, Jingbo Shang

Our goal is to understand why the robustness drops after conducting adversarial training for too long.

SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery

no code implementations EMNLP 2020 Jiaming Shen, Wenda Qiu, Jingbo Shang, Michelle Vanni, Xiang Ren, Jiawei Han

To facilitate the research on studying the interplays of these two tasks, we create the first large-scale Synonym-Enhanced Set Expansion (SE2) dataset via crowdsourcing.

Contextualized Weak Supervision for Text Classification

1 code implementation ACL 2020 Dheeraj Mekala, Jingbo Shang

Weakly supervised text classification based on a few user-provided seed words has recently attracted much attention from researchers.

Classification General Classification +2

User-Guided Aspect Classification for Domain-Specific Texts

1 code implementation 30 Apr 2020 Peiran Li, Fang Guo, Jingbo Shang

Aspect classification, identifying aspects of text segments, facilitates numerous applications, such as sentiment analysis and review summarization.

Classification General Classification +3

Empower Entity Set Expansion via Language Model Probing

1 code implementation ACL 2020 Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han

Existing set expansion methods bootstrap the seed entity set by adaptively selecting context features and extracting new entities.

Language Modelling Question Answering

SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble

1 code implementation 17 Oct 2019 Jiaming Shen, Zeqiu Wu, Dongming Lei, Jingbo Shang, Xiang Ren, Jiawei Han

In this study, we propose a novel framework, SetExpan, which tackles this problem, with two techniques: (1) a context feature selection method that selects clean context features for calculating entity-entity distributional similarity, and (2) a ranking-based unsupervised ensemble method for expanding entity set based on denoised context features.

Question Answering

FUSE: Multi-Faceted Set Expansion by Coherent Clustering of Skip-grams

1 code implementation 10 Oct 2019 Wanzheng Zhu, Hongyu Gong, Jiaming Shen, Chao Zhang, Jingbo Shang, Suma Bhat, Jiawei Han

In this paper, we study the task of multi-faceted set expansion, which aims to capture all semantic facets in the seed set and return multiple sets of entities, one for each semantic facet.

Language Modelling

Raw-to-End Name Entity Recognition in Social Media

1 code implementation 14 Aug 2019 Liyuan Liu, Zihan Wang, Jingbo Shang, Dandong Yin, Heng Ji, Xiang Ren, Shaowen Wang, Jiawei Han

Our model neither requires the conversion from character sequences to word sequences, nor assumes that the tokenizer can correctly detect all word boundaries.

named-entity-recognition NER

Arabic Named Entity Recognition: What Works and What's Next

no code implementations WS 2019 Liyuan Liu, Jingbo Shang, Jiawei Han

This paper presents the winning solution to the Arabic Named Entity Recognition challenge run by Topcoder.com.

Ensemble Learning Feature Engineering +3

Learning Named Entity Tagger using Domain-Specific Dictionary

1 code implementation EMNLP 2018 Jingbo Shang, Liyuan Liu, Xiang Ren, Xiaotao Gu, Teng Ren, Jiawei Han

Recent advances in deep neural models allow us to build reliable named entity recognition (NER) systems without handcrafting features.

named-entity-recognition NER

Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach

1 code implementation 29 Apr 2018 Jiaming Shen, Jinfeng Xiao, Xinwei He, Jingbo Shang, Saurabh Sinha, Jiawei Han

Different from Web or general domain search, a large portion of queries in scientific literature search are entity-set queries, that is, multiple entities of possibly different types.

Model Selection

Integrating Local Context and Global Cohesiveness for Open Information Extraction

1 code implementation 26 Apr 2018 Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Ahmed El-Kishky, Jiawei Han

However, current Open IE systems focus on modeling local context information in a sentence to extract relation tuples, while ignoring the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions.

Open Information Extraction

Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling

1 code implementation EMNLP 2018 Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han

Many efforts have been made to facilitate natural language processing tasks with pre-trained language models (LMs), and they have brought significant improvements to various applications.

Language Modelling Named Entity Recognition

Investigating Rumor News Using Agreement-Aware Search

1 code implementation 21 Feb 2018 Jingbo Shang, Tianhang Sun, Jiaming Shen, Xingbang Liu, Anja Gruenheid, Flip Korn, Adam Lelkes, Cong Yu, Jiawei Han

We build Maester based on the following two key observations: (1) relatedness can commonly be determined by keywords and entities occurring in both questions and articles, and (2) the level of agreement between the investigative question and the related news article can often be decided by a few key sentences.

Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning

2 code implementations 30 Jan 2018 Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han

Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases.

Feature Engineering Multi-Task Learning +2

An Attention-based Collaboration Framework for Multi-View Network Representation Learning

1 code implementation 19 Sep 2017 Meng Qu, Jian Tang, Jingbo Shang, Xiang Ren, Ming Zhang, Jiawei Han

Existing approaches usually study networks with a single type of proximity between nodes, which defines a single view of a network.

Representation Learning

Empower Sequence Labeling with Task-Aware Neural Language Model

3 code implementations 13 Sep 2017 Liyuan Liu, Jingbo Shang, Frank F. Xu, Xiang Ren, Huan Gui, Jian Peng, Jiawei Han

In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task.

Language Modelling named-entity-recognition +4

MetaPAD: Meta Pattern Discovery from Massive Text Corpora

no code implementations 13 Mar 2017 Meng Jiang, Jingbo Shang, Taylor Cassidy, Xiang Ren, Lance M. Kaplan, Timothy P. Hanratty, Jiawei Han

We propose an efficient framework, called MetaPAD, which discovers meta patterns from massive corpora with three techniques: (1) it develops a context-aware segmentation method to carefully determine the boundaries of patterns with a learnt pattern quality assessment function, which avoids costly dependency parsing and generates high-quality patterns; (2) it identifies and groups synonymous meta patterns from multiple facets: their types, contexts, and extractions; and (3) it examines type distributions of entities in the instances extracted by each group of patterns, and looks for appropriate type levels to make discovered patterns precise.

Dependency Parsing

Automated Phrase Mining from Massive Text Corpora

4 code implementations 15 Feb 2017 Jingbo Shang, Jialu Liu, Meng Jiang, Xiang Ren, Clare R. Voss, Jiawei Han

As one of the fundamental tasks in text analysis, phrase mining aims at extracting quality phrases from a text corpus.

General Knowledge POS +1

DPPred: An Effective Prediction Framework with Concise Discriminative Patterns

no code implementations 31 Oct 2016 Jingbo Shang, Meng Jiang, Wenzhu Tong, Jinfeng Xiao, Jian Peng, Jiawei Han

In the literature, two series of models have been proposed to address prediction problems including classification and regression.

Meta-Path Guided Embedding for Similarity Search in Large-Scale Heterogeneous Information Networks

1 code implementation 31 Oct 2016 Jingbo Shang, Meng Qu, Jialu Liu, Lance M. Kaplan, Jiawei Han, Jian Peng

It models vertices as low-dimensional vectors to explore network structure-embedded similarity.

A Parallel and Efficient Algorithm for Learning to Match

no code implementations 22 Oct 2014 Jingbo Shang, Tianqi Chen, Hang Li, Zhengdong Lu, Yong Yu

In this paper, we tackle this challenge with a novel parallel and efficient algorithm for feature-based matrix factorization.

Collaborative Filtering Link Prediction
