Search Results for author: Jingbo Shang

Found 40 papers, 28 papers with code

META: Metadata-Empowered Weak Supervision for Text Classification

1 code implementation EMNLP 2020 Dheeraj Mekala, Xinyang Zhang, Jingbo Shang

Based on seed words, we rank and filter motif instances to distill highly label-indicative ones as "seed motifs", which provide additional weak supervision.

Classification General Classification +1

Towards Adaptive Residual Network Training: A Neural-ODE Perspective

1 code implementation ICML 2020 Chengyu Dong, Liyuan Liu, Zichao Li, Jingbo Shang

Serving as a crucial factor, the depth of residual networks balances model capacity, performance, and training efficiency.

Double Descent in Adversarial Training: An Implicit Label Noise Perspective

no code implementations 7 Oct 2021 Chengyu Dong, Liyuan Liu, Jingbo Shang

Here, we show that the robust overfitting shall be viewed as the early part of an epoch-wise double descent -- the robust test error will start to decrease again after training the model for a considerable number of epochs.

Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data

no code implementations 22 Sep 2021 Dheeraj Mekala, Varun Gangal, Jingbo Shang

Existing text classification methods mainly focus on a fixed label set, whereas many real-world applications require extending to new fine-grained classes as the number of samples per label increases.

Classification Text Classification +1

BFClass: A Backdoor-free Text Classification Framework

no code implementations 22 Sep 2021 Zichao Li, Dheeraj Mekala, Chengyu Dong, Jingbo Shang

To recognize the poisoned subset, we examine the training samples with these identified triggers as the most suspicious token, and check if removing the trigger will change the poisoned model's prediction.

Classification Language Modelling +1

LayoutReader: Pre-training of Text and Layout for Reading Order Detection

1 code implementation 26 Aug 2021 Zilong Wang, Yiheng Xu, Lei Cui, Jingbo Shang, Furu Wei

Reading order detection is the cornerstone to understanding visually-rich documents (e.g., receipts and forms).

Optical Character Recognition

UCPhrase: Unsupervised Context-aware Quality Phrase Tagging

2 code implementations 28 May 2021 Xiaotao Gu, Zihan Wang, Zhenyu Bi, Yu Meng, Liyuan Liu, Jiawei Han, Jingbo Shang

Training a conventional neural tagger based on silver labels usually faces the risk of overfitting phrase surface names.

Keyphrase Extraction Language Modelling +2

"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models

1 code implementation 18 Apr 2021 Zihan Wang, Chengyu Dong, Jingbo Shang

In this paper, we present an empirical property of these representations -- "average" approximates "first principal component".

Unsupervised Deep Keyphrase Generation

1 code implementation 18 Apr 2021 Xianjie Shen, Yinghan Wang, Rui Meng, Jingbo Shang

Keyphrase generation aims to summarize long documents with a collection of salient phrases.

News Meets Microblog: Hashtag Annotation via Retriever-Generator

1 code implementation 18 Apr 2021 Xiuwen Zheng, Dheeraj Mekala, Amarnath Gupta, Jingbo Shang

Hashtag annotation for microblog posts has been recently formulated as a sequence generation problem to handle emerging hashtags that are unseen in the training set.

Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks

no code implementations 23 Feb 2021 Xinyang Zhang, Chenwei Zhang, Luna Xin Dong, Jingbo Shang, Jiawei Han

Specifically, we jointly train two modules with different inductive biases -- a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.

Product Categorization Text Categorization

Data Quality Matters For Adversarial Training: An Empirical Study

1 code implementation 15 Feb 2021 Chengyu Dong, Liyuan Liu, Jingbo Shang

Specifically, we first propose a strategy to measure the data quality based on the learning behaviors of the data during adversarial training and find that low-quality data may not be useful and even detrimental to the adversarial robustness.

Sensei: Self-Supervised Sensor Name Segmentation

1 code implementation 1 Jan 2021 Jiaman Wu, Dezhi Hong, Rajesh Gupta, Jingbo Shang

A sensor name, typically an alphanumeric string, encodes the key context (e.g., function and location) of a sensor needed for deploying smart building applications.

Language Modelling

SeNsER: Learning Cross-Building Sensor Metadata Tagger

no code implementations Findings of the Association for Computational Linguistics 2020 Yang Jiao, Jiacheng Li, Jiaman Wu, Dezhi Hong, Rajesh Gupta, Jingbo Shang

Sensor metadata tagging, akin to the named entity recognition task, provides key contextual information (e.g., measurement type and location) about sensors for running smart building applications.

Named Entity Recognition

X-Class: Text Classification with Extremely Weak Supervision

2 code implementations NAACL 2021 Zihan Wang, Dheeraj Mekala, Jingbo Shang

In this paper, we explore to conduct text classification with extremely weak supervision, i.e., only relying on the surface text of class names.

Classification General Classification +2

Overfitting or Underfitting? Understand Robustness Drop in Adversarial Training

1 code implementation 15 Oct 2020 Zichao Li, Liyuan Liu, Chengyu Dong, Jingbo Shang

Our goal is to understand why the robustness drops after conducting adversarial training for too long.

SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery

no code implementations EMNLP 2020 Jiaming Shen, Wenda Qiu, Jingbo Shang, Michelle Vanni, Xiang Ren, Jiawei Han

To facilitate the research on studying the interplays of these two tasks, we create the first large-scale Synonym-Enhanced Set Expansion (SE2) dataset via crowdsourcing.

Contextualized Weak Supervision for Text Classification

1 code implementation ACL 2020 Dheeraj Mekala, Jingbo Shang

Weakly supervised text classification based on a few user-provided seed words has recently attracted much attention from researchers.

Classification General Classification +1

User-Guided Aspect Classification for Domain-Specific Texts

1 code implementation 30 Apr 2020 Peiran Li, Fang Guo, Jingbo Shang

Aspect classification, identifying aspects of text segments, facilitates numerous applications, such as sentiment analysis and review summarization.

Classification General Classification +2

Empower Entity Set Expansion via Language Model Probing

1 code implementation ACL 2020 Yunyi Zhang, Jiaming Shen, Jingbo Shang, Jiawei Han

Existing set expansion methods bootstrap the seed entity set by adaptively selecting context features and extracting new entities.

Language Modelling Question Answering

SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble

1 code implementation 17 Oct 2019 Jiaming Shen, Zeqiu Wu, Dongming Lei, Jingbo Shang, Xiang Ren, Jiawei Han

In this study, we propose a novel framework, SetExpan, which tackles this problem, with two techniques: (1) a context feature selection method that selects clean context features for calculating entity-entity distributional similarity, and (2) a ranking-based unsupervised ensemble method for expanding entity set based on denoised context features.

Feature Selection Question Answering

FUSE: Multi-Faceted Set Expansion by Coherent Clustering of Skip-grams

1 code implementation 10 Oct 2019 Wanzheng Zhu, Hongyu Gong, Jiaming Shen, Chao Zhang, Jingbo Shang, Suma Bhat, Jiawei Han

In this paper, we study the task of multi-faceted set expansion, which aims to capture all semantic facets in the seed set and return multiple sets of entities, one for each semantic facet.

Language Modelling

Raw-to-End Name Entity Recognition in Social Media

1 code implementation 14 Aug 2019 Liyuan Liu, Zihan Wang, Jingbo Shang, Dandong Yin, Heng Ji, Xiang Ren, Shaowen Wang, Jiawei Han

Our model neither requires the conversion from character sequences to word sequences, nor assumes tokenizer can correctly detect all word boundaries.

Named Entity Recognition NER +1

Arabic Named Entity Recognition: What Works and What's Next

no code implementations WS 2019 Liyuan Liu, Jingbo Shang, Jiawei Han

This paper presents the winning solution to the Arabic Named Entity Recognition challenge run by Topcoder.com.

Ensemble Learning Feature Engineering +2

Learning Named Entity Tagger using Domain-Specific Dictionary

1 code implementation EMNLP 2018 Jingbo Shang, Liyuan Liu, Xiang Ren, Xiaotao Gu, Teng Ren, Jiawei Han

Recent advances in deep neural models allow us to build reliable named entity recognition (NER) systems without handcrafting features.

Named Entity Recognition NER

Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach

1 code implementation 29 Apr 2018 Jiaming Shen, Jinfeng Xiao, Xinwei He, Jingbo Shang, Saurabh Sinha, Jiawei Han

Different from Web or general domain search, a large portion of queries in scientific literature search are entity-set queries, that is, multiple entities of possibly different types.

Model Selection

Integrating Local Context and Global Cohesiveness for Open Information Extraction

1 code implementation 26 Apr 2018 Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Ahmed El-Kishky, Jiawei Han

However, current Open IE systems focus on modeling local context information in a sentence to extract relation tuples, while ignoring the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions.

Open Information Extraction

Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling

1 code implementation EMNLP 2018 Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han

Many efforts have been made to facilitate natural language processing tasks with pre-trained language models (LMs), and brought significant improvements to various applications.

Language Modelling Named Entity Recognition

Investigating Rumor News Using Agreement-Aware Search

1 code implementation 21 Feb 2018 Jingbo Shang, Tianhang Sun, Jiaming Shen, Xingbang Liu, Anja Gruenheid, Flip Korn, Adam Lelkes, Cong Yu, Jiawei Han

We build Maester based on the following two key observations: (1) relatedness can commonly be determined by keywords and entities occurring in both questions and articles, and (2) the level of agreement between the investigative question and the related news article can often be decided by a few key sentences.

Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning

2 code implementations 30 Jan 2018 Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han

Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases.

Feature Engineering Multi-Task Learning +1

An Attention-based Collaboration Framework for Multi-View Network Representation Learning

1 code implementation 19 Sep 2017 Meng Qu, Jian Tang, Jingbo Shang, Xiang Ren, Ming Zhang, Jiawei Han

Existing approaches usually study networks with a single type of proximity between nodes, which defines a single view of a network.

Representation Learning

Empower Sequence Labeling with Task-Aware Neural Language Model

3 code implementations 13 Sep 2017 Liyuan Liu, Jingbo Shang, Frank F. Xu, Xiang Ren, Huan Gui, Jian Peng, Jiawei Han

In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task.

Language Modelling Named Entity Recognition +4

MetaPAD: Meta Pattern Discovery from Massive Text Corpora

no code implementations 13 Mar 2017 Meng Jiang, Jingbo Shang, Taylor Cassidy, Xiang Ren, Lance M. Kaplan, Timothy P. Hanratty, Jiawei Han

We propose an efficient framework, called MetaPAD, which discovers meta patterns from massive corpora with three techniques: (1) it develops a context-aware segmentation method to carefully determine the boundaries of patterns with a learnt pattern quality assessment function, which avoids costly dependency parsing and generates high-quality patterns; (2) it identifies and groups synonymous meta patterns from multiple facets -- their types, contexts, and extractions; and (3) it examines type distributions of entities in the instances extracted by each group of patterns, and looks for appropriate type levels to make discovered patterns precise.

Dependency Parsing

Automated Phrase Mining from Massive Text Corpora

2 code implementations 15 Feb 2017 Jingbo Shang, Jialu Liu, Meng Jiang, Xiang Ren, Clare R. Voss, Jiawei Han

As one of the fundamental tasks in text analysis, phrase mining aims at extracting quality phrases from a text corpus.


DPPred: An Effective Prediction Framework with Concise Discriminative Patterns

no code implementations 31 Oct 2016 Jingbo Shang, Meng Jiang, Wenzhu Tong, Jinfeng Xiao, Jian Peng, Jiawei Han

In the literature, two series of models have been proposed to address prediction problems including classification and regression.

A Parallel and Efficient Algorithm for Learning to Match

no code implementations 22 Oct 2014 Jingbo Shang, Tianqi Chen, Hang Li, Zhengdong Lu, Yong Yu

In this paper, we tackle this challenge with a novel parallel and efficient algorithm for feature-based matrix factorization.

Collaborative Filtering Link Prediction
