Search Results for author: Guoyin Wang

Found 87 papers, 35 papers with code

Methods for Numeracy-Preserving Word Embeddings

no code implementations • EMNLP 2020 • Dhanasekar Sundararaman, Shijing Si, Vivek Subramanian, Guoyin Wang, Devamanyu Hazarika, Lawrence Carin

We propose a new methodology to assign and learn embeddings for numbers.

Question Answering Word Embeddings

Dialogue Response Generation via Contrastive Latent Representation Learning

no code implementations • EMNLP (NLP4ConvAI) 2021 • Shuyang Dai, Guoyin Wang, Sunghyun Park, Sungjin Lee

In this work, we aim to construct a robust sentence representation learning model that is specifically designed for dialogue response generation, with a Transformer-based encoder-decoder structure.

Contrastive Learning Representation Learning +2

An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing

1 code implementation • 25 Mar 2024 • Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang

We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs.

Open Continual Feature Selection via Granular-Ball Knowledge Transfer

no code implementations • 15 Mar 2024 • Xuemei Cao, Xin Yang, Shuyin Xia, Guoyin Wang, Tianrui Li

To this end, the proposed CFS method combines the strengths of continual learning (CL) with granular-ball computing (GBC), which focuses on constructing a granular-ball knowledge base to detect unknown classes and facilitate the transfer of previously learned knowledge for further feature selection.

Continual Learning feature selection +1

Empowering Large Language Model Agents through Action Learning

1 code implementation • 24 Feb 2024 • Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

Large Language Model (LLM) Agents have recently garnered increasing interest, yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior.

Language Modelling Large Language Model

LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild

no code implementations • 15 Feb 2024 • Ziyu Zhao, Leilei Gan, Guoyin Wang, Wangchunshu Zhou, Hongxia Yang, Kun Kuang, Fei Wu

Low-Rank Adaptation (LoRA) provides an effective yet efficient solution for fine-tuning large language models (LLMs).

Retrieval

Similarity-based Neighbor Selection for Graph LLMs

1 code implementation • 6 Feb 2024 • Rui Li, Jiwei Li, Jiawei Han, Guoyin Wang

Our research further underscores the significance of graph structure integration in LLM applications and identifies key factors for their success in node classification.

Node Classification

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

1 code implementation • 10 Jan 2024 • Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks.

Benchmarking

Multi-Granularity Representation Learning for Sketch-based Dynamic Face Image Retrieval

1 code implementation • 31 Dec 2023 • Liang Wang, Dawei Dai, Shiyu Fu, Guoyin Wang

In specific scenarios, a face sketch can be used to identify a person.

Face Image Retrieval Representation Learning +1

Multi-granularity Causal Structure Learning

no code implementations • 9 Dec 2023 • Jiaxuan Liang, Jun Wang, Guoxian Yu, Shuyin Xia, Guoyin Wang

Unveiling, modeling, and comprehending the causal mechanisms underpinning natural phenomena stand as fundamental endeavors across myriad scientific disciplines.

Sim-GPT: Text Similarity via GPT Annotated Data

1 code implementation • 9 Dec 2023 • Shuhe Wang, Beiming Cao, Shengyu Zhang, Xiaoya Li, Jiwei Li, Fei Wu, Guoyin Wang, Eduard Hovy

Due to the lack of a large collection of high-quality labeled sentence pairs with textual similarity scores, existing approaches for Semantic Textual Similarity (STS) mostly rely on unsupervised techniques or training signals that are only partially correlated with textual similarity, e.g., NLI-based datasets.

Semantic Textual Similarity Sentence +2

Sentiment Analysis through LLM Negotiations

no code implementations • 3 Nov 2023 • Xiaofei Sun, Xiaoya Li, Shengyu Zhang, Shuhe Wang, Fei Wu, Jiwei Li, Tianwei Zhang, Guoyin Wang

A standard paradigm for sentiment analysis is to rely on a single LLM and make the decision in a single round under the framework of in-context learning.

In-Context Learning Sentiment Analysis +1

A Causal Disentangled Multi-Granularity Graph Classification Method

no code implementations • 25 Oct 2023 • Yuan Li, Li Liu, Penggang Chen, Youmin Zhang, Guoyin Wang

Graph data widely exists in real life, with large amounts of data and complex structures.

Disentanglement Graph Classification +1

Long-Tailed Classification Based on Coarse-Grained Leading Forest and Multi-Center Loss

1 code implementation • 12 Oct 2023 • Jinye Yang, Ji Xu, Di Wu, Jianhang Tang, Shaobo Li, Guoyin Wang

The deviation of a classification model is caused by both class-wise and attribute-wise imbalance.

Attribute Classification +1

Language Models As Semantic Indexers

no code implementations • 11 Oct 2023 • Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, Xianfeng Tang

Semantic identifier (ID) is an important concept in information retrieval that aims to preserve the semantics of objects such as documents and items inside their IDs.

Contrastive Learning Information Retrieval +2

Are Human-generated Demonstrations Necessary for In-context Learning?

1 code implementation • 26 Sep 2023 • Rui Li, Guoyin Wang, Jiwei Li

In this paper, we raise the fundamental question of whether human-generated demonstrations are necessary for ICL.

Arithmetic Reasoning Code Generation +4

Instruction Tuning for Large Language Models: A Survey

1 code implementation • 21 Aug 2023 • Shengyu Zhang, Linfeng Dong, Xiaoya Li, Sen Zhang, Xiaofei Sun, Shuhe Wang, Jiwei Li, Runyi Hu, Tianwei Zhang, Fei Wu, Guoyin Wang

This paper surveys research works in the quickly advancing field of instruction tuning (IT), a crucial technique to enhance the capabilities and controllability of large language models (LLMs).

Pushing the Limits of ChatGPT on NLP Tasks

no code implementations • 16 Jun 2023 • Xiaofei Sun, Linfeng Dong, Xiaoya Li, Zhen Wan, Shuhe Wang, Tianwei Zhang, Jiwei Li, Fei Cheng, Lingjuan Lyu, Fei Wu, Guoyin Wang

In this work, we propose a collection of general modules to address these issues, in an attempt to push the limits of ChatGPT on NLP tasks.

Dependency Parsing Event Extraction +9

GBG++: A Fast and Stable Granular Ball Generation Method for Classification

no code implementations • 29 May 2023 • Qin Xie, Qinghua Zhang, Shuyin Xia, Fan Zhao, Chengying Wu, Guoyin Wang, Weiping Ding

Second, considering the influence of the sample size within the GB on the GB's quality, based on the GBG++ method, an improved GB-based $k$-nearest neighbors algorithm (GB$k$NN++) is presented, which can reduce misclassification at the class boundary.

Outlier Detection

TaDSE: Template-aware Dialogue Sentence Embeddings

no code implementations • 23 May 2023 • Minsik Oh, Jiwei Li, Guoyin Wang

We further introduce a novel analytic instrument, the Semantic Compression method, for which we discover a correlation with uniformity and alignment.

Contrastive Learning intent-classification +6

Text Classification via Large Language Models

1 code implementation • 15 May 2023 • Xiaofei Sun, Xiaoya Li, Jiwei Li, Fei Wu, Shangwei Guo, Tianwei Zhang, Guoyin Wang

This is due to (1) the lack of reasoning ability in addressing complex linguistic phenomena (e.g., intensification, contrast, irony, etc.); (2) the limited number of tokens allowed in in-context learning.

Domain Adaptation In-Context Learning +3

Towards Building the Federated GPT: Federated Instruction Tuning

1 code implementation • 9 May 2023 • Jianyi Zhang, Saeed Vahidian, Martin Kuo, Chunyuan Li, Ruiyi Zhang, Tong Yu, Yufan Zhou, Guoyin Wang, Yiran Chen

This repository offers a foundational framework for exploring federated fine-tuning of LLMs using heterogeneous instructions across diverse categories.

Federated Learning

Granular-ball computing: an efficient, robust, and interpretable adaptive multi-granularity representation and computation method

no code implementations • 21 Apr 2023 • Shuyin Xia, Guoyin Wang, Xinbo Gao, Xiaoyu Lian

This mechanism inherently possesses an adaptive multi-granularity description capacity, resulting in computational traits such as efficiency, robustness, and interpretability.

GPT-NER: Named Entity Recognition via Large Language Models

1 code implementation • 20 Apr 2023 • Shuhe Wang, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Fei Wu, Tianwei Zhang, Jiwei Li, Guoyin Wang

GPT-NER bridges the gap by transforming the sequence labeling task into a generation task that can be easily adapted by LLMs; e.g., the task of finding location entities in the input text "Columbus is a city" is transformed into generating the text sequence "@@Columbus## is a city", where the special tokens @@ and ## mark the entity to extract.

Hallucination named-entity-recognition +4
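
The marker convention in the snippet above can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the GPT-NER code; only the @@...## convention and the "Columbus is a city" example come from the snippet.

```python
import re
from typing import List

# Matches one marked entity, e.g. "@@Columbus##" (marker convention from the snippet).
ENTITY_PATTERN = re.compile(r"@@(.+?)##")


def mark_entities(text: str, entities: List[str]) -> str:
    """Wrap each whole-word occurrence of the given entities in @@...## markers.

    Hypothetical helper: it builds the kind of target sequence a generative
    model would be asked to produce.
    """
    for entity in entities:
        pattern = rf"\b{re.escape(entity)}\b"
        text = re.sub(pattern, lambda m: f"@@{m.group(0)}##", text)
    return text


def extract_entities(marked: str) -> List[str]:
    """Recover the entity strings from a marked sequence."""
    return ENTITY_PATTERN.findall(marked)


def strip_markers(marked: str) -> str:
    """Remove the markers to get back the plain input text."""
    return ENTITY_PATTERN.sub(r"\1", marked)


marked = mark_entities("Columbus is a city", ["Columbus"])
print(marked)
print(extract_entities(marked))
```

Reading the labels back out is then a plain string operation on the generated text, which is what makes the generation framing convenient.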

Granular-ball Optimization Algorithm

no code implementations • 18 Mar 2023 • Shuyin Xia, Jiancu Chen, Bin Hou, Guoyin Wang

The faster speed, higher approximation ability of optimal solution, no hyper-parameters, and simpler design of GBO make it an all-around replacement of most of the existing popular intelligent optimization algorithms.

Open World Classification with Adaptive Negative Samples

no code implementations • 9 Mar 2023 • Ke Bai, Guoyin Wang, Jiwei Li, Sunghyun Park, Sungjin Lee, Puyang Xu, Ricardo Henao, Lawrence Carin

Open world classification is a task in natural language processing with key practical relevance and impact.

Classification

Research on Efficient Fuzzy Clustering Method Based on Local Fuzzy Granular balls

no code implementations • 7 Mar 2023 • Jiang Xie, Qiao Deng, Shuyin Xia, Yangzhou Zhao, Guoyin Wang, Xinbo Gao

In recent years, the problem of fuzzy clustering has received wide attention.

Clustering

GBMST: An Efficient Minimum Spanning Tree Clustering Based on Granular-Ball Computing

no code implementations • 2 Mar 2023 • Jiang Xie, Shuyin Xia, Guoyin Wang, Xinbo Gao

We construct coarse-grained granular-balls, and then use granular-balls and MST to implement the clustering method based on "large-scale priority", which can greatly avoid the influence of outliers and accelerate the construction process of MST.

Clustering

PK-ICR: Persona-Knowledge Interactive Context Retrieval for Grounded Dialogue

1 code implementation • 13 Feb 2023 • Minsik Oh, Joosung Lee, Jiwei Li, Guoyin Wang

Identifying relevant persona or knowledge for conversational systems is critical to grounded dialogue response generation.

Data Augmentation Response Generation +1

Sketch Less Face Image Retrieval: A New Challenge

1 code implementation • 11 Feb 2023 • Dawei Dai, Yutang Li, Liang Wang, Shiyu Fu, Shuyin Xia, Guoyin Wang

In this study, we propose a new task named sketch less face image retrieval (SLFIR), in which retrieval is carried out at each stroke and aims to retrieve the target face photo using a partial sketch with as few strokes as possible (see Fig. 1).

Face Image Retrieval Retrieval

A novel cluster internal evaluation index based on hyper-balls

no code implementations • 30 Dec 2022 • Jiang Xie, Pengfei Zhao, Shuyin Xia, Guoyin Wang, Dongdong Cheng

It is crucial to evaluate the quality and determine the optimal number of clusters in cluster analysis.

Clustering

WL-Align: Weisfeiler-Lehman Relabeling for Aligning Users across Networks via Regularized Representation Learning

1 code implementation • 29 Dec 2022 • Li Liu, Penggang Chen, Xin Li, William K. Cheung, Youmin Zhang, Qun Liu, Guoyin Wang

Aligning users across networks using graph representation learning has been found effective where the alignment is accomplished in a low-dimensional embedding space.

Graph Representation Learning

GNN-SL: Sequence Labeling Based on Nearest Examples via GNN

1 code implementation • 5 Dec 2022 • Shuhe Wang, Yuxian Meng, Rongbin Ouyang, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Guoyin Wang

To better handle long-tail cases in the sequence labeling (SL) task, in this work, we introduce graph neural networks sequence labeling (GNN-SL), which augments the vanilla SL model output with similar tagging examples retrieved from the whole training set.

Chinese Word Segmentation named-entity-recognition +4

Granular-Ball Fuzzy Set and Its Implementation in SVM

no code implementations • 21 Oct 2022 • Shuyin Xia, Xiaoyu Lian, Guoyin Wang, Xinbo Gao, Yabin Shao

Most existing fuzzy set methods use points as their input, which is the finest granularity from the perspective of granular computing.

GBSVM: Granular-ball Support Vector Machine

1 code implementation • 6 Oct 2022 • Shuyin Xia, Xiaoyu Lian, Guoyin Wang, Xinbo Gao, Jiancu Chen, Xiaoli Peng

Furthermore, a particle swarm optimization algorithm is designed to solve the dual model.

Ranking-Enhanced Unsupervised Sentence Representation Learning

1 code implementation • 9 Sep 2022 • Yeon Seonwoo, Guoyin Wang, Changmin Seo, Sajal Choudhary, Jiwei Li, Xiang Li, Puyang Xu, Sunghyun Park, Alice Oh

In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar to the input sentence.

Contrastive Learning Data Augmentation +5

Semi-supervised Learning with Deterministic Labeling and Large Margin Projection

1 code implementation • 17 Aug 2022 • Ji Xu, Gang Ren, Yao Xiao, Shaobo Li, Guoyin Wang

Optimal leading forest (OLF) has been observed to have the advantage of revealing the difference evolution along a path within a subtree.

Active Learning Attribute

Learning Personalized Representations using Graph Convolutional Network

no code implementations • 28 Jul 2022 • Hongyu Shen, Jinoh Oh, Shuai Zhao, Guoyin Wang, Tara Taghavi, Sungjin Lee

Then we propose a graph convolutional network (GCN) based model, namely Personalized Dynamic Routing Feature Encoder (PDRFE), that generates personalized customer representations learned from the built graph.

Advanced Conditional Variational Autoencoders (A-CVAE): Towards interpreting open-domain conversation generation via disentangling latent feature representation

no code implementations • 26 Jul 2022 • Ye Wang, Jingbo Liao, Hong Yu, Guoyin Wang, Xiaoxia Zhang, Li Liu

Particularly, the model integrates the macro-level guided-category knowledge and micro-level open-domain dialogue data for training, bringing the prior knowledge into the latent space, which enables the model to disentangle the latent variables within the mesoscopic scale.

Disentanglement

GBC: An Efficient and Adaptive Clustering Algorithm Based on Granular-Ball

no code implementations • 29 May 2022 • Shuyin Xia, Jiang Xie, Guoyin Wang

Existing clustering methods are based on a single granularity of information, such as the distance and density of each data point.

Astronomy Clustering

Improving Downstream Task Performance by Treating Numbers as Entities

no code implementations • 7 May 2022 • Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Liyan Xu, Lawrence Carin

Numbers are essential components of text, like any other word tokens, from which natural language processing (NLP) models are built and deployed.

Classification Question Answering

$k$NN-NER: Named Entity Recognition with Nearest Neighbor Search

1 code implementation • 31 Mar 2022 • Shuhe Wang, Xiaoya Li, Yuxian Meng, Tianwei Zhang, Rongbin Ouyang, Jiwei Li, Guoyin Wang

Inspired by recent advances in retrieval-augmented methods in NLP (Khandelwal et al., 2019; Khandelwal et al., 2020; Meng et al., 2021), in this paper we introduce a $k$ nearest neighbor NER ($k$NN-NER) framework, which augments the distribution of entity labels by assigning $k$ nearest neighbors retrieved from the training set.

Few-Shot Learning named-entity-recognition +3

One-Stage Deep Edge Detection Based on Dense-Scale Feature Fusion and Pixel-Level Imbalance Learning

no code implementations • 17 Mar 2022 • Dawei Dai, Chunjie Wang, Shuyin Xia, Yingge Liu, Guoyin Wang

Edge detection, a basic task in the field of computer vision, is an important preprocessing operation for the recognition and understanding of a visual scene.

Edge Detection

Multi-granularity Association Learning Framework for on-the-fly Fine-Grained Sketch-based Image Retrieval

no code implementations • 13 Jan 2022 • Dawei Dai, Xiaoyu Tang, Shuyin Xia, Yingge Liu, Guoyin Wang, Zizhong Chen

We consider that there is a significant correlation among these incomplete sketches in the sketch drawing episode of each photo.

Retrieval Sketch-Based Image Retrieval

An Efficient and Adaptive Granular-ball Generation Method in Classification Problem

no code implementations • 12 Jan 2022 • Shuyin Xia, Xiaochuan Dai, Guoyin Wang, Xinbo Gao, Elisabeth Giem

In addition, this paper first provides the mathematical models for the granular-ball covering.

A Unified Granular-ball Learning Model of Pawlak Rough Set and Neighborhood Rough Set

no code implementations • 10 Jan 2022 • Shuyin Xia, Cheng Wang, Guoyin Wang, Weiping Ding, Xinbo Gao, JianHang Yu, Yujia Zhai, Zizhong Chen

The granular-ball rough set can simultaneously represent the Pawlak rough set and the neighborhood rough set, thereby realizing a unified representation of the two.

feature selection

An Efficient and Accurate Rough Set for Feature Selection, Classification and Knowledge Representation

no code implementations • 29 Dec 2021 • Shuyin Xia, Xinyu Bai, Guoyin Wang, Deyu Meng, Xinbo Gao, Zizhong Chen, Elisabeth Giem

This paper presents a strong data mining method based on rough sets, which can realize feature selection, classification and knowledge representation at the same time.

Attribute feature selection

Faster Nearest Neighbor Machine Translation

no code implementations • 15 Dec 2021 • Shuhe Wang, Jiwei Li, Yuxian Meng, Rongbin Ouyang, Guoyin Wang, Xiaoya Li, Tianwei Zhang, Shi Zong

The core idea of Faster $k$NN-MT is to use a hierarchical clustering strategy to approximate the distance between the query and a data point in the datastore, which is decomposed into two parts: the distance between the query and the center of the cluster that the data point belongs to, and the distance between the data point and the cluster center.

Machine Translation Translation
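
The two-part decomposition described above can be related to the triangle inequality. As a hedged sketch (the symbols $q$, $x$, $c_x$, and $d$ are notation introduced here, and the snippet does not state the exact approximation the paper uses), for any metric $d$, a query $q$, a datastore point $x$, and the center $c_x$ of the cluster that contains $x$:

```latex
\[
\bigl|\, d(q, c_x) - d(x, c_x) \,\bigr| \;\le\; d(q, x) \;\le\; d(q, c_x) + d(x, c_x)
\]
```

Both bounds depend only on the query-to-center distance, which is shared by every point in a cluster, and the point-to-center distance, which does not depend on the query and could therefore be computed ahead of time.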

Towards Improving Embedding Based Models of Social Network Alignment via Pseudo Anchors

1 code implementation • 22 Nov 2021 • Zihan Yan, Li Liu, Xin Li, William K. Cheung, Youmin Zhang, Qun Liu, Guoyin Wang

Social network alignment aims at aligning person identities across social networks.

Meta-Learning

Rethinking the Image Feature Biases Exhibited by Deep CNN Models

no code implementations • 3 Nov 2021 • Dawei Dai, Yutang Li, Huanan Bao, Sy Xia, Guoyin Wang, Xiaoli Ma

From the results, we conclude that (1) the combined effect of certain features is typically far more influential than any single feature; (2) in different tasks, neural models can exhibit different biases, that is, we can design a specific task to make a neural model biased toward a specific anticipated feature.

Interpreting Deep Learning Models in Natural Language Processing: A Review

no code implementations • 20 Oct 2021 • Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard Hovy, Jiwei Li

Neural network models have achieved state-of-the-art performances in a wide range of natural language processing (NLP) tasks.

Deciding Whether to Ask Clarifying Questions in Large-Scale Spoken Language Understanding

no code implementations • 25 Sep 2021 • Joo-Kyung Kim, Guoyin Wang, Sungjin Lee, Young-Bum Kim

A large-scale conversational agent can struggle to understand user utterances with various ambiguities such as ASR ambiguity, intent ambiguity, and hypothesis ambiguity.

Spoken Language Understanding

An MRC Framework for Semantic Role Labeling

1 code implementation • COLING 2022 • Nan Wang, Jiwei Li, Yuxian Meng, Xiaofei Sun, Han Qiu, Ziyao Wang, Guoyin Wang, Jun He

We formalize predicate disambiguation as multiple-choice machine reading comprehension, where the descriptions of candidate senses of a given predicate are used as options to select the correct sense.

Ranked #1 on Semantic Role Labeling on CoNLL 2005

Computational Efficiency Machine Reading Comprehension +3

AUGNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation

1 code implementation • ACL 2021 • Xinnuo Xu, Guoyin Wang, Young-Bum Kim, Sungjin Lee

Natural Language Generation (NLG) is a key component in a task-oriented dialogue system, which converts the structured meaning representation (MR) to natural language.

Data Augmentation Retrieval +2

D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and Localization

no code implementations • 3 Dec 2020 • Bo Liu, Ranglei Wu, Xiuli Bi, Bin Xiao, Weisheng Li, Guoyin Wang, Xinbo Gao

The unfixed encoder autonomously learns the image fingerprints that differentiate between the tampered and non-tampered regions, whereas the fixed encoder intentionally provides the direction information that assists the learning and detection of the network.

Binary Classification

Integrating Task Specific Information into Pretrained Language Models for Low Resource Fine Tuning

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Rui Wang, Shijing Si, Guoyin Wang, Lei Zhang, Lawrence Carin, Ricardo Henao

Pretrained Language Models (PLMs) have improved the performance of natural language understanding in recent years.

Natural Language Understanding

LRA: an accelerated rough set framework based on local redundancy of attribute for feature selection

no code implementations • 31 Oct 2020 • Shuyin Xia, Wenhua Li, Guoyin Wang, Xinbo Gao, Changqing Zhang, Elisabeth Giem

Based on the theorem, we propose the LRA framework for accelerating rough set algorithms.

Attribute feature selection

Improving Text Generation with Student-Forcing Optimal Transport

no code implementations • EMNLP 2020 • Guoyin Wang, Chunyuan Li, Jianqiao Li, Hao Fu, Yuh-Chen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang, Lawrence Carin

An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences.

Machine Translation Text Generation +2

Weakly supervised cross-domain alignment with optimal transport

no code implementations • 14 Aug 2020 • Siyang Yuan, Ke Bai, Liqun Chen, Yizhe Zhang, Chenyang Tao, Chunyuan Li, Guoyin Wang, Ricardo Henao, Lawrence Carin

Cross-domain alignment between image objects and text sequences is key to many visual-language tasks, and it poses a fundamental challenge to both computer vision and natural language processing.

Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage

1 code implementation • 22 Jun 2020 • Shijing Si, Rui Wang, Jedrek Wosik, Hao Zhang, David Dov, Guoyin Wang, Ricardo Henao, Lawrence Carin

Small and imbalanced datasets commonly seen in healthcare represent a challenge when training classifiers based on deep learning models.

Improving Adversarial Text Generation by Modeling the Distant Future

no code implementations • ACL 2020 • Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Dinghan Shen, Guoyin Wang, Zheng Wen, Lawrence Carin

Auto-regressive text generation models usually focus on local fluency, and may cause inconsistent semantic meaning in long text generation.

Adversarial Text Imitation Learning +1

Ball k-means

no code implementations • 2 May 2020 • Shuyin Xia, Daowan Peng, Deyu Meng, Changqing Zhang, Guoyin Wang, Zizhong Chen, Wei Wei

The assigned cluster of the points in the stable area is not changed in the current iteration while the points in the annulus area will be adjusted within a few neighbor clusters in the current iteration.

Clustering

POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training

1 code implementation • EMNLP 2020 • Yizhe Zhang, Guoyin Wang, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan

Large-scale pre-trained language models, such as BERT and GPT-2, have achieved excellent performance in language representation learning and free-form text generation.

Language Modelling Representation Learning +1

Graph-Driven Generative Models for Heterogeneous Multi-Task Learning

no code implementations • 20 Nov 2019 • Wenlin Wang, Hongteng Xu, Zhe Gan, Bai Li, Guoyin Wang, Liqun Chen, Qian Yang, Wenqi Wang, Lawrence Carin

We propose a novel graph-driven generative model that unifies multiple heterogeneous learning tasks into the same framework.

Multi-Task Learning Type prediction

Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding

no code implementations • 10 Nov 2019 • Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin

Attention-based models have shown significant improvement over traditional algorithms in several NLP tasks.

Machine Translation Natural Language Understanding +2

An End-to-End Generative Architecture for Paraphrase Generation

no code implementations • IJCNLP 2019 • Qian Yang, Zhouyuan Huo, Dinghan Shen, Yong Cheng, Wenlin Wang, Guoyin Wang, Lawrence Carin

Generating high-quality paraphrases is a fundamental yet challenging natural language processing task.

Paraphrase Generation

Zero-Shot Recognition via Optimal Transport

no code implementations • 20 Oct 2019 • Wenlin Wang, Hongteng Xu, Guoyin Wang, Wenqi Wang, Lawrence Carin

Specifically, we build a conditional generative model to generate features from seen-class attributes, and establish an optimal transport between the distribution of the generated features and that of the real features.

Attribute Generalized Zero-Shot Learning

Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods

1 code implementation • NeurIPS 2019 • Kevin J Liang, Guoyin Wang, Yitong Li, Ricardo Henao, Lawrence Carin

We investigate time-dependent data analysis from the perspective of recurrent kernel machines, from which models with hidden units and gated memory cells arise naturally.

Improving Textual Network Learning with Variational Homophilic Embeddings

1 code implementation • NeurIPS 2019 • Wenlin Wang, Chenyang Tao, Zhe Gan, Guoyin Wang, Liqun Chen, Xinyuan Zhang, Ruiyi Zhang, Qian Yang, Ricardo Henao, Lawrence Carin

This paper considers a novel variational formulation of network embeddings, with special focus on textual networks.

Network Embedding

Learning Word Embeddings with Domain Awareness

1 code implementation • 7 Jun 2019 • Guoyin Wang, Yan Song, Yue Zhang, Dong Yu

Word embeddings are traditionally trained on a large corpus in an unsupervised setting, with no specific design for incorporating domain knowledge.

Learning Word Embeddings

Improving Textual Network Embedding with Global Attention via Optimal Transport

no code implementations • ACL 2019 • Liqun Chen, Guoyin Wang, Chenyang Tao, Dinghan Shen, Pengyu Cheng, Xinyuan Zhang, Wenlin Wang, Yizhe Zhang, Lawrence Carin

Constituting highly informative network embeddings is an important tool for network analysis.

Network Embedding

Topic-Guided Variational Auto-Encoder for Text Generation

no code implementations • NAACL 2019 • Wenlin Wang, Zhe Gan, Hongteng Xu, Ruiyi Zhang, Guoyin Wang, Dinghan Shen, Changyou Chen, Lawrence Carin

We propose a topic-guided variational auto-encoder (TGVAE) model for text generation.

Conditional Text Generation

Discriminative Clustering for Robust Unsupervised Domain Adaptation

no code implementations • 30 May 2019 • Rui Wang, Guoyin Wang, Ricardo Henao

Unsupervised domain adaptation seeks to learn an invariant and discriminative representation for an unlabeled target domain by leveraging the information of a labeled source dataset.

Clustering Partial Domain Adaptation +1

Topic-Guided Variational Autoencoders for Text Generation

no code implementations • 17 Mar 2019 • Wenlin Wang, Zhe Gan, Hongteng Xu, Ruiyi Zhang, Guoyin Wang, Dinghan Shen, Changyou Chen, Lawrence Carin

We propose a topic-guided variational autoencoder (TGVAE) model for text generation.

Conditional Text Generation

Adversarial Learning of a Sampler Based on an Unnormalized Distribution

3 code implementations • 3 Jan 2019 • Chunyuan Li, Ke Bai, Jianqiao Li, Guoyin Wang, Changyou Chen, Lawrence Carin

We investigate adversarial learning in the case when only an unnormalized form of the density can be accessed, rather than samples.

Q-Learning

Generative Adversarial Network Training is a Continual Learning Problem

no code implementations • ICLR 2019 • Kevin J Liang, Chunyuan Li, Guoyin Wang, Lawrence Carin

We hypothesize that this is at least in part due to the evolution of the generator distribution and the catastrophic forgetting tendency of neural networks, which leads to the discriminator losing the ability to remember synthesized samples from previous instantiations of the generator.

Continual Learning Generative Adversarial Network +1

Sequence Generation with Guider Network

no code implementations • 2 Nov 2018 • Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Liqun Chen, Dinghan Shen, Guoyin Wang, Lawrence Carin

Sequence generation with reinforcement learning (RL) has received significant attention recently.

Reinforcement Learning (RL)

JointGAN: Multi-Domain Joint Distribution Learning with Generative Adversarial Nets

2 code implementations • ICML 2018 • Yunchen Pu, Shuyang Dai, Zhe Gan, Wei-Yao Wang, Guoyin Wang, Yizhe Zhang, Ricardo Henao, Lawrence Carin

Distinct from most existing approaches that only learn conditional distributions, the proposed model aims to learn a joint distribution of multiple random variables (domains).

Generative Adversarial Network

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

2 code implementations • ACL 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Chunyuan Li, Ricardo Henao, Lawrence Carin

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations.

Ranked #1 on Named Entity Recognition (NER) on CoNLL 2000

Document Classification General Classification +4

NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing

1 code implementation • ACL 2018 • Dinghan Shen, Qinliang Su, Paidamoyo Chapfuwa, Wenlin Wang, Guoyin Wang, Lawrence Carin, Ricardo Henao

Semantic hashing has become a powerful paradigm for fast similarity search in many information retrieval systems.

Information Retrieval Retrieval +1

Joint Embedding of Words and Labels for Text Classification

2 code implementations • ACL 2018 • Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin

Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences.

Ranked #11 on Text Classification on DBpedia

General Classification Sentiment Analysis +2

On the Use of Word Embeddings Alone to Represent Natural Language Sequences

no code implementations • ICLR 2018 • Dinghan Shen, Guoyin Wang, Wenlin Wang, Martin Renqiang Min, Qinliang Su, Yizhe Zhang, Ricardo Henao, Lawrence Carin

In this paper, we conduct an extensive comparative study between Simple Word Embeddings-based Models (SWEMs), with no compositional parameters, relative to employing word embeddings within RNN/CNN-based models.

Sentence Word Embeddings

Non-iterative Label Propagation in Optimal Leading Forest

no code implementations • 25 Sep 2017 • Ji Xu, Guoyin Wang

We propose a sound assumption, arguing that: the neighboring data points are not in peer-to-peer relation, but in a partial-ordered relation induced by the local density and distance between the data; and the label of a center can be regarded as the contribution of its followers.

graph construction Relation

Deconvolutional Paragraph Representation Learning

4 code implementations • NeurIPS 2017 • Yizhe Zhang, Dinghan Shen, Guoyin Wang, Zhe Gan, Ricardo Henao, Lawrence Carin

Learning latent representations from long text sequences is an important first step in many natural language processing applications.

General Classification Representation Learning +1

Self-training semi-supervised classification based on density peaks of data

no code implementations • Neurocomputing 2017 • Di Wu, Mingsheng Shang, Xin Luo, Ji Xu, Huyong Yan, Weihui Deng, Guoyin Wang

Having a multitude of unlabeled data and few labeled ones is a common problem in many practical applications.

Heuristic algorithms for finding distribution reducts in probabilistic rough set model

no code implementations • 22 Dec 2015 • Xi'ao Ma, Guoyin Wang, Hong Yu

This is partly due to the fact that there are no monotonic fitness functions that are used to design heuristic attribute reduction algorithms in probabilistic rough set model.

Attribute

Leading Tree in DPCLUS and Its Impact on Building Hierarchies

no code implementations • 12 Jun 2015 • Ji Xu, Guoyin Wang

There are two major advantages with the LT: One is dramatically reducing the running time of assigning noncenter data points to their cluster ID, because the assigning process is turned into just disconnecting the links from each center to its parent.

Clustering
