Search Results for author: Zhilin Yang

Found 40 papers, 27 papers with code

Multi-Modal Bayesian Embeddings for Learning Social Knowledge Graphs

no code implementations 4 Aug 2015 Zhilin Yang, Jie Tang, William Cohen

GenVector leverages large-scale unlabeled data with embeddings and represents data of two modalities, i.e., social network users and knowledge concepts, in a shared latent topic space.

Knowledge Graphs

Words or Characters? Fine-grained Gating for Reading Comprehension

1 code implementation 6 Nov 2016 Zhilin Yang, Bhuwan Dhingra, Ye Yuan, Junjie Hu, William W. Cohen, Ruslan Salakhutdinov

Previous work combines word-level and character-level representations using concatenation or scalar weighting, which is suboptimal for high-level tasks like reading comprehension.

Question Answering Reading Comprehension +1
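
For intuition, here is a minimal sketch of a fine-grained (element-wise) gate that mixes word-level and character-level representations instead of concatenating them or using one scalar weight. The PyTorch framing, the tensor shapes, and the idea of conditioning the gate on per-token features are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class FineGrainedGate(nn.Module):
    """Element-wise gate in [0, 1]^d that mixes word- and char-level vectors."""
    def __init__(self, dim: int, feat_dim: int):
        super().__init__()
        # The gate is computed from a hypothetical per-token feature vector.
        self.gate = nn.Linear(feat_dim, dim)

    def forward(self, word_emb, char_emb, token_feats):
        g = torch.sigmoid(self.gate(token_feats))   # (batch, seq, dim)
        return g * char_emb + (1.0 - g) * word_emb  # fine-grained mixture

# Toy usage with random tensors
gate = FineGrainedGate(dim=100, feat_dim=20)
w = torch.randn(2, 7, 100)  # word-level embeddings
c = torch.randn(2, 7, 100)  # character-level embeddings (e.g. from a char-RNN)
f = torch.randn(2, 7, 20)   # per-token features that condition the gate
out = gate(w, c, f)         # (2, 7, 100)
```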

Semi-Supervised QA with Generative Domain-Adaptive Nets

no code implementations ACL 2017 Zhilin Yang, Junjie Hu, Ruslan Salakhutdinov, William W. Cohen

In this framework, we train a generative model to generate questions based on the unlabeled text, and combine model-generated questions with human-generated questions for training question answering models.

Domain Adaptation Question Answering +2

A Probabilistic Framework for Location Inference from Social Media

no code implementations 23 Feb 2017 Yujie Qian, Jie Tang, Zhilin Yang, Binxuan Huang, Wei Wei, Kathleen M. Carley

In this paper, we formalize the problem of inferring location from social media into a semi-supervised factor graph model (SSFGM).

Management

Differentiable Learning of Logical Rules for Knowledge Base Reasoning

2 code implementations NeurIPS 2017 Fan Yang, Zhilin Yang, William W. Cohen

We propose a framework, Neural Logic Programming, that combines the parameter and structure learning of first-order logical rules in an end-to-end differentiable model.
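
As a rough illustration of the differentiable-rule idea (not the paper's exact model): a rule body such as r1(x, z) ∧ r2(z, y) can be scored by multiplying relation adjacency matrices, with softmax-normalized attention over relations making the choice of rule learnable by gradient descent. The toy graph and shapes below are assumptions for illustration only.

```python
import numpy as np

# Toy knowledge graph: 4 entities, 2 relations; M[r, i, j] = 1 iff r(entity_i, entity_j).
M = np.array([
    [[0, 1, 0, 0],   # relation 0
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]],
    [[0, 0, 0, 0],   # relation 1
     [0, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 0]],
], dtype=float)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Learnable logits: one attention vector over relations per step of the rule body.
attn_logits = np.random.randn(2, M.shape[0])  # a rule body of length 2

# Differentiable "rule application": start from a one-hot query entity and
# repeatedly propagate through the attention-weighted mixture of relations.
v = np.zeros(4)
v[0] = 1.0                                    # query entity: entity_0
for step_logits in attn_logits:
    mixed = np.tensordot(softmax(step_logits), M, axes=1)  # (4, 4)
    v = v @ mixed

print(v)  # soft scores over candidate answer entities
```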

Linguistic Knowledge as Memory for Recurrent Neural Networks

no code implementations 7 Mar 2017 Bhuwan Dhingra, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov

We introduce a model that encodes such graphs as explicit memory in recurrent neural networks, and use it to model coreference relations in text.

LAMBADA

Good Semi-supervised Learning that Requires a Bad GAN

1 code implementation NeurIPS 2017 Zihang Dai, Zhilin Yang, Fan Yang, William W. Cohen, Ruslan Salakhutdinov

Semi-supervised learning methods based on generative adversarial networks (GANs) obtained strong empirical results, but it is not clear 1) how the discriminator benefits from joint training with a generator, and 2) why good semi-supervised classification performance and a good generator cannot be obtained at the same time.

General Classification Semi-Supervised Image Classification

Breaking the Softmax Bottleneck: A High-Rank RNN Language Model

9 code implementations ICLR 2018 Zhilin Yang, Zihang Dai, Ruslan Salakhutdinov, William W. Cohen

We formulate language modeling as a matrix factorization problem, and show that the expressiveness of Softmax-based models (including the majority of neural language models) is limited by a Softmax bottleneck.

Language Modelling Vocal Bursts Intensity Prediction +1
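
The remedy proposed in the paper is a Mixture of Softmaxes (MoS): instead of a single softmax over one projection of the context, the output distribution is a convex mixture of K softmaxes computed from K different projections, which lifts the rank limit. The sketch below shows the general shape of such a layer; the layer sizes, nonlinearity placement, and all names are simplifying assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfSoftmaxes(nn.Module):
    """Output layer producing a convex mixture of K softmax distributions."""
    def __init__(self, hidden: int, vocab: int, k: int = 3):
        super().__init__()
        self.k = k
        self.prior = nn.Linear(hidden, k)          # mixture weights pi
        self.proj = nn.Linear(hidden, k * hidden)  # K context projections
        self.decoder = nn.Linear(hidden, vocab)    # shared output embeddings

    def forward(self, h):                             # h: (batch, hidden)
        pi = F.softmax(self.prior(h), dim=-1)         # (batch, K)
        ctx = torch.tanh(self.proj(h)).view(-1, self.k, h.size(-1))
        probs = F.softmax(self.decoder(ctx), dim=-1)  # (batch, K, vocab)
        return (pi.unsqueeze(-1) * probs).sum(dim=1)  # (batch, vocab)

mos = MixtureOfSoftmaxes(hidden=32, vocab=1000, k=3)
p = mos(torch.randn(4, 32))
print(p.shape, p.sum(dim=-1))  # each row sums to ~1
```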

Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent

no code implementations ICLR 2018 Zhilin Yang, Saizheng Zhang, Jack Urbanek, Will Feng, Alexander H. Miller, Arthur Szlam, Douwe Kiela, Jason Weston

Contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment.

Grounded language learning

Neural Models for Reasoning over Multiple Mentions using Coreference

no code implementations NAACL 2018 Bhuwan Dhingra, Qiao Jin, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov

Many problems in NLP require aggregating information from multiple mentions of the same entity which may be far apart in the text.

LAMBADA

GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations

1 code implementation 14 Jun 2018 Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann Lecun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4
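
A very small sketch of what "transferring a learned graph to new embeddings" can look like: a row-normalized token-to-token affinity matrix is used to propagate whatever embeddings the downstream task provides, and the propagated features are blended with the originals. The propagation rule, the blending coefficient, and the random stand-in graph below are illustrative assumptions, not GLoMo's exact transfer procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d = 6, 50                        # sequence length, embedding size
raw_affinity = rng.random((T, T))   # stand-in for a graph produced upstream
G = raw_affinity / raw_affinity.sum(axis=1, keepdims=True)  # row-normalize

E = rng.standard_normal((T, d))     # any downstream embeddings (e.g. GloVe)

alpha = 0.5                         # illustrative mixing coefficient
E_aug = alpha * (G @ E) + (1 - alpha) * E   # graph-propagated + original
print(E_aug.shape)                  # (6, 50)
```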

Neural Cross-Lingual Named Entity Recognition with Minimal Resources

1 code implementation EMNLP 2018 Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, Jaime Carbonell

To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order.

named-entity-recognition Named Entity Recognition +2

GLoMo: Unsupervised Learning of Transferable Relational Graphs

no code implementations NeurIPS 2018 Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann Lecun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

35 code implementations ACL 2019 Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov

Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling.

Language Modelling
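
The key mechanism, segment-level recurrence, amounts to caching the previous segment's hidden states and letting the current segment attend to them without back-propagating into the cache. The single-head layer below is a simplified sketch that omits Transformer-XL's relative positional encodings and causal masking; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentAttention(nn.Module):
    """Single-head attention over [cached memory ; current segment]."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x, memory=None):   # x: (batch, seg_len, dim)
        # Keys/values span the cached previous segment plus the current one;
        # the cache is detached, so gradients stop at the segment boundary.
        ctx = x if memory is None else torch.cat([memory.detach(), x], dim=1)
        att = F.softmax(self.q(x) @ self.k(ctx).transpose(1, 2)
                        / x.size(-1) ** 0.5, dim=-1)
        out = att @ self.v(ctx)
        new_memory = x                    # cache for the next segment
        return out, new_memory

layer = RecurrentAttention(dim=64)
mem = None
for segment in torch.randn(3, 2, 16, 64):  # 3 consecutive segments of length 16
    out, mem = layer(segment, mem)
print(out.shape)                            # (2, 16, 64)
```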

XLNet: Generalized Autoregressive Pretraining for Language Understanding

23 code implementations NeurIPS 2019 Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

Audio Question Answering Chinese Reading Comprehension +9
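
For reference, the permutation language-modeling objective that XLNet builds on keeps an autoregressive factorization but averages it over factorization orders, so each position sees bidirectional context in expectation. Roughly, with $\mathcal{Z}_T$ the set of permutations of a length-$T$ sequence:

```latex
\max_{\theta} \;
\mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
\left[ \sum_{t=1}^{T} \log p_{\theta}\!\left( x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}} \right) \right]
```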

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

9 code implementations ACL 2022 Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang

On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25x parameters of BERT Large, demonstrating its generalizability to different downstream tasks.

Ranked #4 on Language Modelling on WikiText-103 (using extra training data)

Abstractive Text Summarization Classification +4
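
To give a feel for what "autoregressive blank infilling" means as a data transformation (in simplified form): selected spans of the input are each collapsed to a [MASK] token, and the model then generates the missing spans autoregressively, conditioned on the corrupted text. The span boundaries, sentinel names, and the omission of span shuffling and 2D position ids below are all simplifications for illustration.

```python
tokens = ["the", "general", "language", "model", "handles", "nlu", "and", "generation"]
spans = [(1, 3), (5, 6)]              # made-up (start, end) spans to blank out

# Part A: the corrupted input, with each chosen span collapsed to one [MASK].
part_a, removed, cursor = [], [], 0
for start, end in spans:
    part_a += tokens[cursor:start] + ["[MASK]"]
    removed.append(tokens[start:end])
    cursor = end
part_a += tokens[cursor:]

# Part B: the generation targets, one span after another, each introduced
# by a start-of-span sentinel.
part_b = []
for span in removed:
    part_b += ["[S]"] + span

print(part_a)  # ['the', '[MASK]', 'model', 'handles', '[MASK]', 'and', 'generation']
print(part_b)  # ['[S]', 'general', 'language', '[S]', 'nlu']
```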

GPT Understands, Too

7 code implementations 18 Mar 2021 Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang

Prompting a pretrained language model with natural language patterns has proven effective for natural language understanding (NLU).

Knowledge Probing Language Modelling +2

FastMoE: A Fast Mixture-of-Expert Training System

3 code implementations 24 Mar 2021 Jiaao He, Jiezhong Qiu, Aohan Zeng, Zhilin Yang, Jidong Zhai, Jie Tang

However, training trillion-scale MoE requires algorithm and system co-design for a well-tuned, high-performance distributed training system.

Language Modelling
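
Independent of FastMoE's distributed-training contributions, the layer it accelerates looks roughly like this: a gate routes each token to a small subset of expert feed-forward networks and only the selected experts run. The top-1 routing, layer sizes, and lack of load balancing in this sketch are simplifications and not FastMoE's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Minimal mixture-of-experts feed-forward layer with top-1 routing."""
    def __init__(self, dim: int, num_experts: int = 4, hidden: int = 128):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x):                       # x: (tokens, dim)
        scores = F.softmax(self.gate(x), dim=-1)
        top_score, top_idx = scores.max(dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e                 # tokens routed to expert e
            if mask.any():
                out[mask] = top_score[mask].unsqueeze(-1) * expert(x[mask])
        return out

moe = Top1MoE(dim=64)
y = moe(torch.randn(32, 64))                    # 32 tokens
print(y.shape)                                  # (32, 64)
```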

VeniBot: Towards Autonomous Venipuncture with Automatic Puncture Area and Angle Regression from NIR Images

no code implementations 27 May 2021 Xu Cao, Zijie Chen, Bolin Lai, Yuxuan Wang, Yu Chen, Zhengqing Cao, Zhilin Yang, Nanyang Ye, Junbo Zhao, Xiao-Yun Zhou, Peng Qi

For the automation, we focus on the positioning part and propose a Dual-In-Dual-Out network based on two-step learning and two-task learning, which can achieve fully automatic regression of the suitable puncture area and angle from near-infrared (NIR) images.

Navigate regression

Distribution Matching for Rationalization

1 code implementation 1 Jun 2021 Yongfeng Huang, Yujun Chen, Yulun Du, Zhilin Yang

The task of rationalization aims to extract pieces of input text as rationales to justify neural network predictions on text classification tasks.

text-classification Text Classification

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

4 code implementations 14 Oct 2021 Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling
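
Mechanically, prompt tuning of this kind comes down to freezing the pretrained model and training only a small matrix of continuous "virtual token" embeddings prepended to the input; P-Tuning v2 additionally injects prompts at every layer, which this sketch omits. The stand-in backbone and all sizes below are assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

dim, prompt_len, batch = 64, 8, 4

# Stand-in for a frozen pretrained backbone (in practice a pretrained LM).
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
for p in backbone.parameters():
    p.requires_grad = False                 # the backbone stays frozen

# The only trainable parameters: continuous prompt embeddings.
prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
optimizer = torch.optim.Adam([prompt], lr=1e-3)

token_embeddings = torch.randn(batch, 16, dim)   # already-embedded input tokens
prompted = torch.cat(
    [prompt.unsqueeze(0).expand(batch, -1, -1), token_embeddings], dim=1
)                                           # (batch, prompt_len + 16, dim)
hidden = backbone(prompted)

loss = hidden.mean()                        # toy objective
loss.backward()                             # gradients reach only the prompt
optimizer.step()
print(prompt.grad.shape)                    # (8, 64)
```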

NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework

1 code implementation 7 Nov 2021 Xingcheng Yao, Yanan Zheng, Xiaocong Yang, Zhilin Yang

Pretrained language models have become the standard approach for many NLP tasks due to strong performance, but they are very expensive to train.

Language Modelling

ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization

no code implementations 18 Jan 2022 Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, Zhilin Yang

We propose a multitask pretraining approach ZeroPrompt for zero-shot generalization, focusing on task scaling and zero-shot prompting.

Zero-shot Generalization Zero-Shot Learning

GPS: Genetic Prompt Search for Efficient Few-shot Learning

1 code implementation 31 Oct 2022 Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, Zhilin Yang

Prompt-based techniques have demonstrated great potential for improving the few-shot generalization of pretrained language models.

Few-Shot Learning

Prompt-Based Metric Learning for Few-Shot NER

1 code implementation 8 Nov 2022 Yanru Chen, Yanan Zheng, Zhilin Yang

Few-shot named entity recognition (NER) targets generalizing to unseen labels and/or domains with few labeled examples.

few-shot-ner Few-shot NER +4

Zero-Label Prompt Selection

2 code implementations 9 Nov 2022 Chonghua Liao, Yanan Zheng, Zhilin Yang

Natural language prompts have been shown to facilitate cross-task generalization for large language models.

A Universal Discriminator for Zero-Shot Generalization

1 code implementation 15 Nov 2022 Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, Zhilin Yang

In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 parameters of previous methods.

Zero-shot Generalization

Learning to Detect Noisy Labels Using Model-Based Features

1 code implementation 28 Dec 2022 Zhihao Wang, Zongyu Lin, Peiqi Liu, Guidong Zheng, Junjie Wen, Xianxin Chen, Yujun Chen, Zhilin Yang

Label noise is ubiquitous in various machine learning scenarios such as self-labeling with model predictions and erroneous data annotation.

Meta-Learning speech-recognition +3

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X

2 code implementations 30 Mar 2023 Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, Lei Shen, Andi Wang, Yang Li, Teng Su, Zhilin Yang, Jie Tang

Large pre-trained code generation models, such as OpenAI Codex, can generate syntactically and functionally correct code, making programmers more productive and bringing the pursuit of artificial general intelligence closer.

Code Generation

P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks

no code implementations ACL 2022 Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling
