Search Results for author: Zhilin Yang

Found 40 papers, 27 papers with code

Multi-Modal Bayesian Embeddings for Learning Social Knowledge Graphs

no code implementations 4 Aug 2015 Zhilin Yang, Jie Tang, William Cohen

GenVector leverages large-scale unlabeled data with embeddings and represents data of two modalities, i.e., social network users and knowledge concepts, in a shared latent topic space.

Knowledge Graphs

Words or Characters? Fine-grained Gating for Reading Comprehension

1 code implementation 6 Nov 2016 Zhilin Yang, Bhuwan Dhingra, Ye Yuan, Junjie Hu, William W. Cohen, Ruslan Salakhutdinov

Previous work combines word-level and character-level representations using concatenation or scalar weighting, which is suboptimal for high-level tasks like reading comprehension.

Question Answering Reading Comprehension +1
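
For intuition, here is a minimal sketch of a fine-grained (element-wise) gate that mixes word-level and character-level representations instead of concatenating them or using one scalar weight. The PyTorch framing, the tensor shapes, and the idea of conditioning the gate on per-token features are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class FineGrainedGate(nn.Module):
    """Element-wise gate in [0, 1]^d that mixes word- and char-level vectors."""
    def __init__(self, dim: int, feat_dim: int):
        super().__init__()
        # The gate is computed from a hypothetical per-token feature vector.
        self.gate = nn.Linear(feat_dim, dim)

    def forward(self, word_emb, char_emb, token_feats):
        g = torch.sigmoid(self.gate(token_feats))   # (batch, seq, dim)
        return g * char_emb + (1.0 - g) * word_emb  # fine-grained mixture

# Toy usage with random tensors
gate = FineGrainedGate(dim=100, feat_dim=20)
w = torch.randn(2, 7, 100)  # word-level embeddings
c = torch.randn(2, 7, 100)  # character-level embeddings (e.g. from a char-RNN)
f = torch.randn(2, 7, 20)   # per-token features that condition the gate
out = gate(w, c, f)         # (2, 7, 100)
```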

Semi-Supervised QA with Generative Domain-Adaptive Nets

no code implementations ACL 2017 Zhilin Yang, Junjie Hu, Ruslan Salakhutdinov, William W. Cohen

In this framework, we train a generative model to generate questions based on the unlabeled text, and combine model-generated questions with human-generated questions for training question answering models.

Domain Adaptation Question Answering +2

A Probabilistic Framework for Location Inference from Social Media

no code implementations 23 Feb 2017 Yujie Qian, Jie Tang, Zhilin Yang, Binxuan Huang, Wei Wei, Kathleen M. Carley

In this paper, we formalize the problem of inferring location from social media into a semi-supervised factor graph model (SSFGM).

Management

Differentiable Learning of Logical Rules for Knowledge Base Reasoning

2 code implementations NeurIPS 2017 Fan Yang, Zhilin Yang, William W. Cohen

We propose a framework, Neural Logic Programming, that combines the parameter and structure learning of first-order logical rules in an end-to-end differentiable model.
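
As a rough illustration of the differentiable-rule idea (not the paper's exact model): a rule body such as r1(x, z) ∧ r2(z, y) can be scored by multiplying relation adjacency matrices, with softmax-normalized attention over relations making the choice of rule learnable by gradient descent. The toy graph and shapes below are assumptions for illustration only.

```python
import numpy as np

# Toy knowledge graph: 4 entities, 2 relations; M[r, i, j] = 1 iff r(entity_i, entity_j).
M = np.array([
    [[0, 1, 0, 0],   # relation 0
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]],
    [[0, 0, 0, 0],   # relation 1
     [0, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 0]],
], dtype=float)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Learnable logits: one attention vector over relations per step of the rule body.
attn_logits = np.random.randn(2, M.shape[0])  # a rule body of length 2

# Differentiable "rule application": start from a one-hot query entity and
# repeatedly propagate through the attention-weighted mixture of relations.
v = np.zeros(4)
v[0] = 1.0                                    # query entity: entity_0
for step_logits in attn_logits:
    mixed = np.tensordot(softmax(step_logits), M, axes=1)  # (4, 4)
    v = v @ mixed

print(v)  # soft scores over candidate answer entities
```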

Linguistic Knowledge as Memory for Recurrent Neural Networks

no code implementations 7 Mar 2017 Bhuwan Dhingra, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov

We introduce a model that encodes such graphs as explicit memory in recurrent neural networks, and use it to model coreference relations in text.

LAMBADA

Good Semi-supervised Learning that Requires a Bad GAN

1 code implementation NeurIPS 2017 Zihang Dai, Zhilin Yang, Fan Yang, William W. Cohen, Ruslan Salakhutdinov

Semi-supervised learning methods based on generative adversarial networks (GANs) obtained strong empirical results, but it is not clear 1) how the discriminator benefits from joint training with a generator, and 2) why good semi-supervised classification performance and a good generator cannot be obtained at the same time.

General Classification Semi-Supervised Image Classification

Breaking the Softmax Bottleneck: A High-Rank RNN Language Model

9 code implementations ICLR 2018 Zhilin Yang, Zihang Dai, Ruslan Salakhutdinov, William W. Cohen

We formulate language modeling as a matrix factorization problem, and show that the expressiveness of Softmax-based models (including the majority of neural language models) is limited by a Softmax bottleneck.

Language Modelling Vocal Bursts Intensity Prediction +1
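
The remedy proposed in the paper is a Mixture of Softmaxes (MoS): instead of a single softmax over one projection of the context, the output distribution is a convex mixture of K softmaxes computed from K different projections, which lifts the rank limit. The sketch below shows the general shape of such a layer; the layer sizes, nonlinearity placement, and all names are simplifying assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfSoftmaxes(nn.Module):
    """Output layer producing a convex mixture of K softmax distributions."""
    def __init__(self, hidden: int, vocab: int, k: int = 3):
        super().__init__()
        self.k = k
        self.prior = nn.Linear(hidden, k)          # mixture weights pi
        self.proj = nn.Linear(hidden, k * hidden)  # K context projections
        self.decoder = nn.Linear(hidden, vocab)    # shared output embeddings

    def forward(self, h):                             # h: (batch, hidden)
        pi = F.softmax(self.prior(h), dim=-1)         # (batch, K)
        ctx = torch.tanh(self.proj(h)).view(-1, self.k, h.size(-1))
        probs = F.softmax(self.decoder(ctx), dim=-1)  # (batch, K, vocab)
        return (pi.unsqueeze(-1) * probs).sum(dim=1)  # (batch, vocab)

mos = MixtureOfSoftmaxes(hidden=32, vocab=1000, k=3)
p = mos(torch.randn(4, 32))
print(p.shape, p.sum(dim=-1))  # each row sums to ~1
```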

Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent

no code implementations ICLR 2018 Zhilin Yang, Saizheng Zhang, Jack Urbanek, Will Feng, Alexander H. Miller, Arthur Szlam, Douwe Kiela, Jason Weston

Contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment.

Grounded language learning

Neural Models for Reasoning over Multiple Mentions using Coreference

no code implementations NAACL 2018 Bhuwan Dhingra, Qiao Jin, Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov

Many problems in NLP require aggregating information from multiple mentions of the same entity which may be far apart in the text.

LAMBADA

GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations

1 code implementation 14 Jun 2018 Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann Lecun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4
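
A very small sketch of what "transferring a learned graph to new embeddings" can look like: a row-normalized token-to-token affinity matrix is used to propagate whatever embeddings the downstream task provides, and the propagated features are blended with the originals. The propagation rule, the blending coefficient, and the random stand-in graph below are illustrative assumptions, not GLoMo's exact transfer procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d = 6, 50                        # sequence length, embedding size
raw_affinity = rng.random((T, T))   # stand-in for a graph produced upstream
G = raw_affinity / raw_affinity.sum(axis=1, keepdims=True)  # row-normalize

E = rng.standard_normal((T, d))     # any downstream embeddings (e.g. GloVe)

alpha = 0.5                         # illustrative mixing coefficient
E_aug = alpha * (G @ E) + (1 - alpha) * E   # graph-propagated + original
print(E_aug.shape)                  # (6, 50)
```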

Neural Cross-Lingual Named Entity Recognition with Minimal Resources

1 code implementation EMNLP 2018 Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, Jaime Carbonell

To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order.

named-entity-recognition Named Entity Recognition +2

GLoMo: Unsupervised Learning of Transferable Relational Graphs

no code implementations NeurIPS 2018 Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann Lecun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

35 code implementations ACL 2019 Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov

Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling.

Language Modelling
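
The key mechanism, segment-level recurrence, amounts to caching the previous segment's hidden states and letting the current segment attend to them without back-propagating into the cache. The single-head layer below is a simplified sketch that omits Transformer-XL's relative positional encodings and causal masking; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentAttention(nn.Module):
    """Single-head attention over [cached memory ; current segment]."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x, memory=None):   # x: (batch, seg_len, dim)
        # Keys/values span the cached previous segment plus the current one;
        # the cache is detached, so gradients stop at the segment boundary.
        ctx = x if memory is None else torch.cat([memory.detach(), x], dim=1)
        att = F.softmax(self.q(x) @ self.k(ctx).transpose(1, 2)
                        / x.size(-1) ** 0.5, dim=-1)
        out = att @ self.v(ctx)
        new_memory = x                    # cache for the next segment
        return out, new_memory

layer = RecurrentAttention(dim=64)
mem = None
for segment in torch.randn(3, 2, 16, 64):  # 3 consecutive segments of length 16
    out, mem = layer(segment, mem)
print(out.shape)                            # (2, 16, 64)
```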

XLNet: Generalized Autoregressive Pretraining for Language Understanding

23 code implementations NeurIPS 2019 Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

Audio Question Answering Chinese Reading Comprehension +9
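
For reference, the permutation language-modeling objective that XLNet builds on keeps an autoregressive factorization but averages it over factorization orders, so each position sees bidirectional context in expectation. Roughly, with $\mathcal{Z}_T$ the set of permutations of a length-$T$ sequence:

```latex
\max_{\theta} \;
\mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
\left[ \sum_{t=1}^{T} \log p_{\theta}\!\left( x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}} \right) \right]
```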

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

9 code implementations ACL 2022 Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang

On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25x parameters of BERT Large, demonstrating its generalizability to different downstream tasks.

Ranked #4 on Language Modelling on WikiText-103 (using extra training data)

Abstractive Text Summarization Classification +4
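
To give a feel for what "autoregressive blank infilling" means as a data transformation (in simplified form): selected spans of the input are each collapsed to a [MASK] token, and the model then generates the missing spans autoregressively, conditioned on the corrupted text. The span boundaries, sentinel names, and the omission of span shuffling and 2D position ids below are all simplifications for illustration.

```python
tokens = ["the", "general", "language", "model", "handles", "nlu", "and", "generation"]
spans = [(1, 3), (5, 6)]              # made-up (start, end) spans to blank out

# Part A: the corrupted input, with each chosen span collapsed to one [MASK].
part_a, removed, cursor = [], [], 0
for start, end in spans:
    part_a += tokens[cursor:start] + ["[MASK]"]
    removed.append(tokens[start:end])
    cursor = end
part_a += tokens[cursor:]

# Part B: the generation targets, one span after another, each introduced
# by a start-of-span sentinel.
part_b = []
for span in removed:
    part_b += ["[S]"] + span

print(part_a)  # ['the', '[MASK]', 'model', 'handles', '[MASK]', 'and', 'generation']
print(part_b)  # ['[S]', 'general', 'language', '[S]', 'nlu']
```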

GPT Understands, Too

7 code implementations 18 Mar 2021 Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang

Prompting a pretrained language model with natural language patterns has proven effective for natural language understanding (NLU).

Knowledge Probing Language Modelling +2

FastMoE: A Fast Mixture-of-Expert Training System

3 code implementations 24 Mar 2021 Jiaao He, Jiezhong Qiu, Aohan Zeng, Zhilin Yang, Jidong Zhai, Jie Tang

However, training trillion-scale MoE requires algorithm and system co-design for a well-tuned, high-performance distributed training system.

Language Modelling
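
Independent of FastMoE's distributed-training contributions, the layer it accelerates looks roughly like this: a gate routes each token to a small subset of expert feed-forward networks and only the selected experts run. The top-1 routing, layer sizes, and lack of load balancing in this sketch are simplifications and not FastMoE's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Minimal mixture-of-experts feed-forward layer with top-1 routing."""
    def __init__(self, dim: int, num_experts: int = 4, hidden: int = 128):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x):                       # x: (tokens, dim)
        scores = F.softmax(self.gate(x), dim=-1)
        top_score, top_idx = scores.max(dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e                 # tokens routed to expert e
            if mask.any():
                out[mask] = top_score[mask].unsqueeze(-1) * expert(x[mask])
        return out

moe = Top1MoE(dim=64)
y = moe(torch.randn(32, 64))                    # 32 tokens
print(y.shape)                                  # (32, 64)
```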

VeniBot: Towards Autonomous Venipuncture with Automatic Puncture Area and Angle Regression from NIR Images

no code implementations 27 May 2021 Xu Cao, Zijie Chen, Bolin Lai, Yuxuan Wang, Yu Chen, Zhengqing Cao, Zhilin Yang, Nanyang Ye, Junbo Zhao, Xiao-Yun Zhou, Peng Qi

For the automation, we focus on the positioning part and propose a Dual-In-Dual-Out network based on two-step learning and two-task learning, which can achieve fully automatic regression of the suitable puncture area and angle from near-infrared (NIR) images.

Navigate regression

Distribution Matching for Rationalization

1 code implementation 1 Jun 2021 Yongfeng Huang, Yujun Chen, Yulun Du, Zhilin Yang

The task of rationalization aims to extract pieces of input text as rationales to justify neural network predictions on text classification tasks.

text-classification Text Classification

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

4 code implementations 14 Oct 2021 Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling
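
Mechanically, prompt tuning of this kind comes down to freezing the pretrained model and training only a small matrix of continuous "virtual token" embeddings prepended to the input; P-Tuning v2 additionally injects prompts at every layer, which this sketch omits. The stand-in backbone and all sizes below are assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

dim, prompt_len, batch = 64, 8, 4

# Stand-in for a frozen pretrained backbone (in practice a pretrained LM).
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
for p in backbone.parameters():
    p.requires_grad = False                 # the backbone stays frozen

# The only trainable parameters: continuous prompt embeddings.
prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
optimizer = torch.optim.Adam([prompt], lr=1e-3)

token_embeddings = torch.randn(batch, 16, dim)   # already-embedded input tokens
prompted = torch.cat(
    [prompt.unsqueeze(0).expand(batch, -1, -1), token_embeddings], dim=1
)                                           # (batch, prompt_len + 16, dim)
hidden = backbone(prompted)

loss = hidden.mean()                        # toy objective
loss.backward()                             # gradients reach only the prompt
optimizer.step()
print(prompt.grad.shape)                    # (8, 64)
```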

NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework

1 code implementation 7 Nov 2021 Xingcheng Yao, Yanan Zheng, Xiaocong Yang, Zhilin Yang

Pretrained language models have become the standard approach for many NLP tasks due to strong performance, but they are very expensive to train.

Language Modelling

ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization

no code implementations 18 Jan 2022 Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, Zhilin Yang

We propose a multitask pretraining approach ZeroPrompt for zero-shot generalization, focusing on task scaling and zero-shot prompting.

Zero-shot Generalization Zero-Shot Learning

GPS: Genetic Prompt Search for Efficient Few-shot Learning

1 code implementation 31 Oct 2022 Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, Zhilin Yang

Prompt-based techniques have demonstrated great potential for improving the few-shot generalization of pretrained language models.

Few-Shot Learning

Prompt-Based Metric Learning for Few-Shot NER

1 code implementation 8 Nov 2022 Yanru Chen, Yanan Zheng, Zhilin Yang

Few-shot named entity recognition (NER) targets generalizing to unseen labels and/or domains with few labeled examples.

few-shot-ner Few-shot NER +4

Zero-Label Prompt Selection

2 code implementations 9 Nov 2022 Chonghua Liao, Yanan Zheng, Zhilin Yang

Natural language prompts have been shown to facilitate cross-task generalization for large language models.

A Universal Discriminator for Zero-Shot Generalization

1 code implementation 15 Nov 2022 Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, Zhilin Yang

In the finetuning setting, our approach also achieves new state-of-the-art results on a wide range of NLP tasks, with only 1/4 parameters of previous methods.

Zero-shot Generalization

Learning to Detect Noisy Labels Using Model-Based Features

1 code implementation 28 Dec 2022 Zhihao Wang, Zongyu Lin, Peiqi Liu, Guidong Zheng, Junjie Wen, Xianxin Chen, Yujun Chen, Zhilin Yang

Label noise is ubiquitous in various machine learning scenarios such as self-labeling with model predictions and erroneous data annotation.

Meta-Learning speech-recognition +3

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X

2 code implementations 30 Mar 2023 Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, Lei Shen, Andi Wang, Yang Li, Teng Su, Zhilin Yang, Jie Tang

Large pre-trained code generation models, such as OpenAI Codex, can generate syntactically and functionally correct code, making programmers more productive and bringing the pursuit of artificial general intelligence closer.

Code Generation

P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks

no code implementations ACL 2022 Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling
