Search Results for author: Lingpeng Kong

Found 102 papers, 66 papers with code

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

1 code implementation 21 Mar 2024 Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, XiaoLi Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains.

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

1 code implementation 5 Mar 2024 Xijia Tao, Shuai Zhong, Lei LI, Qi Liu, Lingpeng Kong

In this paper, we propose a novel jailbreaking attack against VLMs, aiming to bypass their safety barrier when a user inputs harmful instructions.

GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers

1 code implementation 29 Feb 2024 Qintong Li, Leyang Cui, Xueliang Zhao, Lingpeng Kong, Wei Bi

Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.

GSM8K Math +1

Training-Free Long-Context Scaling of Large Language Models

1 code implementation 27 Feb 2024 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.

LoRA Meets Dropout under a Unified Framework

no code implementations 25 Feb 2024 Sheng Wang, Liheng Chen, Jiyue Jiang, Boyang Xue, Lingpeng Kong, Chuan Wu

Hence, a possible contradiction arises between LoRA's negligible number of trainable parameters and the effectiveness of previous dropout methods, an issue that has been largely overlooked.

PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA

no code implementations 24 Feb 2024 Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu

We hope that this conspicuously higher parameter efficiency can establish PRoLoRA as a resource-friendly alternative to LoRA.

Empowering Large Language Model Agents through Action Learning

1 code implementation 24 Feb 2024 Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

Large Language Model (LLM) Agents have recently garnered increasing interest yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior.

Language Modelling Large Language Model

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

1 code implementation 12 Feb 2024 Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong

Autonomous interaction with the computer has been a longstanding challenge with great potential, and the recent proliferation of large language models (LLMs) has markedly accelerated progress in building digital agents.

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

1 code implementation 12 Feb 2024 Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Zhenguo Li, Wei Bi, Lingpeng Kong

This work explores the integration of diffusion models and Chain-of-Thought (CoT), a well-established technique to improve the reasoning ability in autoregressive language models.

Math

AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents

2 code implementations 24 Jan 2024 Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He

Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications.

Benchmarking

G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model

1 code implementation 18 Dec 2023 Jiahui Gao, Renjie Pi, Jipeng Zhang, Jiacheng Ye, Wanjun Zhong, YuFei Wang, Lanqing Hong, Jianhua Han, Hang Xu, Zhenguo Li, Lingpeng Kong

We first analyze the limitations of current Multimodal Large Language Models (MLLMs) in this area: they struggle to accurately comprehend basic geometric elements and their relationships.

Language Modelling Large Language Model

Linear Attention via Orthogonal Memory

no code implementations 18 Dec 2023 Jun Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

Given that orthogonal memory compresses global information, we further dissect the context to amplify fine-grained local information.

Causal Language Modeling Computational Efficiency +1

Silkie: Preference Distillation for Large Visual Language Models

no code implementations 17 Dec 2023 Lei LI, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong

This paper explores preference distillation for large vision language models (LVLMs), improving their ability to generate helpful and faithful responses anchored in the visual context.

Hallucination Visual Question Answering

Self-Infilling Code Generation

1 code implementation 29 Nov 2023 Lin Zheng, Jianbo Yuan, Zhi Zhang, Hongxia Yang, Lingpeng Kong

This work introduces self-infilling code generation, a general framework that incorporates infilling operations into auto-regressive decoding.

Code Generation

Collaborative Evaluation: Exploring the Synergy of Large Language Models and Humans for Open-ended Generation Evaluation

1 code implementation 30 Oct 2023 Qintong Li, Leyang Cui, Lingpeng Kong, Wei Bi

To explore the synergy between humans and LLM-based evaluators, and to address the challenge of inconsistent evaluation criteria in open-ended NLG tasks, we propose CoEval, a collaborative evaluation pipeline that involves designing a checklist of task-specific criteria and performing detailed evaluation of texts, in which the LLM generates an initial ideation and humans then engage in scrutiny.

Text Generation

SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving

no code implementations 19 Oct 2023 Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong

Large Language Models (LLMs) have driven substantial progress in artificial intelligence in recent years, exhibiting impressive capabilities across a wide range of tasks, including mathematical problem-solving.

GSM8K Math

Attentive Multi-Layer Perceptron for Non-autoregressive Generation

1 code implementation 14 Oct 2023 Shuyang Jiang, Jun Zhang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

Furthermore, we marry AMLP with popular NAR models, deriving a highly efficient NAR-AMLP architecture with linear time and space complexity.

Machine Translation Speech Synthesis +1

Lemur: Harmonizing Natural Language and Code for Language Agents

1 code implementation 10 Oct 2023 Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents.

Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration

1 code implementation 30 Sep 2023 Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong

Large Language Models (LLMs) are evolving at an unprecedented pace and have exhibited considerable capability in the realm of natural language processing (NLP) with world knowledge.

World Knowledge

Extrapolating Large Language Models to Non-English by Aligning Languages

2 code implementations 9 Aug 2023 Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI

We start from targeting individual languages by performing cross-lingual instruction-tuning (CoIT) on LLaMA, i.e., tuning it with translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMAs), and formulate underlying scaling laws to investigate the advantages of using scalable translation data.

Translation

L-Eval: Instituting Standardized Evaluation for Long Context Language Models

3 code implementations 20 Jul 2023 Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu

Recently, there has been growing interest in extending the context length of large language models (LLMs), aiming to effectively process long inputs of one turn or conversations with more extensive histories.

Instruction Following

Linearized Relative Positional Encoding

no code implementations 18 Jul 2023 Zhen Qin, Weixuan Sun, Kaiyue Lu, Hui Deng, Dongxu Li, Xiaodong Han, Yuchao Dai, Lingpeng Kong, Yiran Zhong

Meanwhile, it suggests a general paradigm for designing a broader range of relative positional encoding methods that are applicable to linear transformers.

Image Classification Language Modelling +2

Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability

1 code implementation 11 Jun 2023 Jiacheng Ye, Xijia Tao, Lingpeng Kong

First, does multilingual transfer ability exist in English-centric models and how does it compare with multilingual pretrained models?

INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation

1 code implementation 10 Jun 2023 Wenhao Zhu, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen

We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.

Machine Translation Translation

M$^3$IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

no code implementations 7 Jun 2023 Lei LI, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu sun, Lingpeng Kong, Qi Liu

To tackle this challenge and promote research in the vision-language field, we introduce the Multi-Modal, Multilingual Instruction Tuning (M$^3$IT) dataset, designed to optimize VLM alignment with human instructions.

World Knowledge

Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving

1 code implementation 25 May 2023 Xueliang Zhao, Wenda Li, Lingpeng Kong

Large language models (LLMs) present an intriguing avenue of exploration in the domain of formal theorem proving.

Ranked #3 on Automated Theorem Proving on miniF2F-test (Pass@100 metric)

Automated Theorem Proving

Optimizing Non-Autoregressive Transformers with Contrastive Learning

no code implementations 23 May 2023 Chenxin An, Jiangtao Feng, Fei Huang, Xipeng Qiu, Lingpeng Kong

In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.

Contrastive Learning Machine Translation +2

Can Language Models Understand Physical Concepts?

1 code implementation 23 May 2023 Lei LI, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu sun

Language models (LMs) are gradually becoming general-purpose interfaces in the interactive and embodied world, where the understanding of physical concepts is an essential prerequisite.

DetGPT: Detect What You Need via Reasoning

1 code implementation 23 May 2023 Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang

Overall, our proposed paradigm and DetGPT demonstrate the potential for more sophisticated and intuitive interactions between humans and machines.

Autonomous Driving Object +2

Generating Data for Symbolic Language with Large Language Models

1 code implementation 23 May 2023 Jiacheng Ye, Chengzu Li, Lingpeng Kong, Tao Yu

However, such an approach has primarily been applied to natural language tasks and has not yet been explored for symbolic language tasks with complex structured outputs (e.g., semantic parsing and code generation).

Code Generation Semantic Parsing

A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment

no code implementations 14 May 2023 Jiyue Jiang, Sheng Wang, Qintong Li, Lingpeng Kong, Chuan Wu

In this paper, we propose a multi-source knowledge fusion method for CS dialogue (CSD), to generate open-ended responses guided by the CS principle and emotional support strategy.

Toeplitz Neural Network for Sequence Modeling

2 code implementations 8 May 2023 Zhen Qin, Xiaodong Han, Weixuan Sun, Bowen He, Dong Li, Dongxu Li, Yuchao Dai, Lingpeng Kong, Yiran Zhong

Sequence modeling has important applications in natural language processing and computer vision.

Language Modelling Position

A Challenging Benchmark for Low-Resource Learning

1 code implementation 7 Mar 2023 Yudong Wang, Chang Ma, Qingxiu Dong, Lingpeng Kong, Jingjing Xu

Experiments on a wide range of models show that neural networks, even pre-trained language models, suffer sharp performance drops on our benchmark, demonstrating its effectiveness in exposing the weaknesses of neural networks.

Retrieved Sequence Augmentation for Protein Representation Learning

1 code implementation 24 Feb 2023 Chang Ma, Haiteng Zhao, Lin Zheng, Jiayi Xin, Qintong Li, Lijun Wu, Zhihong Deng, Yang Lu, Qi Liu, Lingpeng Kong

RSA links query protein sequences to a set of sequences with similar structures or properties in the database and combines these sequences for downstream prediction.

Property Prediction Representation Learning +1

A Reparameterized Discrete Diffusion Model for Text Generation

1 code implementation 11 Feb 2023 Lin Zheng, Jianbo Yuan, Lei Yu, Lingpeng Kong

This work studies discrete diffusion probabilistic models with applications to natural language generation.

Text Generation

Compositional Exemplars for In-context Learning

1 code implementation 11 Feb 2023 Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong

The performance of ICL is largely determined by the quality of the selected in-context examples.

Code Generation Contrastive Learning +6

Efficient Attention via Control Variates

1 code implementation 9 Feb 2023 Lin Zheng, Jianbo Yuan, Chong Wang, Lingpeng Kong

Built upon previous progress of RFA, we characterize this gap through the lens of control variates and show that RFA can be decomposed into a sum of multiple control variate estimators for each element in the sequence.
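For context, the control-variate device invoked here is, in its textbook form (generic background, not the paper's specific estimator): given an unbiased estimator $\hat{\mu}$ of $\mathbb{E}[f(x)]$ and a control $g(x)$ whose mean $\mathbb{E}[g]$ is known, the corrected estimator

$$ \hat{\mu}_{\mathrm{cv}} \;=\; \hat{\mu} \;-\; \beta\,\big(\hat{g} - \mathbb{E}[g]\big) $$

remains unbiased for any coefficient $\beta$ and has lower variance whenever $g$ is sufficiently correlated with $f$. The abstract's claim is that RFA can be read as a sum of such estimators, one per sequence element, which is how the gap to exact softmax attention is characterized.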

In-Context Learning with Many Demonstration Examples

1 code implementation 9 Feb 2023 Mukai Li, Shansan Gong, Jiangtao Feng, Yiheng Xu, Jun Zhang, Zhiyong Wu, Lingpeng Kong

Based on EVALM, we scale up the size of examples efficiently in both instruction tuning and in-context learning to explore the boundary of the benefits from more annotated data.

In-Context Learning Language Modelling

Audio-Visual Segmentation with Semantics

1 code implementation 30 Jan 2023 Jinxing Zhou, Xuyang Shen, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

To deal with these problems, we propose a new baseline method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.

Segmentation Semantic Segmentation +1

Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering

1 code implementation 20 Dec 2022 Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong

Despite the surprising few-shot performance of in-context learning (ICL), it is still a common practice to randomly sample examples to serve as context.

In-Context Learning

Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

1 code implementation 20 Dec 2022 Fei Yuan, Yinquan Lu, Wenhao Zhu, Lingpeng Kong, Lei LI, Yu Qiao, Jingjing Xu

To address the needs of learning representations for all languages in a unified space, we propose a novel efficient training recipe, upon which we build an effective detachable model, Lego-MT.

Machine Translation Translation

Explanation Regeneration via Information Bottleneck

1 code implementation 19 Dec 2022 Qintong Li, Zhiyong Wu, Lingpeng Kong, Wei Bi

Explaining the black-box predictions of NLP models naturally and accurately is an important open problem in natural language generation.

Explanation Generation Language Modelling +2

Unsupervised Explanation Generation via Correct Instantiations

no code implementations 21 Nov 2022 Sijie Cheng, Zhiyong Wu, Jiangjie Chen, Zhixing Li, Yang Liu, Lingpeng Kong

The major difficulty is finding the conflict point, where the statement contradicts the real world.

Explanation Generation

An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks

1 code implementation 24 Oct 2022 Changlong Yu, Tianyi Xiao, Lingpeng Kong, Yangqiu Song, Wilfred Ng

Though linguistic knowledge emerges during large-scale language model pretraining, recent work attempts to explicitly incorporate human-defined linguistic priors into task-specific fine-tuning.

Language Modelling

ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback

2 code implementations 22 Oct 2022 Jiacheng Ye, Jiahui Gao, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong

To improve the quality of dataset synthesis, we propose a progressive zero-shot dataset generation framework, ProGen, which leverages the feedback from the task-specific model to guide the generation of new training data via in-context examples.

Informativeness text-classification +2

The Devil in Linear Transformer

1 code implementation 19 Oct 2022 Zhen Qin, Xiaodong Han, Weixuan Sun, Dongxu Li, Lingpeng Kong, Nick Barnes, Yiran Zhong

In this paper, we examine existing kernel-based linear transformers and identify two key issues that lead to such performance gaps: 1) unbounded gradients in the attention computation adversely impact the convergence of linear transformer models; 2) attention dilution which trivially distributes attention scores over long sequences while neglecting neighbouring structures.

Language Modelling Text Classification

DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

1 code implementation 17 Oct 2022 Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong

Bringing together theoretical analysis and empirical evidence, we demonstrate the great potential of diffusion models in complex conditional language generation tasks.

Text Generation

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling

1 code implementation 14 Oct 2022 Jun Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

In this paper, we propose Comprehensive Attention Benchmark (CAB) under a fine-grained attention taxonomy with four distinguishable attention patterns, namely, noncausal self, causal self, noncausal cross, and causal cross attentions.

Benchmarking Long-range modeling

Audio-Visual Segmentation

1 code implementation 11 Jul 2022 Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.

Segmentation

Vicinity Vision Transformer

1 code implementation 21 Jun 2022 Weixuan Sun, Zhen Qin, Hui Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong

Based on this observation, we present a Vicinity Attention that introduces a locality bias to vision transformers with linear complexity.

Image Classification

CoNT: Contrastive Neural Text Generation

2 code implementations 29 May 2022 Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang

We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.

Code Comment Generation Comment Generation +4

Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning

2 code implementations 25 May 2022 Jiahui Gao, Renjie Pi, Yong Lin, Hang Xu, Jiacheng Ye, Zhiyong Wu, Weizhong Zhang, Xiaodan Liang, Zhenguo Li, Lingpeng Kong

In this paradigm, the synthesized data from the PLM acts as the carrier of knowledge, which is used to train a task-specific model with orders of magnitude fewer parameters than the PLM, achieving both higher performance and efficiency than prompt-based zero-shot learning methods on PLMs.

text-classification Text Classification +1

Language Models Can See: Plugging Visual Controls in Text Generation

1 code implementation 5 May 2022 Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lingpeng Kong, Nigel Collier

MAGIC is a flexible framework and is theoretically compatible with any text generation tasks that incorporate image grounding.

Image Captioning Image-text matching +3

Lexical Knowledge Internalization for Neural Dialog Generation

1 code implementation ACL 2022 Zhiyong Wu, Wei Bi, Xiang Li, Lingpeng Kong, Ben Kao

We propose knowledge internalization (KI), which aims to complement the lexical knowledge into neural dialog models.

Contrastive Learning

Event Transition Planning for Open-ended Text Generation

1 code implementation Findings (ACL) 2022 Qintong Li, Piji Li, Wei Bi, Zhaochun Ren, Yuxuan Lai, Lingpeng Kong

Open-ended text generation tasks, such as dialogue generation and story completion, require models to generate a coherent continuation given limited preceding context.

Dialogue Generation Story Completion

Linear Complexity Randomized Self-attention Mechanism

1 code implementation 10 Apr 2022 Lin Zheng, Chong Wang, Lingpeng Kong

By combining the expressiveness in RA and the efficiency in RFA, we develop a novel linear complexity self-attention mechanism called linear randomized attention (LARA).

cosFormer: Rethinking Softmax in Attention

3 code implementations ICLR 2022 Zhen Qin, Weixuan Sun, Hui Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong

As one of its core components, the softmax attention helps to capture long-range dependencies yet prohibits scaling up due to its quadratic space and time complexity with respect to the sequence length.
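To make the complexity claim concrete (standard background, stated generically rather than in the paper's notation), softmax attention computes

$$ \mathrm{Attn}(Q, K, V) \;=\; \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d}}\right) V, \qquad Q, K, V \in \mathbb{R}^{n \times d}, $$

which materializes an $n \times n$ score matrix and therefore needs $O(n^{2} d)$ time and $O(n^{2})$ memory in the sequence length $n$. Linear-attention methods such as cosFormer replace the softmax with a decomposable similarity $\mathrm{sim}(q, k) = \phi(q)^{\top}\phi(k)$, so the product can be reassociated as $\phi(Q)\,\big(\phi(K)^{\top} V\big)$ and computed in time linear in $n$.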

D4RL Language Modelling +1

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

3 code implementations 16 Feb 2022 Jiacheng Ye, Jiahui Gao, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong

There is a growing interest in dataset generation recently due to the superior generative capacity of large pre-trained language models (PLMs).
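The generate-then-train recipe described here can be illustrated with a minimal sketch (assumptions: gpt2 stands in for the large PLM, the prompts are invented for illustration, and a TF-IDF plus logistic-regression model stands in for the small task-specific model; none of these are the paper's exact choices):

# Minimal sketch of zero-shot dataset generation: a PLM synthesizes labeled
# examples from class-conditioned prompts, then a much smaller model is trained
# on the synthetic data only. Models and prompts here are illustrative stand-ins.
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

generator = pipeline("text-generation", model="gpt2")  # stand-in for a larger PLM

prompts = {
    "positive": "Write a positive movie review:",
    "negative": "Write a negative movie review:",
}

texts, labels = [], []
for label, prompt in prompts.items():
    outputs = generator(prompt, num_return_sequences=8, max_new_tokens=40, do_sample=True)
    for out in outputs:
        texts.append(out["generated_text"][len(prompt):].strip())  # drop the prompt prefix
        labels.append(label)

# The small task-specific model is trained on the synthetic corpus alone.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["An astonishing, heartfelt film."]))

The point of the design is that only the small model is deployed; the PLM is used once, offline, as the carrier of knowledge in the form of synthetic training data.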

Knowledge Distillation Natural Language Inference +5

A Contrastive Framework for Neural Text Generation

2 code implementations 13 Feb 2022 Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, Nigel Collier

Text generation is of great importance to many natural language processing applications.

Text Generation

SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples

1 code implementation 16 Jan 2022 Hao Wang, Yangguang Li, Zhen Huang, Yong Dou, Lingpeng Kong, Jing Shao

To alleviate feature suppression, we propose contrastive learning for unsupervised sentence embedding with soft negative samples (SNCSE).

Contrastive Learning Data Augmentation +7

Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling

1 code implementation NAACL 2022 Jakob Prange, Nathan Schneider, Lingpeng Kong

We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling.

Language Modelling

Ripple Attention for Visual Perception with Sub-quadratic Complexity

no code implementations 6 Oct 2021 Lin Zheng, Huijie Pan, Lingpeng Kong

Transformer architectures are now central to sequence modeling tasks.

Cascaded Head-colliding Attention

1 code implementation ACL 2021 Lin Zheng, Zhiyong Wu, Lingpeng Kong

Transformers have advanced the field of natural language processing (NLP) on a variety of important tasks.

Language Modelling Machine Translation +1

Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation

no code implementations ACL 2021 Zhiyong Wu, Lingpeng Kong, Wei Bi, Xiang Li, Ben Kao

A neural multimodal machine translation (MMT) system is one that aims to perform better translation by extending conventional text-only translation models with multimodal information.

Multimodal Machine Translation Translation

Random Feature Attention

no code implementations ICLR 2021 Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, Lingpeng Kong

RFA can be used as a drop-in replacement for conventional softmax attention and offers a straightforward way of learning with recency bias through an optional gating mechanism.
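The random-feature identity this line of work builds on can be stated generically (the paper's exact feature map and the gating mechanism mentioned above are not reproduced here). Using $\lVert q - k \rVert^{2} = \lVert q \rVert^{2} + \lVert k \rVert^{2} - 2 q^{\top} k$,

$$ \exp(q^{\top} k) \;=\; e^{\lVert q \rVert^{2}/2}\, e^{\lVert k \rVert^{2}/2}\, \exp\!\big(-\tfrac{1}{2}\lVert q - k \rVert^{2}\big) \;\approx\; e^{\lVert q \rVert^{2}/2}\, e^{\lVert k \rVert^{2}/2}\, \phi(q)^{\top} \phi(k), $$

where $\phi(x) = \sqrt{1/D}\,\big[\sin(w_{1}^{\top}x), \dots, \sin(w_{D}^{\top}x), \cos(w_{1}^{\top}x), \dots, \cos(w_{D}^{\top}x)\big]^{\top}$ with $w_{i} \sim \mathcal{N}(0, I)$ are random Fourier features (Rahimi & Recht, 2007). Because the approximate kernel factorizes, $\sum_{j} \big(\phi(q_{i})^{\top}\phi(k_{j})\big)\, v_{j}^{\top}$ can be reassociated as $\phi(q_{i})^{\top} \big(\sum_{j} \phi(k_{j})\, v_{j}^{\top}\big)$, giving time and memory linear in the sequence length.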

Language Modelling Machine Translation +3

Adaptive Semiparametric Language Models

no code implementations 4 Feb 2021 Dani Yogatama, Cyprien de Masson d'Autume, Lingpeng Kong

We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture.

Language Modelling

Good for Misconceived Reasons: Revisiting Neural Multimodal Machine Translation

no code implementations 1 Jan 2021 Zhiyong Wu, Lingpeng Kong, Ben Kao

A neural multimodal machine translation (MMT) system is one that aims to perform better translation by extending conventional text-only translation models with multimodal information.

Multimodal Machine Translation Translation

Syntactic Structure Distillation Pretraining For Bidirectional Encoders

no code implementations 27 May 2020 Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well on challenging tests of syntactic competence.

Knowledge Distillation Language Modelling +3

A Mutual Information Maximization Perspective of Language Representation Learning

no code implementations ICLR 2020 Lingpeng Kong, Cyprien de Masson d'Autume, Wang Ling, Lei Yu, Zihang Dai, Dani Yogatama

We show state-of-the-art word representation learning methods maximize an objective function that is a lower bound on the mutual information between different parts of a word sequence (i.e., a sentence).

Representation Learning Sentence

Better Document-Level Machine Translation with Bayes' Rule

no code implementations TACL 2020 Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit as parallel documents are not always available.
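Concretely, the noisy-channel factorization referred to here is

$$ \hat{y} \;=\; \arg\max_{y}\; p(y \mid x) \;=\; \arg\max_{y}\; p(x \mid y)\, p(y), $$

so that, for document translation, the channel model $p(x \mid y)$ can be trained on sentence-level parallel data while the prior $p(y)$ is a language model trained on monolingual documents, which is how document context enters without requiring parallel documents.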

Document Level Machine Translation Document Translation +4

Relative Pixel Prediction For Autoregressive Image Generation

no code implementations 25 Sep 2019 Wang Ling, Chris Dyer, Lei Yu, Lingpeng Kong, Dani Yogatama, Susannah Young

In natural images, transitions between adjacent pixels tend to be smooth and gradual, a fact that has long been exploited in image compression models based on predictive coding.
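A toy numpy illustration of the predictive-coding fact cited here (generic background, not the paper's autoregressive model): predicting each pixel from its left neighbour leaves a residual that is small for smooth images and from which the row is exactly recoverable.

# Toy predictive coding on one image row: store residuals against the left
# neighbour instead of raw intensities; residuals are mostly small and the
# original row is reconstructed exactly by a cumulative sum.
import numpy as np

row = np.array([100, 101, 103, 104, 104, 106, 109], dtype=np.int16)
prediction = np.concatenate(([0], row[:-1]))   # predict each pixel by its left neighbour
residual = row - prediction                    # what a predictive coder would store
print(residual)                                # [100 1 2 1 0 2 3] -- mostly small values
reconstructed = np.cumsum(residual)            # exact reconstruction from the residuals
assert np.array_equal(reconstructed, row)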

Colorization Image Colorization +4

Putting Machine Translation in Context with the Noisy Channel Model

no code implementations 25 Sep 2019 Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides a compelling mechanism for controlling unconditional document language models, using the long-standing challenge of effectively leveraging document context in machine translation.

Document Translation Language Modelling +3

Episodic Memory in Lifelong Language Learning

2 code implementations NeurIPS 2019 Cyprien de Masson d'Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama

We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier.

Continual Learning General Classification +3

Learning and Evaluating General Linguistic Intelligence

no code implementations 31 Jan 2019 Dani Yogatama, Cyprien de Masson d'Autume, Jerome Connor, Tomas Kocisky, Mike Chrzanowski, Lingpeng Kong, Angeliki Lazaridou, Wang Ling, Lei Yu, Chris Dyer, Phil Blunsom

We define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly.

Natural Language Understanding Question Answering

Variational Smoothing in Recurrent Neural Network Language Models

no code implementations ICLR 2019 Lingpeng Kong, Gabor Melis, Wang Ling, Lei Yu, Dani Yogatama

We present a new theoretical perspective of data noising in recurrent neural network language models (Xie et al., 2017).

Language Modelling

Neural Phrase-to-Phrase Machine Translation

no code implementations 6 Nov 2018 Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou

We also design an efficient dynamic programming algorithm to decode segments that allows the model to be trained faster than the existing neural phrase-based machine translation method by Huang et al. (2018).

Machine Translation Translation

End-to-End Neural Segmental Models for Speech Recognition

no code implementations 1 Aug 2017 Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals

Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time.

speech-recognition Speech Recognition

Multitask Learning with CTC and Segmental CRF for Speech Recognition

no code implementations 21 Feb 2017 Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith

Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models.

speech-recognition Speech Recognition

DyNet: The Dynamic Neural Network Toolkit

4 code implementations 15 Jan 2017 Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.
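A minimal sketch of the dynamic-declaration (define-by-run) alternative that DyNet takes: the graph is rebuilt by ordinary host-language control flow for each example, so variable-sized inputs need no static graph or padding. It is illustrated here with PyTorch, which follows the same paradigm, rather than with DyNet's own API.

# Define-by-run: each example builds its own computation graph via a plain
# Python loop, and backward() differentiates through whatever graph was built.
import torch

w = torch.randn(8, 8, requires_grad=True)

def encode(tokens):
    # The graph's shape depends on the length of this particular example.
    h = torch.zeros(8)
    for t in tokens:                 # ordinary Python loop builds the graph
        h = torch.tanh(w @ h + t)
    return h

for sentence in [torch.randn(3, 8), torch.randn(7, 8)]:   # different lengths, no padding
    loss = encode(sentence).sum()
    loss.backward()                  # derivatives follow the graph just built
    w.grad = None                    # reset before the next example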

graph construction

What Do Recurrent Neural Network Grammars Learn About Syntax?

1 code implementation EACL 2017 Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith

We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection.

Constituency Parsing Dependency Parsing +1

Segmental Recurrent Neural Networks for End-to-end Speech Recognition

no code implementations 1 Mar 2016 Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals

This model connects the segmental conditional random field (CRF) with a recurrent neural network (RNN) used for feature extraction.

Acoustic Modelling Language Modelling +2

Segmental Recurrent Neural Networks

2 code implementations 18 Nov 2015 Lingpeng Kong, Chris Dyer, Noah A. Smith

Representations of the input segments (i.e., contiguous subsequences of the input) are computed by encoding their constituent tokens using bidirectional recurrent neural nets, and these "segment embeddings" are used to define compatibility scores with output labels.
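A rough sketch of the segment-embedding idea described here (the layer sizes, the choice of an LSTM, and the bilinear scoring are illustrative assumptions, not the paper's exact parameterization): the tokens inside a candidate segment are encoded with a bidirectional RNN, and the resulting embedding is scored against each output label.

# Encode one candidate segment with a BiLSTM, form a segment embedding from the
# final forward/backward states, and score it against every output label.
import torch
import torch.nn as nn

emb_dim, hidden, num_labels = 16, 32, 5
birnn = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
label_emb = nn.Embedding(num_labels, 2 * hidden)

def segment_score(token_embeddings):            # shape (1, seg_len, emb_dim)
    _, (h_n, _) = birnn(token_embeddings)       # final states of both directions
    seg = torch.cat([h_n[0], h_n[1]], dim=-1)   # (1, 2*hidden) segment embedding
    return seg @ label_emb.weight.t()           # compatibility score per label

tokens = torch.randn(1, 4, emb_dim)             # a candidate segment of 4 tokens
print(segment_score(tokens).shape)              # torch.Size([1, 5])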

Chinese Word Segmentation Handwriting Recognition +2

Document Context Language Models

1 code implementation 12 Nov 2015 Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein

Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure.

Sentence

An Empirical Comparison of Parsing Methods for Stanford Dependencies

no code implementations 16 Apr 2014 Lingpeng Kong, Noah A. Smith

Stanford typed dependencies are a widely desired representation of natural language sentences, but parsing is one of the major computational bottlenecks in text analysis systems.

Dependency Parsing
