Search Results for author: Tao Ge

Found 56 papers, 24 papers with code

LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models

no code implementations • 1 Apr 2024 • Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Adrian de Wynter, Yan Xia, Wenshan Wu, Ting Song, Man Lan, Furu Wei

This paper presents a comprehensive survey of the current status and opportunities for Large Language Models (LLMs) in strategic reasoning, a sophisticated form of reasoning that necessitates understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly.

Decision Making

Paper
Add Code

K-Level Reasoning with Large Language Models

no code implementations • 2 Feb 2024 • Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Yan Xia, Man Lan, Furu Wei

While Large Language Models (LLMs) have demonstrated their proficiency in complex reasoning tasks, their performance in dynamic, interactive, and competitive scenarios - such as business strategy and stock market analysis - remains underexplored.

Decision Making

Paper
Add Code

Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding

1 code implementation • 15 Jan 2024 • Heming Xia, Zhe Yang, Qingxiu Dong, Peiyi Wang, Yongqi Li, Tao Ge, Tianyu Liu, Wenjie Li, Zhifang Sui

To mitigate the high inference latency stemming from autoregressive decoding in Large Language Models (LLMs), Speculative Decoding has emerged as a novel decoding paradigm for LLM inference.

Language Modelling Large Language Model

148

Paper
Code

ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents

1 code implementation • 6 Nov 2023 • Shaoguang Mao, Yuzhe Cai, Yan Xia, Wenshan Wu, Xun Wang, Fengyi Wang, Tao Ge, Furu Wei

This paper introduces Alympics (Olympics for Agents), a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.

Decision Making Language Modelling +1

Paper
Code

SCALE: Synergized Collaboration of Asymmetric Language Translation Engines

1 code implementation • 29 Sep 2023 • Xin Cheng, Xun Wang, Tao Ge, Si-Qing Chen, Furu Wei, Dongyan Zhao, Rui Yan

In this paper, we introduce SCALE, a collaborative framework that connects compact Specialized Translation Models (STMs) and general-purpose Large Language Models (LLMs) as one unified translation engine.

Continual Learning Translation

Paper
Code

In-context Autoencoder for Context Compression in a Large Language Model

1 code implementation • 13 Jul 2023 • Tao Ge, Jing Hu, Lei Wang, Xun Wang, Si-Qing Chen, Furu Wei

We propose the In-context Autoencoder (ICAE), leveraging the power of a large language models (LLM) to compress a long context into short compact memory slots that can be directly conditioned on by the LLM for various purposes.

Language Modelling Large Language Model +3

Paper
Code

Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

2 code implementations • 11 Jul 2023 • Zhenhailong Wang, Shaoguang Mao, Wenshan Wu, Tao Ge, Furu Wei, Heng Ji

In this work, we propose Solo Performance Prompting (SPP), which transforms a single LLM into a cognitive synergist by engaging in multi-turn self-collaboration with multiple personas.

Hallucination Logic Grid Puzzle

2,776

Paper
Code

Smart Word Suggestions for Writing Assistance

1 code implementation • 17 May 2023 • Chenshuo Wang, Shaoguang Mao, Tao Ge, Wenshan Wu, Xun Wang, Yan Xia, Jonathan Tien, Dongyan Zhao

The training dataset comprises over 3. 7 million sentences and 12. 7 million suggestions generated through rules.

Paper
Code

Low-code LLM: Graphical User Interface over Large Language Models

2 code implementations • 17 Apr 2023 • Yuzhe Cai, Shaoguang Mao, Wenshan Wu, Zehua Wang, Yaobo Liang, Tao Ge, Chenfei Wu, Wang You, Ting Song, Yan Xia, Jonathan Tien, Nan Duan, Furu Wei

By introducing this framework, we aim to bridge the gap between humans and LLMs, enabling more effective and efficient utilization of LLMs for complex tasks.

Prompt Engineering

34,526

Paper
Code

Inference with Reference: Lossless Acceleration of Large Language Models

1 code implementation • 10 Apr 2023 • Nan Yang, Tao Ge, Liang Wang, Binxing Jiao, Daxin Jiang, Linjun Yang, Rangan Majumder, Furu Wei

We propose LLMA, an LLM accelerator to losslessly speed up Large Language Model (LLM) inference with references.

Language Modelling Large Language Model

3,179

Paper
Code

Semiparametric Language Models Are Scalable Continual Learners

no code implementations • 2 Mar 2023 • Guangyue Peng, Tao Ge, Si-Qing Chen, Furu Wei, Houfeng Wang

We demonstrate that SeMem improves the scalability of semiparametric LMs for continual learning over streaming data in two ways: (1) data-wise scalability: as the model becomes stronger through continual learning, it will encounter fewer difficult cases that need to be memorized, causing the growth of the non-parametric memory to slow down over time rather than growing at a linear rate with the size of training data; (2) model-wise scalability: SeMem allows a larger model to memorize fewer samples than its smaller counterpart because it is rarer for a larger model to encounter incomprehensible cases, resulting in a non-parametric memory that does not scale linearly with model size.

Continual Learning Language Modelling +1

Paper
Add Code

MB-DECTNet: A Model-Based Unrolled Network for Accurate 3D DECT Reconstruction

no code implementations • 1 Feb 2023 • Tao Ge, Maria Medrano, Rui Liao, David G. Politte, Jeffrey F. Williamson, Bruce R. Whiting, Joseph A. O'Sullivan

Therefore, to improve its convergence, we have embedded DECT SIR into a deep learning model-based unrolled network for 3D DECT reconstruction (MB-DECTNet) that can be trained in an end-to-end fashion.

Paper
Add Code

Pay Attention to Your Tone: Introducing a New Dataset for Polite Language Rewrite

no code implementations • 20 Dec 2022 • Xun Wang, Tao Ge, Allen Mao, Yuki Li, Furu Wei, Si-Qing Chen

We introduce \textsc{PoliteRewrite} -- a dataset for polite language rewrite which is a novel sentence rewrite task.

Sentence Style Transfer +1

Paper
Add Code

Extensible Prompts for Language Models on Zero-shot Language Style Customization

no code implementations • NeurIPS 2023 • Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei

We propose eXtensible Prompt (X-Prompt) for prompting a large language model (LLM) beyond natural language (NL).

Descriptive Language Modelling +1

Paper
Add Code

DL-Corrector-Remapper: A grid-free bias-correction deep learning methodology for data-driven high-resolution global weather forecasting

no code implementations • 21 Oct 2022 • Tao Ge, Jaideep Pathak, Akshay Subramaniam, Karthik Kashinath

The improvement in DLCR's performance against the gold standard ground truth over the baseline's performance shows its potential to correct, remap, and fine-tune the mesh-gridded forecasts under the supervision of observations.

Weather Forecasting

Paper
Add Code

Lossless Acceleration for Seq2seq Generation with Aggressive Decoding

2 code implementations • 20 May 2022 • Tao Ge, Heming Xia, Xin Sun, Si-Qing Chen, Furu Wei

We study lossless acceleration for seq2seq generation with a novel decoding algorithm -- Aggressive Decoding.

Abstractive Text Summarization Grammatical Error Correction +4

18,327

Paper
Code

Text Revision by On-the-Fly Representation Optimization

1 code implementation • In2Writing (ACL) 2022 • Jingjing Li, Zichao Li, Tao Ge, Irwin King, Michael R. Lyu

In this approach, we simply fine-tune a pre-trained Transformer with masked language modeling and attribute classification.

Attribute Language Modelling +3

Paper
Code

Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation

2 code implementations • 30 Mar 2022 • Heming Xia, Tao Ge, Peiyi Wang, Si-Qing Chen, Furu Wei, Zhifang Sui

We propose Speculative Decoding (SpecDec), for the first time ever, to formally study exploiting the idea of speculative execution to accelerate autoregressive (AR) decoding.

Abstractive Text Summarization Machine Translation +1

Paper
Code

EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation

1 code implementation • 16 Feb 2022 • Tao Ge, Si-Qing Chen, Furu Wei

We introduce EdgeFormer -- a parameter-efficient Transformer for on-device seq2seq generation under the strict computation and memory constraints.

Grammatical Error Correction Knowledge Distillation +2

18,327

Paper
Code

A Metal Artifact Reduction Scheme For Accurate Iterative Dual-Energy CT Algorithms

no code implementations • 31 Jan 2022 • Tao Ge, Maria Medrano, Rui Liao, Jeffrey F. Williamson, David G. Politte, Bruce R. Whiting, Joseph A. O'Sullivan

We compared DEAM with the proposed method to the original DEAM and vendor reconstructions with and without metal-artifact reduction for orthopedic implants (O-MAR).

Metal Artifact Reduction

Paper
Add Code

A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model

no code implementations • 26 Jan 2022 • Xin Sun, Tao Ge, Shuming Ma, Jingjing Li, Furu Wei, Houfeng Wang

Synthetic data construction of Grammatical Error Correction (GEC) for non-English languages relies heavily on human-designed and language-specific rules, which produce limited error-corrected patterns.

Grammatical Error Correction Language Modelling +3

Paper
Add Code

Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression

1 code implementation • EMNLP 2021 • Canwen Xu, Wangchunshu Zhou, Tao Ge, Ke Xu, Julian McAuley, Furu Wei

Recent studies on compression of pretrained language models (e. g., BERT) usually use preserved accuracy as the metric for evaluation.

Knowledge Distillation Quantization

Paper
Code

A Machine-learning Based Initialization for Joint Statistical Iterative Dual-energy CT with Application to Proton Therapy

no code implementations • 30 Jul 2021 • Tao Ge, Maria Medrano, Rui Liao, David G. Politte, Jeffrey F. Williamson, Joseph A. O'Sullivan

Dual-energy CT (DECT) has been widely investigated to generate more informative and more accurate images in the past decades.

Paper
Add Code

Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding

1 code implementation • ACL 2021 • Xin Sun, Tao Ge, Furu Wei, Houfeng Wang

In this paper, we propose Shallow Aggressive Decoding (SAD) to improve the online inference efficiency of the Transformer for instantaneous Grammatical Error Correction (GEC).

Grammatical Error Correction

Paper
Code

Blow the Dog Whistle: A Chinese Dataset for Cant Understanding with Common Sense and World Knowledge

1 code implementation • NAACL 2021 • Canwen Xu, Wangchunshu Zhou, Tao Ge, Ke Xu, Julian McAuley, Furu Wei

Cant is important for understanding advertising, comedies and dog-whistle politics.

Common Sense Reasoning World Knowledge

Paper
Code

Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting

1 code implementation • EMNLP 2021 • Wangchunshu Zhou, Tao Ge, Canwen Xu, Ke Xu, Furu Wei

In this paper, we generalize text infilling (e. g., masked language models) by proposing Sequence Span Rewriting (SSR) as a self-supervised sequence-to-sequence (seq2seq) pre-training objective.

Sentence Text Infilling

Paper
Code

UnihanLM: Coarse-to-Fine Chinese-Japanese Language Model Pretraining with the Unihan Database

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Canwen Xu, Tao Ge, Chenliang Li, Furu Wei

Chinese and Japanese share many characters with similar surface morphology.

Language Modelling

Paper
Code

Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction

no code implementations • EMNLP 2020 • Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou

We propose a novel language-independent approach to improve the efficiency for Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC).

Grammatical Error Correction Sentence

Paper
Add Code

BERT Loses Patience: Fast and Robust Inference with Early Exit

1 code implementation • NeurIPS 2020 • Wangchunshu Zhou, Canwen Xu, Tao Ge, Julian McAuley, Ke Xu, Furu Wei

In this paper, we propose Patience-based Early Exit, a straightforward yet effective inference method that can be used as a plug-and-play technique to simultaneously improve the efficiency and robustness of a pretrained language model (PLM).

Language Modelling

Paper
Code

Parallel Data Augmentation for Formality Style Transfer

1 code implementation • ACL 2020 • Yi Zhang, Tao Ge, Xu sun

The main barrier to progress in the task of Formality Style Transfer is the inadequacy of training data.

Data Augmentation Formality Style Transfer +2

Paper
Code

Scheduled DropHead: A Regularization Method for Transformer Models

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou

In this paper, we introduce DropHead, a structured dropout method specifically designed for regularizing the multi-head attention mechanism, which is a key component of transformer, a state-of-the-art model for various NLP tasks.

Machine Translation text-classification +2

Paper
Code

BERT-of-Theseus: Compressing BERT by Progressive Module Replacing

1 code implementation • EMNLP 2020 • Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou

Our approach first divides the original BERT into several modules and builds their compact substitutes.

Knowledge Distillation Model Compression

308

Paper
Code

Pseudo-Bidirectional Decoding for Local Sequence Transduction

no code implementations • Findings of the Association for Computational Linguistics 2020 • Wangchunshu Zhou, Tao Ge, Ke Xu

PBD copies the corresponding representation of source tokens to the decoder as pseudo future context to enable the decoder to attends to its bi-directional context.

Grammatical Error Correction Inductive Bias +1

Paper
Add Code

Self-Adversarial Learning with Comparative Discrimination for Text Generation

no code implementations • ICLR 2020 • Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou

Conventional Generative Adversarial Networks (GANs) for text generation tend to have issues of reward sparsity and mode collapse that affect the quality and diversity of generated samples.

Sentence Text Generation

Paper
Add Code

Fact-aware Sentence Split and Rephrase with Permutation Invariant Training

no code implementations • 16 Jan 2020 • Yinuo Guo, Tao Ge, Furu Wei

To overcome the challenges, we first propose the Fact-aware Sentence Encoding, which enables the model to learn facts from the long sentence and thus improves the precision of sentence split; then we introduce Permutation Invariant Training to alleviate the effects of order variance in seq2seq learning for this task.

Sentence Split and Rephrase

Paper
Add Code

Improving Grammatical Error Correction with Machine Translation Pairs

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Wangchunshu Zhou, Tao Ge, Chang Mu, Ke Xu, Furu Wei, Ming Zhou

The poor translation model resembles the ESL (English as a second language) learner and tends to generate translations of low quality in terms of fluency and grammatical correctness, while the good translation model generally generates fluent and grammatically correct translations.

Grammatical Error Correction Language Modelling +3

Paper
Code

Sequence-to-sequence Pre-training with Data Augmentation for Sentence Rewriting

no code implementations • 13 Sep 2019 • Yi Zhang, Tao Ge, Furu Wei, Ming Zhou, Xu sun

We study sequence-to-sequence (seq2seq) pre-training with data augmentation for sentence rewriting.

Data Augmentation Formality Style Transfer +4

Paper
Add Code

BERT-based Lexical Substitution

1 code implementation • ACL 2019 • Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou

Our approach first applies dropout to the target word{'}s embedding for partially masking the word, allowing BERT to take balanced consideration of the target word{'}s semantics and contexts for proposing substitute candidates, and then validates the candidates based on their substitution{'}s influence on the global contextualized representation of the sentence.

Sentence

Paper
Code

Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study

no code implementations • ACL 2019 • Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou

Sequence-to-sequence (seq2seq) models have achieved tremendous success in text generation tasks.

Formality Style Transfer Grammatical Error Correction +6

Paper
Add Code

Formality Style Transfer with Hybrid Textual Annotations

no code implementations • 15 Mar 2019 • Ruochen Xu, Tao Ge, Furu Wei

Its challenge is the lack of large-scale sentence-aligned parallel data.

Formality Style Transfer Sentence +2

Paper
Add Code

Fine-grained Coordinated Cross-lingual Text Stream Alignment for Endless Language Knowledge Acquisition

no code implementations • EMNLP 2018 • Tao Ge, Qing Dou, Heng Ji, Lei Cui, Baobao Chang, Zhifang Sui, Furu Wei, Ming Zhou

This paper proposes to study fine-grained coordinated cross-lingual text stream alignment through a novel information network decipherment paradigm.

Decipherment Information Retrieval

Paper
Add Code

Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study

1 code implementation • 3 Jul 2018 • Tao Ge, Furu Wei, Ming Zhou

Neural sequence-to-sequence (seq2seq) approaches have proven to be successful in grammatical error correction (GEC).

Ranked #1 on Grammatical Error Correction on Unrestricted

Grammatical Error Correction Sentence

Paper
Code

Fluency Boost Learning and Inference for Neural Grammatical Error Correction

no code implementations • ACL 2018 • Tao Ge, Furu Wei, Ming Zhou

Most of the neural sequence-to-sequence (seq2seq) models for grammatical error correction (GEC) have two limitations: (1) a seq2seq model may not be well generalized with only limited error-corrected data; (2) a seq2seq model may fail to completely correct a sentence with multiple errors through normal seq2seq inference.

Grammatical Error Correction Sentence

Paper
Add Code

EventWiki: A Knowledge Base of Major Events

no code implementations • LREC 2018 • Tao Ge, Lei Cui, Baobao Chang, Zhifang Sui, Furu Wei, Ming Zhou

Question Answering Semantic Parsing

Paper
Add Code

Exploiting Task-Oriented Resources to Learn Word Embeddings for Clinical Abbreviation Expansion

no code implementations • WS 2015 • Yue Liu, Tao Ge, Kusum S. Mathews, Heng Ji, Deborah L. McGuinness

In the medical domain, identifying and expanding abbreviations in clinical texts is a vital task for both better human and machine understanding.

Word Embeddings

Paper
Add Code

Event Detection with Burst Information Networks

no code implementations • COLING 2016 • Tao Ge, Lei Cui, Baobao Chang, Zhifang Sui, Ming Zhou

Retrospective event detection is an important task for discovering previously unidentified events in a text stream.

Clustering Event Detection

Paper
Add Code

Towards Time-Aware Knowledge Graph Completion

no code implementations • COLING 2016 • Tingsong Jiang, Tianyu Liu, Tao Ge, Lei Sha, Baobao Chang, Sujian Li, Zhifang Sui

In this paper, we present a novel time-aware knowledge graph completion model that is able to predict links in a KG using both the existing facts and the temporal information of the facts.

Question Answering Relation Extraction +1

Paper
Add Code

Encoding Temporal Information for Time-Aware Link Prediction

no code implementations • EMNLP 2016 • Tingsong Jiang, Tianyu Liu, Tao Ge, Lei Sha, Sujian Li, Baobao Chang, Zhifang Sui

Link Prediction

Paper
Add Code

News Stream Summarization using Burst Information Networks

no code implementations • EMNLP 2016 • Tao Ge, Lei Cui, Baobao Chang, Sujian Li, Ming Zhou, Zhifang Sui

Document Summarization Multi-Document Summarization

Paper
Add Code

Aligning Coordinated Text Streams through Burst Information Network Construction and Decipherment

no code implementations • 27 Sep 2016 • Tao Ge, Qing Dou, Xiaoman Pan, Heng Ji, Lei Cui, Baobao Chang, Zhifang Sui, Ming Zhou

We introduce a novel Burst Information Network (BINet) representation that can display the most important information and illustrate the connections among bursty entities, events and keywords in the corpus.

Decipherment Translation