Search Results for author: Tianxing He

Found 28 papers, 19 papers with code

Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks

1 code implementation • 18 Feb 2024 • Yichen Wang, Shangbin Feng, Abe Bohan Hou, Xiao Pu, Chao Shen, Xiaoming Liu, Yulia Tsvetkov, Tianxing He

Our experiments reveal that almost none of the existing detectors remain robust under all the attacks, and all detectors exhibit different loopholes.

k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

1 code implementation • 17 Feb 2024 • Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He

Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection.

Text Detection, Text Generation

Learning Time-Invariant Representations for Individual Neurons from Population Dynamics

1 code implementation • NeurIPS 2023 • Lu Mi, Trung Le, Tianxing He, Eli Shlizerman, Uygar Sümbül

This suggests that neuronal activity is a combination of its time-invariant identity and the inputs the neuron receives from the rest of the circuit.

Self-Supervised Learning

KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models

1 code implementation • 15 Oct 2023 • Yuyang Bai, Shangbin Feng, Vidhisha Balachandran, Zhaoxuan Tan, Shiqi Lou, Tianxing He, Yulia Tsvetkov

To gain a better understanding of LLMs' knowledge abilities and their generalization, we evaluate 10 open-source and black-box LLMs on the KGQuiz benchmark across the five knowledge-intensive tasks and knowledge domains.

Multiple-choice, World Knowledge

On the Zero-Shot Generalization of Machine-Generated Text Detectors

no code implementations • 8 Oct 2023 • Xiao Pu, Jingyu Zhang, Xiaochuang Han, Yulia Tsvetkov, Tianxing He

The rampant proliferation of large language models, fluent enough to generate text indistinguishable from human-written language, gives unprecedented importance to the detection of machine-generated text.

Zero-shot Generalization

SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation

1 code implementation • 6 Oct 2023 • Abe Bohan Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov

Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design.

Sentence, Text Generation

Resolving Knowledge Conflicts in Large Language Models

1 code implementation • 2 Oct 2023 • Yike Wang, Shangbin Feng, Heng Wang, Weijia Shi, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov

To this end, we introduce KNOWLEDGE CONFLICT, an evaluation framework for simulating contextual knowledge conflicts and quantitatively evaluating to what extent LLMs achieve these goals.

Knowledge Crosswords: Geometric Reasoning over Structured Knowledge with Large Language Models

1 code implementation • 2 Oct 2023 • Wenxuan Ding, Shangbin Feng, YuHan Liu, Zhaoxuan Tan, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov

We additionally propose two new approaches, Staged Prompting and Verify-All, to augment LLMs' ability to backtrack and verify structured constraints.

LatticeGen: A Cooperative Framework which Hides Generated Text in a Lattice for Privacy-Aware Generation on Cloud

no code implementations • 29 Sep 2023 • Mengke Zhang, Tianxing He, Tianle Wang, Lu Mi, FatemehSadat Mireshghallah, Binyi Chen, Hao Wang, Yulia Tsvetkov

In the current user-server interaction paradigm of prompted generation with large language models (LLMs) on the cloud, the server fully controls the generation process, which leaves zero options for users who want to keep the generated text to themselves.

Can Language Models Solve Graph Problems in Natural Language?

2 code implementations • NeurIPS 2023 • Heng Wang, Shangbin Feng, Tianxing He, Zhaoxuan Tan, Xiaochuang Han, Yulia Tsvetkov

We then propose Build-a-Graph Prompting and Algorithmic Prompting, two instruction-based approaches to enhance LLMs in solving natural language graph problems.

In-Context Learning, Knowledge Probing, +2

Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models

2 code implementations • 17 May 2023 • Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov

Ultimately, the Knowledge Card framework enables dynamic synthesis and updates of knowledge from diverse domains.

Retrieval

On the Blind Spots of Model-Based Evaluation Metrics for Text Generation

1 code implementation • 20 Dec 2022 • Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov

In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data.

Text Generation

PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation

1 code implementation • 14 Oct 2022 • Jingyu Zhang, James Glass, Tianxing He

Existing work on controlled text generation (CTG) assumes a control interface of categorical attributes.

Attribute, Text Generation

Controlling the Focus of Pretrained Language Generation Models

1 code implementation • Findings (ACL) 2022 • Jiabao Ji, Yoon Kim, James Glass, Tianxing He

This work aims to develop a control mechanism by which a user can select spans of context as "highlights" for the model to focus on, and generate relevant output.

Abstractive Text Summarization, Response Generation, +1

Revisiting Latent-Space Interpolation via a Quantitative Evaluation Framework

1 code implementation • 13 Oct 2021 • Lu Mi, Tianxing He, Core Francisco Park, Hao Wang, Yue Wang, Nir Shavit

In this work, we show how data labeled with semantically continuous attributes can be utilized to conduct a quantitative evaluation of latent-space interpolation algorithms, for variational autoencoders.

An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models

1 code implementation • 6 Sep 2021 • Tianxing He, Kyunghyun Cho, James Glass

Prompt-based knowledge probing for 1-hop relations has been used to measure how much world knowledge is stored in pretrained language models.

Knowledge Probing, Prompt Engineering, +1

Analyzing the Forgetting Problem in Pretrain-Finetuning of Open-domain Dialogue Response Models

no code implementations • EACL 2021 • Tianxing He, Jun Liu, Kyunghyun Cho, Myle Ott, Bing Liu, James Glass, Fuchun Peng

We find that mix-review effectively regularizes the finetuning process, and the forgetting problem is alleviated to some extent.

Response Generation, Text Generation, +1

Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

1 code implementation • EACL 2021 • Tianxing He, Bryan McCann, Caiming Xiong, Ehsan Hosseini-Asl

In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks.

Language Modelling, Natural Language Understanding

Quantifying Exposure Bias for Open-ended Language Generation

no code implementations • 28 Sep 2020 • Tianxing He, Jingzhao Zhang, Zhiming Zhou, James R. Glass

The exposure bias problem refers to the incrementally distorted generation induced by the training-generation discrepancy, in teacher-forcing training for auto-regressive neural network language models (LM).

Text Generation

A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Moin Nadeem, Tianxing He, Kyunghyun Cho, James Glass

On the other hand, we find that the set of sampling algorithms that satisfies these properties performs on par with the existing sampling algorithms.

Text Generation

AutoKG: Constructing Virtual Knowledge Graphs from Unstructured Documents for Question Answering

no code implementations • 20 Aug 2020 • Seunghak Yu, Tianxing He, James Glass

Knowledge graphs (KGs) have the advantage of providing fine-grained detail for question-answering systems.

Knowledge Graphs, Language Modelling, +2

Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models

no code implementations • 16 Oct 2019 • Tianxing He, Jun Liu, Kyunghyun Cho, Myle Ott, Bing Liu, James Glass, Fuchun Peng

We find that mix-review effectively regularizes the finetuning process, and the forgetting problem is alleviated to some extent.

Response Generation, Text Generation, +1

From Data Quality to Model Quality: an Exploratory Study on Deep Learning

no code implementations • 10 Jun 2019 • Tianxing He, Shengcheng Yu, Ziyuan Wang, Jieqiong Li, Zhenyu Chen

Nowadays, people strive to improve the accuracy of deep learning models.

Why gradient clipping accelerates training: A theoretical justification for adaptivity

1 code implementation • ICLR 2020 • Jingzhao Zhang, Tianxing He, Suvrit Sra, Ali Jadbabaie

We provide a theoretical explanation for the effectiveness of gradient clipping in training deep neural networks.

General Classification, Image Classification, +1

Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation?

no code implementations • EMNLP 2021 • Tianxing He, Jingzhao Zhang, Zhiming Zhou, James Glass

Exposure bias has been regarded as a central problem for auto-regressive language models (LMs).

Machine Translation, Text Generation

Negative Training for Neural Dialogue Response Generation

1 code implementation • ACL 2020 • Tianxing He, James Glass

Although deep learning models have brought tremendous advancements to the field of open-domain dialogue response generation, recent research results have revealed that the trained models have undesirable generation behaviors, such as malicious responses and generic (boring) responses.

Response Generation

Detecting egregious responses in neural sequence-to-sequence models

no code implementations • ICLR 2019 • Tianxing He, James Glass

We adopt an empirical methodology, in which we first create lists of egregious output sequences, and then design a discrete optimization algorithm to find input sequences that will cause the model to generate them.

Response Generation

On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation

1 code implementation • 19 Feb 2016 • Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu

We propose to train a bi-directional neural network language model (NNLM) with noise contrastive estimation (NCE).

Language Modelling
