Search Results for author: Junxian He

Found 45 papers, 37 papers with code

Compression Represents Intelligence Linearly

1 code implementation 15 Apr 2024 Yuzhen Huang, Jinghan Zhang, Zifei Shan, Junxian He

We open-source our compression datasets as well as our data collection pipelines so that future researchers can assess compression properly.

Language Modelling Mathematical Reasoning
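
The paper's central quantity is a model's compression efficiency on raw text, measured in bits per character from its log-likelihoods, which the authors find correlates linearly with benchmark scores. Below is a minimal sketch of that metric; `token_logprobs` is a hypothetical scoring function standing in for whatever model interface one uses.

```python
import math

def bits_per_character(text: str, token_logprobs) -> float:
    """Compression efficiency of a language model on `text`.

    `token_logprobs(text)` is assumed to return the natural-log
    probability the model assigns to each token; their sum is the
    log-likelihood of the whole string. Lower BPC means better
    compression.
    """
    nll_nats = -sum(token_logprobs(text))  # total negative log-likelihood in nats
    nll_bits = nll_nats / math.log(2)      # convert nats to bits
    return nll_bits / len(text)            # normalize by character count
```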

In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

1 code implementation 3 Mar 2024 Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He

Large language models (LLMs) frequently hallucinate and produce factual errors, yet our understanding of why they make these errors remains limited.

Hallucination
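
The paper locates a hallucination signal in the model's inner representations: correct generations tend to come with "sharper" (lower-entropy) in-context activations. As a rough illustration only (the real method works on hidden states, not the output distribution), here is an entropy-based sharpness score over a next-token distribution:

```python
import numpy as np

def sharpness(probs: np.ndarray) -> float:
    """Negative entropy: sharper (more peaked) distributions score higher.
    This output-distribution proxy only illustrates the intuition; the
    paper derives its signal from inner representations of in-context
    tokens."""
    probs = probs / probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    return -entropy

print(sharpness(np.array([0.97, 0.01, 0.01, 0.01])))  # sharp: close to 0
print(sharpness(np.array([0.25, 0.25, 0.25, 0.25])))  # diffuse: -log(4) ~ -1.386
```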

AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents

2 code implementations 24 Jan 2024 Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He

Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications.

Benchmarking

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

1 code implementation 25 Dec 2023 Wei Liu, Weihao Zeng, Keqing He, Yong Jiang, Junxian He

We present deita (short for Data-Efficient Instruction Tuning for Alignment), a series of models fine-tuned from LLaMA and Mistral models using data samples automatically selected with our proposed approach.
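
deita selects training samples by scoring them (the paper uses learned complexity and quality scorers) and enforcing diversity among the selected set. A minimal sketch of that score-first, diversity-aware selection, assuming precomputed `scores` and `embeddings`; the greedy scheme and `tau` threshold are simplifications of the paper's procedure:

```python
import numpy as np

def select_data(scores, embeddings, budget, tau=0.9):
    """Visit samples from highest score down; keep one only if its
    cosine similarity to everything already selected stays below `tau`."""
    embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    selected = []
    for i in np.argsort(scores)[::-1]:
        if len(selected) == budget:
            break
        if all(embeddings[i] @ embeddings[j] < tau for j in selected):
            selected.append(int(i))
    return selected
```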

Prompt Optimization via Adversarial In-Context Learning

no code implementations 5 Dec 2023 Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He

We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompts for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier.

Arithmetic Reasoning Data-to-Text Generation +2
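
A hedged sketch of the adversarial loop described above, with `llm(prompt)` as a hypothetical completion function and the three roles (generator, discriminator, prompt modifier) played by separate calls; the prompts and acceptance rule here are illustrative, not the paper's exact protocol:

```python
def adv_icl(gen_prompt, disc_prompt, task_inputs, llm, rounds=5, n_edits=3):
    def fooled(prompt):
        # Discriminator score: how many generated outputs it labels "real".
        outputs = [llm(prompt + "\nInput: " + x) for x in task_inputs]
        return sum("real" in llm(disc_prompt + "\n" + o).lower() for o in outputs)

    for _ in range(rounds):
        current = fooled(gen_prompt)
        # Prompt modifier proposes edited generator prompts.
        edits = [llm("Rewrite this prompt so its outputs are more convincing:\n"
                     + gen_prompt) for _ in range(n_edits)]
        best = max(edits, key=fooled)
        if fooled(best) > current:  # keep the edit that best fools the discriminator
            gen_prompt = best
    return gen_prompt
```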

InstructCoder: Instruction Tuning Large Language Models for Code Editing

1 code implementation 31 Oct 2023 Kaixin Li, Qisheng Hu, Xu Zhao, Hui Chen, Yuxi Xie, Tiedong Liu, Qizhe Xie, Junxian He

In this work, we explore the use of Large Language Models (LLMs) to edit code based on user instructions.

SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning

2 code implementations 3 Aug 2023 Keyu Duan, Qian Liu, Tat-Seng Chua, Shuicheng Yan, Wei Tsang Ooi, Qizhe Xie, Junxian He

More recently, with the rapid development of language models (LMs), researchers have focused on leveraging LMs to facilitate the learning of textual graphs (TGs), either by jointly training them in a computationally intensive framework (merging the two stages), or by designing complex self-supervised training tasks for feature extraction (enhancing the first stage).

Feature Engineering Graph Learning +3
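
SimTeG's recipe is deliberately simple: first fine-tune an LM on the node texts and take its embeddings as node features, then hand those features to a GNN. A sketch of the second stage, reduced to mean aggregation over a row-normalized adjacency; `text_emb` and `adj` are assumed precomputed, and the real pipeline uses a proper GNN here:

```python
import numpy as np

def propagate(text_emb: np.ndarray, adj: np.ndarray, hops: int = 2) -> np.ndarray:
    """Average each node's features with its neighbors' for `hops` rounds;
    the result can feed any downstream classifier."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    norm_adj = adj / deg              # row-normalized adjacency
    h = text_emb
    for _ in range(hops):
        h = norm_adj @ h              # mean over neighbor features
    return h
```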

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

4 code implementations 25 Jul 2023 I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu

With the above challenges in mind, in this paper, we propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models (e.g., ChatGPT).

Code Generation Fact Checking +1
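
A skeleton of a FacTool-style check, assuming hypothetical `llm` and `search` callables; the real framework is multi-domain and plugs in different tools (search engines, code executors, and so on) per task:

```python
def factuality_check(text, llm, search):
    """Extract atomic claims, generate a verification query for each,
    retrieve evidence with a tool, and ask the LLM for a verdict."""
    claims = llm("List the atomic factual claims, one per line:\n" + text)
    verdicts = {}
    for claim in filter(None, (c.strip() for c in claims.splitlines())):
        query = llm("Write a search query to verify: " + claim)
        evidence = search(query)
        verdicts[claim] = llm(f"Claim: {claim}\nEvidence: {evidence}\n"
                              "Is the claim supported? Answer yes or no.")
    return verdicts
```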

Composing Parameter-Efficient Modules with Arithmetic Operations

2 code implementations 26 Jun 2023 Jinghan Zhang, Shiqi Chen, Junteng Liu, Junxian He

In this paper, we propose to compose these parameter-efficient modules through linear arithmetic operations in the weight space, thereby integrating different module capabilities.

Language Modelling Large Language Model +1
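
The composition itself is exactly what the abstract says: linear arithmetic in weight space. A minimal sketch, treating each parameter-efficient module as a dict of weight deltas:

```python
def compose_modules(modules, coefficients):
    """Weighted sum of modules in weight space. Addition merges
    capabilities; a negative coefficient negates one (e.g. to unlearn
    a behavior). Each module maps parameter names to weight deltas."""
    composed = {}
    for module, coeff in zip(modules, coefficients):
        for name, delta in module.items():
            composed[name] = composed.get(name, 0) + coeff * delta
    return composed

# e.g. merge two skills and subtract a third (module names hypothetical):
# merged = compose_modules([lora_a, lora_b, lora_toxic], [0.5, 0.5, -1.0])
```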

Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs

1 code implementation 22 Jun 2023 Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi

To better break down the problem, we define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency.

Arithmetic Reasoning Benchmarking +1
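
A minimal sketch of the sampling and aggregation components of the framework above (the prompting-strategy component is omitted); `llm` is a hypothetical stochastic completion function, and majority voting is one of several aggregation techniques the paper compares:

```python
from collections import Counter

def consistency_confidence(question, llm, n=10):
    """Draw n responses and use agreement on the most common answer
    as the confidence estimate."""
    answers = [llm(question).strip() for _ in range(n)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n  # confidence = fraction of samples agreeing
```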

Contrastive Learning of Sentence Embeddings from Scratch

2 code implementations 24 May 2023 Junlei Zhang, Zhenzhong Lan, Junxian He

Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings.

Contrastive Learning Natural Language Inference +3
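
The dominant training objective in this line of work is InfoNCE: each sentence should be closest to its own positive among all in-batch candidates. A generic sketch (not the paper's exact recipe) over precomputed embedding matrices:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.05):
    """Contrastive loss over (batch, dim) embedding arrays: row i of
    `anchors` is pulled toward row i of `positives` and pushed away
    from every other row."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # cosine similarity matrix
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))          # match i-th anchor to i-th positive
```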

Evaluating Factual Consistency of Summaries with Large Language Models

2 code implementations 23 May 2023 Shiqi Chen, Siyang Gao, Junxian He

Detecting factual errors in summaries has been an important and challenging subject in summarization research.

Binary Classification Sentence

CodeInstruct: Empowering Language Models to Edit Code

1 code implementation GitHub 2023 Qisheng Hu*, Kaixin Li*, Xu Zhao, Yuxi Xie, Tiedong Liu, Hui Chen, Qizhe Xie, Junxian He

In this work, we explore the use of large language models (LLMs) to edit code based on user instructions, covering a broad range of implicit tasks such as comment insertion, code optimization, and code refactoring.

Automatic Model Selection with Large Language Models for Reasoning

1 code implementation 23 May 2023 James Xu Zhao, Yuxi Xie, Kenji Kawaguchi, Junxian He, Michael Qizhe Xie

Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths.

Arithmetic Reasoning GSM8K +4

C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models

1 code implementation NeurIPS 2023 Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, Junxian He

We present C-Eval, the first comprehensive Chinese evaluation suite designed to assess advanced knowledge and reasoning abilities of foundation models in a Chinese context.

Multiple-choice

Self-Evaluation Guided Beam Search for Reasoning

no code implementations NeurIPS 2023 Yuxi Xie, Kenji Kawaguchi, Yiran Zhao, Xu Zhao, Min-Yen Kan, Junxian He, Qizhe Xie

Stochastic beam search balances exploitation and exploration of the search space with temperature-controlled randomness.

Arithmetic Reasoning GSM8K +3
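
One step of that temperature-controlled stochastic beam search, sketched; `expand` and `score` are hypothetical callables, with `score` standing in for the paper's combination of generation likelihood and self-evaluation:

```python
import numpy as np

def stochastic_beam_step(beams, expand, score, k, temperature=0.5):
    """Sample k continuations in proportion to softmax(score / temperature)
    instead of taking a deterministic top-k: low temperature exploits,
    high temperature explores."""
    candidates = [c for b in beams for c in expand(b)]
    scores = np.array([score(c) for c in candidates]) / temperature
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    idx = np.random.choice(len(candidates), size=min(k, len(candidates)),
                           replace=False, p=probs)
    return [candidates[i] for i in idx]
```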

Mega: Moving Average Equipped Gated Attention

5 code implementations 21 Sep 2022 Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer

The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.

Image Classification Inductive Bias +3
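
Mega's added inductive bias comes from a damped exponential moving average applied along the sequence before the gated attention. A scalar-coefficient sketch (the paper's EMA is multi-dimensional with learned parameters):

```python
import numpy as np

def damped_ema(x: np.ndarray, alpha: float, delta: float) -> np.ndarray:
    """h_t = alpha * x_t + (1 - alpha * delta) * h_{t-1}, applied over
    the sequence axis of x with shape (seq_len, dim). The output is a
    recency-biased smoothing of the input embeddings."""
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for t, x_t in enumerate(x):
        h = alpha * x_t + (1.0 - alpha * delta) * h
        out[t] = h
    return out
```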

Non-Parametric Temporal Adaptation for Social Media Topic Classification

no code implementations 13 Sep 2022 FatemehSadat Mireshghallah, Nikolai Vogler, Junxian He, Omar Florez, Ahmed El-Kishky, Taylor Berg-Kirkpatrick

User-generated social media data is constantly changing as new trends influence online discussion and personal information is deleted due to privacy concerns.

Classification Retrieval +1

Prompt Consistency for Zero-Shot Task Generalization

1 code implementation 29 Apr 2022 Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig

One of the most impressive results of recent NLP history is the ability of pre-trained language models to solve new tasks in a zero-shot setting.

Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

2 code implementations 28 Jan 2022 Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig

Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time.

Language Modelling Retrieval
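
The retrieval-based LM being accelerated here follows the standard kNN-LM formulation: the next-token distribution is a mixture of the parametric LM and a distribution induced by retrieved neighbors (the paper's contribution is approximating the retrieval side with an automaton). A sketch of that mixture:

```python
import numpy as np

def knn_distribution(distances, neighbor_tokens, vocab_size, temperature=1.0):
    """Turn retrieved (distance, next-token) pairs into a distribution:
    softmax over negative distances, mass accumulated per token."""
    w = np.exp(-np.asarray(distances, dtype=float) / temperature)
    w /= w.sum()
    p = np.zeros(vocab_size)
    for weight, tok in zip(w, neighbor_tokens):
        p[tok] += weight
    return p

def knn_lm_prob(p_lm, p_knn, lam=0.25):
    """Interpolate the parametric LM with the retrieval distribution."""
    return (1.0 - lam) * p_lm + lam * p_knn
```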

Towards a Unified View of Parameter-Efficient Transfer Learning

1 code implementation ICLR 2022 Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig

Furthermore, our unified framework enables the transfer of design elements across different approaches, and as a result we are able to instantiate new parameter-efficient fine-tuning methods that tune fewer parameters than previous methods while being more effective, achieving comparable results to fine-tuning all parameters on all four tasks.

Machine Translation text-classification +3
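
The unified form the paper derives casts adapters, prefix tuning, and LoRA as variants of one modification, roughly delta_h = s * f(h @ W_down) @ W_up, differing in the function f, the insertion point, and how the scale s is computed. A sketch with ReLU standing in for f (LoRA uses the identity):

```python
import numpy as np

def peft_delta(h, w_down, w_up, s=1.0):
    """Down-project the hidden state, apply a nonlinearity, up-project,
    and scale; the result is added to the original hidden state."""
    return s * np.maximum(0.0, h @ w_down) @ w_up

# h_new = h + peft_delta(h, w_down, w_up, s=0.1)  # parallel insertion
```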

Capturing Structural Locality in Non-parametric Language Models

no code implementations ICLR 2022 Frank F. Xu, Junxian He, Graham Neubig, Vincent J. Hellendoorn

Structural locality is a ubiquitous feature of real-world datasets, wherein data points are organized into local hierarchies.

Dependency Induction Through the Lens of Visual Perception

1 code implementation CoNLL (EMNLP) 2021 Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk, Graham Neubig

Our experiments find that concreteness is a strong indicator for learning dependency grammars, improving the direct attachment score (DAS) by over 50% as compared to state-of-the-art models trained on pure text.

Constituency Grammar Induction Dependency Parsing

Efficient Nearest Neighbor Language Models

2 code implementations EMNLP 2021 Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick

Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore, which allows them to learn through explicitly memorizing the training datapoints.

Domain Adaptation Language Modelling +1

CTRLsum: Towards Generic Controllable Text Summarization

1 code implementation 8 Dec 2020 Junxian He, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong

Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or descriptive prompts.

Descriptive Reading Comprehension +1

Learning Sparse Prototypes for Text Generation

1 code implementation NeurIPS 2020 Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig

While effective, these methods are inefficient at test time as a result of needing to store and index the entire training corpus.

Language Modelling Prototype Selection +4

A Probabilistic Formulation of Unsupervised Text Style Transfer

5 code implementations ICLR 2020 Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick

Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the state-of-the-art unsupervised machine translation techniques that our approach generalizes.

Decipherment Language Modelling +6

Revisiting Self-Training for Neural Sequence Generation

1 code implementation ICLR 2020 Junxian He, Jiatao Gu, Jiajun Shen, Marc'Aurelio Ranzato

In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks.

Machine Translation Text Summarization +1
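
The self-training loop being revisited, sketched: a supervised model pseudo-labels unlabeled inputs, then a model is retrained on the union. `model` is assumed callable on an input and `train_fn` returns a trained model; the paper's finding is that injecting noise into this loop is what makes it work well for sequence generation:

```python
def self_train(model, train_fn, labeled, unlabeled, rounds=3):
    """Alternate pseudo-labeling and retraining."""
    for _ in range(rounds):
        pseudo = [(x, model(x)) for x in unlabeled]  # pseudo-label the pool
        model = train_fn(labeled + pseudo)           # retrain on real + pseudo pairs
    return model
```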

The Source-Target Domain Mismatch Problem in Machine Translation

no code implementations EACL 2021 Jiajun Shen, Peng-Jen Chen, Matt Le, Junxian He, Jiatao Gu, Myle Ott, Michael Auli, Marc'Aurelio Ranzato

While we live in an increasingly interconnected world, different places still exhibit strikingly different cultures, and many events we experience in our everyday life pertain only to the specific place we live in.

Machine Translation Translation

Choosing Transfer Languages for Cross-Lingual Learning

1 code implementation ACL 2019 Yu-Hsiang Lin, Chian-Yu Chen, Jean Lee, Zirui Li, Yuyan Zhang, Mengzhou Xia, Shruti Rijhwani, Junxian He, Zhisong Zhang, Xuezhe Ma, Antonios Anastasopoulos, Patrick Littell, Graham Neubig

Cross-lingual transfer, where a high-resource transfer language is used to improve the accuracy of a low-resource task language, is now an invaluable tool for improving performance of natural language processing (NLP) on low-resource languages.

Cross-Lingual Transfer

Lagging Inference Networks and Posterior Collapse in Variational Autoencoders

2 code implementations ICLR 2019 Junxian He, Daniel Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick

The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique.

Text Generation
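
The paper traces posterior collapse to an inference network that lags behind the model posterior early in training, and fixes it by optimizing that network aggressively. A sketch of the schedule, with `enc_step`/`dec_step` as hypothetical closures that each take one optimizer step; the paper's actual stopping criterion (mutual information plateauing) is simplified here to a fixed iteration count:

```python
def aggressive_vae_training(enc_step, dec_step, batches, aggressive_iters=30):
    """During the aggressive phase, the inference network takes many
    gradient steps per decoder step so it stops lagging."""
    for batch in batches:
        for _ in range(aggressive_iters):  # inner loop: encoder updates only
            enc_step(batch)
        dec_step(batch)                    # then one generator/decoder update
```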

Unsupervised Learning of Syntactic Structure with Invertible Neural Projections

1 code implementation EMNLP 2018 Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick

In this work, we propose a novel generative model that jointly learns discrete syntactic structure and continuous word representations in an unsupervised fashion by cascading an invertible neural network with a structured generative prior.

Constituency Grammar Induction POS +1

StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing

7 code implementations ACL 2018 Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig

Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures.

Code Generation Semantic Parsing

Efficient Correlated Topic Modeling with Topic Embedding

no code implementations 1 Jul 2017 Junxian He, Zhiting Hu, Taylor Berg-Kirkpatrick, Ying Huang, Eric P. Xing

Correlated topic modeling has been limited to small model and problem sizes due to high computational cost and poor scaling.

Document Classification General Classification +2

Text Network Exploration via Heterogeneous Web of Topics

no code implementations 2 Oct 2016 Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang

A text network refers to a data type in which each vertex is associated with a text document and the relationships between documents are represented by edges.
