Search Results for author: Junxian He

Found 45 papers, 37 papers with code

Compression Represents Intelligence Linearly

1 code implementation 15 Apr 2024 Yuzhen Huang, Jinghan Zhang, Zifei Shan, Junxian He

We open-source our compression datasets as well as our data collection pipelines so that future researchers can assess compression properly.

Language Modelling Mathematical Reasoning
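
The paper's central quantity is a model's compression efficiency on raw text, measured in bits per character from its log-likelihoods, which the authors find correlates linearly with benchmark scores. Below is a minimal sketch of that metric; `token_logprobs` is a hypothetical scoring function standing in for whatever model interface one uses.

```python
import math

def bits_per_character(text: str, token_logprobs) -> float:
    """Compression efficiency of a language model on `text`.

    `token_logprobs(text)` is assumed to return the natural-log
    probability the model assigns to each token; their sum is the
    log-likelihood of the whole string. Lower BPC means better
    compression.
    """
    nll_nats = -sum(token_logprobs(text))  # total negative log-likelihood in nats
    nll_bits = nll_nats / math.log(2)      # convert nats to bits
    return nll_bits / len(text)            # normalize by character count
```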

In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

1 code implementation 3 Mar 2024 Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He

Large language models (LLMs) frequently hallucinate and produce factual errors, yet our understanding of why they make these errors remains limited.

Hallucination
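
The paper locates a hallucination signal in the model's inner representations: correct generations tend to come with "sharper" (lower-entropy) in-context activations. As a rough illustration only (the real method works on hidden states, not the output distribution), here is an entropy-based sharpness score over a next-token distribution:

```python
import numpy as np

def sharpness(probs: np.ndarray) -> float:
    """Negative entropy: sharper (more peaked) distributions score higher.
    This output-distribution proxy only illustrates the intuition; the
    paper derives its signal from inner representations of in-context
    tokens."""
    probs = probs / probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    return -entropy

print(sharpness(np.array([0.97, 0.01, 0.01, 0.01])))  # sharp: close to 0
print(sharpness(np.array([0.25, 0.25, 0.25, 0.25])))  # diffuse: -log(4) ~ -1.386
```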

AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents

2 code implementations 24 Jan 2024 Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He

Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications.

Benchmarking

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

1 code implementation 25 Dec 2023 Wei Liu, Weihao Zeng, Keqing He, Yong Jiang, Junxian He

We present deita (short for Data-Efficient Instruction Tuning for Alignment), a series of models fine-tuned from LLaMA and Mistral models using data samples automatically selected with our proposed approach.
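
deita selects training samples by scoring them (the paper uses learned complexity and quality scorers) and enforcing diversity among the selected set. A minimal sketch of that score-first, diversity-aware selection, assuming precomputed `scores` and `embeddings`; the greedy scheme and `tau` threshold are simplifications of the paper's procedure:

```python
import numpy as np

def select_data(scores, embeddings, budget, tau=0.9):
    """Visit samples from highest score down; keep one only if its
    cosine similarity to everything already selected stays below `tau`."""
    embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    selected = []
    for i in np.argsort(scores)[::-1]:
        if len(selected) == budget:
            break
        if all(embeddings[i] @ embeddings[j] < tau for j in selected):
            selected.append(int(i))
    return selected
```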

Prompt Optimization via Adversarial In-Context Learning

no code implementations 5 Dec 2023 Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He

We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompts for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier.

Arithmetic Reasoning Data-to-Text Generation +2
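
A hedged sketch of the adversarial loop described above, with `llm(prompt)` as a hypothetical completion function and the three roles (generator, discriminator, prompt modifier) played by separate calls; the prompts and acceptance rule here are illustrative, not the paper's exact protocol:

```python
def adv_icl(gen_prompt, disc_prompt, task_inputs, llm, rounds=5, n_edits=3):
    def fooled(prompt):
        # Discriminator score: how many generated outputs it labels "real".
        outputs = [llm(prompt + "\nInput: " + x) for x in task_inputs]
        return sum("real" in llm(disc_prompt + "\n" + o).lower() for o in outputs)

    for _ in range(rounds):
        current = fooled(gen_prompt)
        # Prompt modifier proposes edited generator prompts.
        edits = [llm("Rewrite this prompt so its outputs are more convincing:\n"
                     + gen_prompt) for _ in range(n_edits)]
        best = max(edits, key=fooled)
        if fooled(best) > current:  # keep the edit that best fools the discriminator
            gen_prompt = best
    return gen_prompt
```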

InstructCoder: Instruction Tuning Large Language Models for Code Editing

1 code implementation 31 Oct 2023 Kaixin Li, Qisheng Hu, Xu Zhao, Hui Chen, Yuxi Xie, Tiedong Liu, Qizhe Xie, Junxian He

In this work, we explore the use of Large Language Models (LLMs) to edit code based on user instructions.

SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning

2 code implementations 3 Aug 2023 Keyu Duan, Qian Liu, Tat-Seng Chua, Shuicheng Yan, Wei Tsang Ooi, Qizhe Xie, Junxian He

More recently, with the rapid development of language models (LMs), researchers have focused on leveraging LMs to facilitate the learning of textual graphs (TGs), either by jointly training them in a computationally intensive framework (merging the two stages), or by designing complex self-supervised training tasks for feature extraction (enhancing the first stage).

Feature Engineering Graph Learning +3
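
SimTeG's recipe is deliberately simple: first fine-tune an LM on the node texts and take its embeddings as node features, then hand those features to a GNN. A sketch of the second stage, reduced to mean aggregation over a row-normalized adjacency; `text_emb` and `adj` are assumed precomputed, and the real pipeline uses a proper GNN here:

```python
import numpy as np

def propagate(text_emb: np.ndarray, adj: np.ndarray, hops: int = 2) -> np.ndarray:
    """Average each node's features with its neighbors' for `hops` rounds;
    the result can feed any downstream classifier."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    norm_adj = adj / deg              # row-normalized adjacency
    h = text_emb
    for _ in range(hops):
        h = norm_adj @ h              # mean over neighbor features
    return h
```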

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

4 code implementations 25 Jul 2023 I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu

With the above challenges in mind, in this paper, we propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models (e.g., ChatGPT).

Code Generation Fact Checking +1
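
A skeleton of a FacTool-style check, assuming hypothetical `llm` and `search` callables; the real framework is multi-domain and plugs in different tools (search engines, code executors, and so on) per task:

```python
def factuality_check(text, llm, search):
    """Extract atomic claims, generate a verification query for each,
    retrieve evidence with a tool, and ask the LLM for a verdict."""
    claims = llm("List the atomic factual claims, one per line:\n" + text)
    verdicts = {}
    for claim in filter(None, (c.strip() for c in claims.splitlines())):
        query = llm("Write a search query to verify: " + claim)
        evidence = search(query)
        verdicts[claim] = llm(f"Claim: {claim}\nEvidence: {evidence}\n"
                              "Is the claim supported? Answer yes or no.")
    return verdicts
```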

Composing Parameter-Efficient Modules with Arithmetic Operations

2 code implementations 26 Jun 2023 Jinghan Zhang, Shiqi Chen, Junteng Liu, Junxian He

In this paper, we propose to compose these parameter-efficient modules through linear arithmetic operations in the weight space, thereby integrating different module capabilities.

Language Modelling Large Language Model +1
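
The composition itself is exactly what the abstract says: linear arithmetic in weight space. A minimal sketch, treating each parameter-efficient module as a dict of weight deltas:

```python
def compose_modules(modules, coefficients):
    """Weighted sum of modules in weight space. Addition merges
    capabilities; a negative coefficient negates one (e.g. to unlearn
    a behavior). Each module maps parameter names to weight deltas."""
    composed = {}
    for module, coeff in zip(modules, coefficients):
        for name, delta in module.items():
            composed[name] = composed.get(name, 0) + coeff * delta
    return composed

# e.g. merge two skills and subtract a third (module names hypothetical):
# merged = compose_modules([lora_a, lora_b, lora_toxic], [0.5, 0.5, -1.0])
```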

Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs

1 code implementation 22 Jun 2023 Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi

To better break down the problem, we define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency.

Arithmetic Reasoning Benchmarking +1
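
A minimal sketch of the sampling and aggregation components of the framework above (the prompting-strategy component is omitted); `llm` is a hypothetical stochastic completion function, and majority voting is one of several aggregation techniques the paper compares:

```python
from collections import Counter

def consistency_confidence(question, llm, n=10):
    """Draw n responses and use agreement on the most common answer
    as the confidence estimate."""
    answers = [llm(question).strip() for _ in range(n)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n  # confidence = fraction of samples agreeing
```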

Contrastive Learning of Sentence Embeddings from Scratch

2 code implementations 24 May 2023 Junlei Zhang, Zhenzhong Lan, Junxian He

Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings.

Contrastive Learning Natural Language Inference +3
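
The dominant training objective in this line of work is InfoNCE: each sentence should be closest to its own positive among all in-batch candidates. A generic sketch (not the paper's exact recipe) over precomputed embedding matrices:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.05):
    """Contrastive loss over (batch, dim) embedding arrays: row i of
    `anchors` is pulled toward row i of `positives` and pushed away
    from every other row."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # cosine similarity matrix
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))          # match i-th anchor to i-th positive
```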

Evaluating Factual Consistency of Summaries with Large Language Models

2 code implementations 23 May 2023 Shiqi Chen, Siyang Gao, Junxian He

Detecting factual errors in summaries has been an important and challenging subject in summarization research.

Binary Classification Sentence

CodeInstruct: Empowering Language Models to Edit Code

1 code implementation GitHub 2023 Qisheng Hu*, Kaixin Li*, Xu Zhao, Yuxi Xie, Tiedong Liu, Hui Chen, Qizhe Xie, Junxian He

In this work, we explore the use of large language models (LLMs) to edit code based on user instructions, covering a broad range of implicit tasks such as comment insertion, code optimization, and code refactoring.

Automatic Model Selection with Large Language Models for Reasoning

1 code implementation 23 May 2023 James Xu Zhao, Yuxi Xie, Kenji Kawaguchi, Junxian He, Michael Qizhe Xie

Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths.

Arithmetic Reasoning GSM8K +4

C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models

1 code implementation NeurIPS 2023 Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, Junxian He

We present C-Eval, the first comprehensive Chinese evaluation suite designed to assess advanced knowledge and reasoning abilities of foundation models in a Chinese context.

Multiple-choice

Self-Evaluation Guided Beam Search for Reasoning

no code implementations NeurIPS 2023 Yuxi Xie, Kenji Kawaguchi, Yiran Zhao, Xu Zhao, Min-Yen Kan, Junxian He, Qizhe Xie

Stochastic beam search balances exploitation and exploration of the search space with temperature-controlled randomness.

Arithmetic Reasoning GSM8K +3
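
One step of that temperature-controlled stochastic beam search, sketched; `expand` and `score` are hypothetical callables, with `score` standing in for the paper's combination of generation likelihood and self-evaluation:

```python
import numpy as np

def stochastic_beam_step(beams, expand, score, k, temperature=0.5):
    """Sample k continuations in proportion to softmax(score / temperature)
    instead of taking a deterministic top-k: low temperature exploits,
    high temperature explores."""
    candidates = [c for b in beams for c in expand(b)]
    scores = np.array([score(c) for c in candidates]) / temperature
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    idx = np.random.choice(len(candidates), size=min(k, len(candidates)),
                           replace=False, p=probs)
    return [candidates[i] for i in idx]
```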

Mega: Moving Average Equipped Gated Attention

5 code implementations 21 Sep 2022 Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer

The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.

Image Classification Inductive Bias +3
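
Mega's added inductive bias comes from a damped exponential moving average applied along the sequence before the gated attention. A scalar-coefficient sketch (the paper's EMA is multi-dimensional with learned parameters):

```python
import numpy as np

def damped_ema(x: np.ndarray, alpha: float, delta: float) -> np.ndarray:
    """h_t = alpha * x_t + (1 - alpha * delta) * h_{t-1}, applied over
    the sequence axis of x with shape (seq_len, dim). The output is a
    recency-biased smoothing of the input embeddings."""
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for t, x_t in enumerate(x):
        h = alpha * x_t + (1.0 - alpha * delta) * h
        out[t] = h
    return out
```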

Non-Parametric Temporal Adaptation for Social Media Topic Classification

no code implementations 13 Sep 2022 FatemehSadat Mireshghallah, Nikolai Vogler, Junxian He, Omar Florez, Ahmed El-Kishky, Taylor Berg-Kirkpatrick

User-generated social media data is constantly changing as new trends influence online discussion and personal information is deleted due to privacy concerns.

Classification Retrieval +1

Prompt Consistency for Zero-Shot Task Generalization

1 code implementation 29 Apr 2022 Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig

One of the most impressive results of recent NLP history is the ability of pre-trained language models to solve new tasks in a zero-shot setting.

Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

2 code implementations 28 Jan 2022 Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig

Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time.

Language Modelling Retrieval
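
The retrieval-based LM being accelerated here follows the standard kNN-LM formulation: the next-token distribution is a mixture of the parametric LM and a distribution induced by retrieved neighbors (the paper's contribution is approximating the retrieval side with an automaton). A sketch of that mixture:

```python
import numpy as np

def knn_distribution(distances, neighbor_tokens, vocab_size, temperature=1.0):
    """Turn retrieved (distance, next-token) pairs into a distribution:
    softmax over negative distances, mass accumulated per token."""
    w = np.exp(-np.asarray(distances, dtype=float) / temperature)
    w /= w.sum()
    p = np.zeros(vocab_size)
    for weight, tok in zip(w, neighbor_tokens):
        p[tok] += weight
    return p

def knn_lm_prob(p_lm, p_knn, lam=0.25):
    """Interpolate the parametric LM with the retrieval distribution."""
    return (1.0 - lam) * p_lm + lam * p_knn
```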

Towards a Unified View of Parameter-Efficient Transfer Learning

1 code implementation ICLR 2022 Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig

Furthermore, our unified framework enables the transfer of design elements across different approaches, and as a result we are able to instantiate new parameter-efficient fine-tuning methods that tune fewer parameters than previous methods while being more effective, achieving comparable results to fine-tuning all parameters on all four tasks.

Machine Translation text-classification +3
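
The unified form the paper derives casts adapters, prefix tuning, and LoRA as variants of one modification, roughly delta_h = s * f(h @ W_down) @ W_up, differing in the function f, the insertion point, and how the scale s is computed. A sketch with ReLU standing in for f (LoRA uses the identity):

```python
import numpy as np

def peft_delta(h, w_down, w_up, s=1.0):
    """Down-project the hidden state, apply a nonlinearity, up-project,
    and scale; the result is added to the original hidden state."""
    return s * np.maximum(0.0, h @ w_down) @ w_up

# h_new = h + peft_delta(h, w_down, w_up, s=0.1)  # parallel insertion
```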

Capturing Structural Locality in Non-parametric Language Models

no code implementations ICLR 2022 Frank F. Xu, Junxian He, Graham Neubig, Vincent J. Hellendoorn

Structural locality is a ubiquitous feature of real-world datasets, wherein data points are organized into local hierarchies.

Dependency Induction Through the Lens of Visual Perception

1 code implementation CoNLL (EMNLP) 2021 Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk, Graham Neubig

Our experiments find that concreteness is a strong indicator for learning dependency grammars, improving the direct attachment score (DAS) by over 50% as compared to state-of-the-art models trained on pure text.

Constituency Grammar Induction Dependency Parsing

Efficient Nearest Neighbor Language Models

2 code implementations EMNLP 2021 Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick

Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore, which allows them to learn through explicitly memorizing the training datapoints.

Domain Adaptation Language Modelling +1

CTRLsum: Towards Generic Controllable Text Summarization

1 code implementation 8 Dec 2020 Junxian He, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong

Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or descriptive prompts.

Descriptive Reading Comprehension +1

Learning Sparse Prototypes for Text Generation

1 code implementation NeurIPS 2020 Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig

While effective, these methods are inefficient at test time as a result of needing to store and index the entire training corpus.

Language Modelling Prototype Selection +4

A Probabilistic Formulation of Unsupervised Text Style Transfer

5 code implementations ICLR 2020 Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick

Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the state-of-the-art unsupervised machine translation techniques that our approach generalizes.

Decipherment Language Modelling +6

Revisiting Self-Training for Neural Sequence Generation

1 code implementation ICLR 2020 Junxian He, Jiatao Gu, Jiajun Shen, Marc'Aurelio Ranzato

In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks.

Machine Translation Text Summarization +1
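
The self-training loop being revisited, sketched: a supervised model pseudo-labels unlabeled inputs, then a model is retrained on the union. `model` is assumed callable on an input and `train_fn` returns a trained model; the paper's finding is that injecting noise into this loop is what makes it work well for sequence generation:

```python
def self_train(model, train_fn, labeled, unlabeled, rounds=3):
    """Alternate pseudo-labeling and retraining."""
    for _ in range(rounds):
        pseudo = [(x, model(x)) for x in unlabeled]  # pseudo-label the pool
        model = train_fn(labeled + pseudo)           # retrain on real + pseudo pairs
    return model
```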

The Source-Target Domain Mismatch Problem in Machine Translation

no code implementations EACL 2021 Jiajun Shen, Peng-Jen Chen, Matt Le, Junxian He, Jiatao Gu, Myle Ott, Michael Auli, Marc'Aurelio Ranzato

While we live in an increasingly interconnected world, different places still exhibit strikingly different cultures, and many events we experience in our everyday life pertain only to the specific place we live in.

Machine Translation Translation

Choosing Transfer Languages for Cross-Lingual Learning

1 code implementation ACL 2019 Yu-Hsiang Lin, Chian-Yu Chen, Jean Lee, Zirui Li, Yuyan Zhang, Mengzhou Xia, Shruti Rijhwani, Junxian He, Zhisong Zhang, Xuezhe Ma, Antonios Anastasopoulos, Patrick Littell, Graham Neubig

Cross-lingual transfer, where a high-resource transfer language is used to improve the accuracy of a low-resource task language, is now an invaluable tool for improving performance of natural language processing (NLP) on low-resource languages.

Cross-Lingual Transfer

Lagging Inference Networks and Posterior Collapse in Variational Autoencoders

2 code implementations ICLR 2019 Junxian He, Daniel Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick

The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique.

Text Generation
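
The paper traces posterior collapse to an inference network that lags behind the model posterior early in training, and fixes it by optimizing that network aggressively. A sketch of the schedule, with `enc_step`/`dec_step` as hypothetical closures that each take one optimizer step; the paper's actual stopping criterion (mutual information plateauing) is simplified here to a fixed iteration count:

```python
def aggressive_vae_training(enc_step, dec_step, batches, aggressive_iters=30):
    """During the aggressive phase, the inference network takes many
    gradient steps per decoder step so it stops lagging."""
    for batch in batches:
        for _ in range(aggressive_iters):  # inner loop: encoder updates only
            enc_step(batch)
        dec_step(batch)                    # then one generator/decoder update
```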

Unsupervised Learning of Syntactic Structure with Invertible Neural Projections

1 code implementation EMNLP 2018 Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick

In this work, we propose a novel generative model that jointly learns discrete syntactic structure and continuous word representations in an unsupervised fashion by cascading an invertible neural network with a structured generative prior.

Constituency Grammar Induction POS +1

StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing

7 code implementations ACL 2018 Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig

Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures.

Code Generation Semantic Parsing

Efficient Correlated Topic Modeling with Topic Embedding

no code implementations 1 Jul 2017 Junxian He, Zhiting Hu, Taylor Berg-Kirkpatrick, Ying Huang, Eric P. Xing

Correlated topic modeling has been limited to small model and problem sizes due to high computational cost and poor scaling.

Document Classification General Classification +2

Text Network Exploration via Heterogeneous Web of Topics

no code implementations 2 Oct 2016 Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang

A text network refers to a data type in which each vertex is associated with a text document and the relationships between documents are represented by edges.
