no code implementations • 5 Dec 2023 • Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He
We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompts for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier.
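The abstract describes a three-player setup. Below is a minimal sketch of one adversarial round, assuming a generic `llm(prompt)` completion function; all prompt templates here are illustrative placeholders, not the paper's actual prompts.

```python
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API call here")

def adv_icl_step(task_prompt: str, real_input: str) -> str:
    # Generator: produce an output under the current task prompt.
    generated = llm(f"{task_prompt}\nInput: {real_input}\nOutput:")
    # Discriminator: judge whether the output looks model-generated.
    verdict = llm(
        "Is the following output human-written or model-generated?\n"
        f"{generated}\nAnswer:"
    )
    # Prompt modifier: if the discriminator wins, edit the task prompt
    # so the generator is harder to distinguish next round.
    if "model" in verdict.lower():
        task_prompt = llm(
            "Improve this task prompt so the generated outputs are "
            f"more convincing:\n{task_prompt}\nImproved prompt:"
        )
    return task_prompt
```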
1 code implementation • 31 Oct 2023 • Qisheng Hu, Kaixin Li, Xu Zhao, Yuxi Xie, Tiedong Liu, Hui Chen, Qizhe Xie, Junxian He
In this work, we explore the use of large language models (LLMs) to edit code based on user instructions, covering a broad range of implicit tasks such as comment insertion, code optimization, and code refactoring.
1 code implementation • NeurIPS 2023 • Shiqi Chen, Yiran Zhao, Jinghan Zhang, I-Chun Chern, Siyang Gao, PengFei Liu, Junxian He
In this benchmark, we collect responses generated from LLMs and annotate factuality labels in a fine-grained manner.
2 code implementations • 3 Aug 2023 • Keyu Duan, Qian Liu, Tat-Seng Chua, Shuicheng Yan, Wei Tsang Ooi, Qizhe Xie, Junxian He
More recently, with the rapid development of language models (LMs), researchers have focused on leveraging LMs to facilitate the learning of textual graphs (TGs), either by jointly training them in a computationally intensive framework (merging the two stages), or by designing complex self-supervised training tasks for feature extraction (enhancing the first stage).
Ranked #1 on Node Property Prediction on ogbn-arxiv
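For context, a hedged sketch of the decoupled two-stage pipeline this line of work builds on: extract node features with a frozen pretrained LM, then train an ordinary GNN on them. The model name, mean pooling, and layer sizes are placeholder choices, and `torch_geometric` is assumed to be available.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from torch_geometric.nn import GCNConv

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
lm = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def embed(texts):
    # Stage 1: mean-pool the LM's last hidden state into one vector per node text.
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    return lm(**batch).last_hidden_state.mean(dim=1)

class GCN(torch.nn.Module):
    # Stage 2: a standard two-layer GNN over the frozen LM features.
    def __init__(self, dim_in, dim_hid, n_classes):
        super().__init__()
        self.conv1 = GCNConv(dim_in, dim_hid)
        self.conv2 = GCNConv(dim_hid, n_classes)

    def forward(self, x, edge_index):
        return self.conv2(self.conv1(x, edge_index).relu(), edge_index)
```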
1 code implementation • 25 Jul 2023 • I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, PengFei Liu
With the above challenges in mind, in this paper, we propose FacTool, a task- and domain-agnostic framework for detecting factual errors in texts generated by large language models (e.g., ChatGPT).
2 code implementations • 26 Jun 2023 • Jinghan Zhang, Shiqi Chen, Junteng Liu, Junxian He
In this paper, we propose to compose these parameter-efficient modules through linear arithmetic operations in the weight space, thereby integrating different module capabilities.
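A minimal sketch of linear weight-space arithmetic over parameter-efficient modules (e.g., LoRA deltas) stored as name-to-tensor state dicts; `combine` and `negate` are illustrative operators, not the paper's exact API.

```python
def combine(module_a, module_b, alpha=0.5):
    """Linearly interpolate two modules with matching parameter names."""
    return {name: alpha * module_a[name] + (1 - alpha) * module_b[name]
            for name in module_a}

def negate(module):
    """Negation operator, e.g., to remove a module's skill from a model."""
    return {name: -tensor for name, tensor in module.items()}
```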
no code implementations • 22 Jun 2023 • Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi
The task of empowering large language models (LLMs) to accurately express their confidence, referred to as confidence elicitation, is essential in ensuring reliable and trustworthy decision-making processes.
1 code implementation • 8 Jun 2023 • Cheng Deng, Tianhang Zhang, Zhongmou He, Yi Xu, Qiyuan Chen, Yuanyuan Shi, Luoyi Fu, Weinan Zhang, Xinbing Wang, Chenghu Zhou, Zhouhan Lin, Junxian He
Large language models (LLMs) have achieved great success in general domains of natural language processing.
2 code implementations • 24 May 2023 • Junlei Zhang, Zhenzhong Lan, Junxian He
Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings.
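For reference, a SimCSE-style contrastive objective is the standard recipe in this line of work; the sketch below assumes two embedding views `z1`, `z2` of the same batch of sentences (e.g., from two dropout passes).

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.05):
    """z1, z2: (batch, dim) embeddings of the same sentences under two views."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.T / temperature                         # pairwise similarities
    labels = torch.arange(z1.size(0), device=sim.device)  # positives on diagonal
    return F.cross_entropy(sim, labels)
```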
1 code implementation • 23 May 2023 • James Xu Zhao, Yuxi Xie, Kenji Kawaguchi, Junxian He, Michael Qizhe Xie
Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths.
Ranked #1 on Math Word Problem Solving on SVAMP
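A hedged sketch of one plausible selection scheme between the two methods: run both and, on disagreement, ask the model to adjudicate. The prompts and `run_sandboxed` are hypothetical placeholders, not the paper's exact procedure.

```python
def solve_cot(question, llm):
    # Chain of thought: free-form step-by-step reasoning.
    return llm(f"Q: {question}\nLet's think step by step.\nA:")

def solve_pal(question, llm):
    # Program-aided: generate code, then execute it for the answer.
    code = llm(f"# Write Python code that answers: {question}\n")
    return run_sandboxed(code)  # hypothetical sandboxed executor

def select_answer(question, llm):
    answer_cot = solve_cot(question, llm)
    answer_pal = solve_pal(question, llm)
    if answer_cot == answer_pal:
        return answer_cot
    choice = llm(
        f"Question: {question}\n"
        f"Solution A (chain of thought): {answer_cot}\n"
        f"Solution B (program-aided): {answer_pal}\n"
        "Which solution is correct? Answer A or B:"
    )
    return answer_cot if choice.strip().startswith("A") else answer_pal
```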
2 code implementations • 23 May 2023 • Shiqi Chen, Siyang Gao, Junxian He
Detecting factual errors in summaries has been an important and challenging subject in summarization research.
1 code implementation • NeurIPS 2023 • Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, Junxian He
We present C-Eval, the first comprehensive Chinese evaluation suite designed to assess advanced knowledge and reasoning abilities of foundation models in a Chinese context.
5 code implementations • 21 Sep 2022 • Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer
The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.
Ranked #1 on Long-range modeling on LRA
no code implementations • 13 Sep 2022 • FatemehSadat Mireshghallah, Nikolai Vogler, Junxian He, Omar Florez, Ahmed El-Kishky, Taylor Berg-Kirkpatrick
User-generated social media data is constantly changing as new trends influence online discussion and personal information is deleted due to privacy concerns.
1 code implementation • 29 Apr 2022 • Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig
One of the most impressive results of recent NLP history is the ability of pre-trained language models to solve new tasks in a zero-shot setting.
2 code implementations • 28 Jan 2022 • Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig
Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time.
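The combination the abstract describes is typically a linear interpolation of two next-token distributions, one parametric and one from the retrieved neighbors; a sketch, with the mixing weight `lam` treated as a tuned hyperparameter:

```python
import torch

def knn_distribution(distances, token_ids, vocab_size, temperature=1.0):
    """distances: (k,) floats; token_ids: (k,) long tensor of the tokens
    that followed each retrieved context in the datastore."""
    weights = torch.softmax(-distances / temperature, dim=0)
    p_knn = torch.zeros(vocab_size)
    p_knn.scatter_add_(0, token_ids, weights)  # mass on retrieved tokens
    return p_knn

def interpolate(p_lm, p_knn, lam=0.25):
    """Final distribution: lam * p_knn + (1 - lam) * p_lm."""
    return lam * p_knn + (1 - lam) * p_lm
```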
3 code implementations • ICLR 2022 • Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig
Furthermore, our unified framework enables the transfer of design elements across different approaches, and as a result we are able to instantiate new parameter-efficient fine-tuning methods that tune fewer parameters than previous methods while being more effective, achieving comparable results to fine-tuning all parameters on all four tasks.
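One concrete instantiation from this design space is a scaled parallel adapter; the sketch below is illustrative, with the bottleneck size and scale as placeholder hyperparameters.

```python
import torch

class ParallelAdapter(torch.nn.Module):
    def __init__(self, sublayer, dim, bottleneck=64, scale=4.0):
        super().__init__()
        self.sublayer = sublayer  # frozen pretrained sublayer (e.g., FFN)
        self.down = torch.nn.Linear(dim, bottleneck)
        self.up = torch.nn.Linear(bottleneck, dim)
        self.scale = scale

    def forward(self, x):
        # The trainable bottleneck branch runs in parallel with the
        # original sublayer; only down/up projections are updated.
        return self.sublayer(x) + self.scale * self.up(torch.relu(self.down(x)))
```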
no code implementations • ICLR 2022 • Frank F. Xu, Junxian He, Graham Neubig, Vincent J. Hellendoorn
Structural locality is a ubiquitous feature of real-world datasets, wherein data points are organized into local hierarchies.
1 code implementation • CoNLL (EMNLP) 2021 • Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk, Graham Neubig
Our experiments find that concreteness is a strong indicator for learning dependency grammars, improving the direct attachment score (DAS) by over 50% as compared to state-of-the-art models trained on pure text.
2 code implementations • EMNLP 2021 • Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore, which allows them to learn through explicitly memorizing the training datapoints.
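Explicit memorization here usually means a key-value datastore of (context representation, next token) pairs built over the training set, searched with nearest neighbors at test time; a sketch, where `training_pairs` is a hypothetical iterator over LM hidden states:

```python
import numpy as np

def build_datastore(training_pairs):
    """training_pairs: iterable of (hidden_state, next_token_id) pairs
    collected by running the LM over the training corpus."""
    keys, values = [], []
    for hidden_state, next_token in training_pairs:
        keys.append(hidden_state)   # (dim,) context vector as the key
        values.append(next_token)   # id of the token that followed
    # Index the keys with FAISS or similar for nearest-neighbor search.
    return np.stack(keys), np.array(values)
```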
1 code implementation • 8 Dec 2020 • Junxian He, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong
Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or descriptive prompts.
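A sketch of the input format such keyword control implies: prepend the control keywords to the source document before encoding. The separator tokens below follow public CTRLsum checkpoints but should be treated as an assumption.

```python
def build_input(keywords, document, sep=" => "):
    """Prepend control keywords to the source document."""
    return " | ".join(keywords) + sep + document

# Usage with any seq2seq summarizer fine-tuned on this format:
# summary = summarizer(build_input(["earnings", "Q3 guidance"], article))
```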
3 code implementations • EMNLP 2020 • Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei LI
Pre-trained contextual representations like BERT have achieved great success in natural language processing.
Ranked #16 on Semantic Textual Similarity on STS16
1 code implementation • NeurIPS 2020 • Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig
While effective, these methods are inefficient at test time as a result of needing to store and index the entire training corpus.
5 code implementations • ICLR 2020 • Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick
Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the state-of-the-art unsupervised machine translation techniques that our approach generalizes.
1 code implementation • ICLR 2020 • Junxian He, Jiatao Gu, Jiajun Shen, Marc'Aurelio Ranzato
In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks.
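A sketch of the generic self-training loop for sequence generation; `train` and `model_init` stand in for your training routine and model constructor, and the schedule is illustrative rather than the paper's exact recipe.

```python
def self_train(model, labeled, unlabeled, rounds=3):
    """labeled: (x, y) pairs; unlabeled: inputs only."""
    for _ in range(rounds):
        # Pseudo-label the unlabeled inputs with the current model.
        pseudo = [(x, model.generate(x)) for x in unlabeled]
        model = train(model_init(), pseudo)  # fit the pseudo-parallel data
        model = train(model, labeled)        # then fine-tune on real pairs
    return model
```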
no code implementations • EACL 2021 • Jiajun Shen, Peng-Jen Chen, Matt Le, Junxian He, Jiatao Gu, Myle Ott, Michael Auli, Marc'Aurelio Ranzato
While we live in an increasingly interconnected world, different places still exhibit strikingly different cultures, and many events we experience in our everyday life pertain only to the specific place we live in.
1 code implementation • IJCNLP 2019 • Bohan Li, Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick, Yiming Yang
In this paper, we investigate a simple fix for posterior collapse which yields surprisingly effective results.
1 code implementation • ACL 2019 • Junxian He, Zhisong Zhang, Taylor Berg-Kirkpatrick, Graham Neubig
The parameters of the source and target models are softly shared through a regularized log likelihood objective.
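A minimal sketch of that objective: the target model's negative log likelihood plus an L2 penalty toward the source model's fixed parameters. The penalty weight `lam` is a placeholder hyperparameter.

```python
def regularized_nll(nll, target_model, source_params, lam=1e-3):
    """nll: target-task loss; source_params: dict of frozen source tensors."""
    penalty = sum(
        ((p - source_params[name]) ** 2).sum()
        for name, p in target_model.named_parameters()
    )
    return nll + lam * penalty  # soft sharing: pull target toward source
```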
1 code implementation • ACL 2019 • Yu-Hsiang Lin, Chian-Yu Chen, Jean Lee, Zirui Li, Yuyan Zhang, Mengzhou Xia, Shruti Rijhwani, Junxian He, Zhisong Zhang, Xuezhe Ma, Antonios Anastasopoulos, Patrick Littell, Graham Neubig
Cross-lingual transfer, where a high-resource transfer language is used to improve the accuracy of a low-resource task language, is now an invaluable tool for improving performance of natural language processing (NLP) on low-resource languages.
2 code implementations • ICLR 2019 • Junxian He, Daniel Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick
The variational autoencoder (VAE) is a popular combination of a deep latent variable model and an accompanying variational learning technique.
Ranked #1 on Text Generation on Yahoo Questions
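For reference, the variational objective in question is the ELBO; a sketch of its closed-form KL term for a diagonal-Gaussian posterior q(z|x) = N(mu, sigma^2) against a standard normal prior:

```python
import torch

def elbo(recon_log_prob, mu, logvar):
    """recon_log_prob: log p(x|z) per example; mu, logvar: posterior params."""
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
    kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(dim=-1)
    return recon_log_prob - kl  # maximize this lower bound on log p(x)
```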
3 code implementations • ACL 2019 • Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wangrong Zhu, Devendra Singh Sachan, Eric P. Xing
The versatile toolkit also fosters technique sharing across different text generation tasks.
1 code implementation • EMNLP 2018 • Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick
In this work, we propose a novel generative model that jointly learns discrete syntactic structure and continuous word representations in an unsupervised fashion by cascading an invertible neural network with a structured generative prior.
no code implementations • WS 2018 • Zhiting Hu, Zichao Yang, Tiancheng Zhao, Haoran Shi, Junxian He, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Lianhui Qin, Devendra Singh Chaplot, Bowen Tan, Xingjiang Yu, Eric Xing
The features make Texar particularly suitable for technique sharing and generalization across different text generation applications.
7 code implementations • ACL 2018 • Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig
Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures.
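A tiny worked example of the setting, with an illustrative node type (not the paper's transition system): the utterance "sort my_list in descending order" as a tree-structured MR.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

# Tree MR for: "sort my_list in descending order"
mr = Node("Call", [
    Node("func:sorted"),
    Node("arg:my_list"),
    Node("kwarg:reverse=True"),
])
```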
no code implementations • 1 Jul 2017 • Junxian He, Zhiting Hu, Taylor Berg-Kirkpatrick, Ying Huang, Eric P. Xing
Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling.
no code implementations • 2 Oct 2016 • Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang
A text network refers to a data type in which each vertex is associated with a text document and relationships between documents are represented by edges.
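A sketch of the data structure, using networkx purely for illustration: each vertex carries a document, and each edge records a relationship between documents.

```python
import networkx as nx

G = nx.Graph()
G.add_node(0, text="Deep generative models for text ...")
G.add_node(1, text="Variational inference for topic models ...")
G.add_edge(0, 1)  # e.g., a citation or hyperlink between the two documents
```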