no code implementations • 14 Apr 2025 • Junlei Zhang, Zichen Ding, Chang Ma, Zijie Chen, Qiushi Sun, Zhenzhong Lan, Junxian He
For instance, multimodal mathematical reasoning enhances performance on AndroidWorld by an absolute 6.3%.
1 code implementation • 11 Apr 2025 • Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Qiushi Sun, Kanzhi Cheng, Junxian He, Jun Liu, Zhiyong Wu
This motivates us to enhance LLM reasoning without the need for external supervision.
1 code implementation • 27 Mar 2025 • Xiaoye Qu, Yafu Li, Zhaochen Su, Weigao Sun, Jianhao Yan, Dongrui Liu, Ganqu Cui, Daizong Liu, Shuxian Liang, Junxian He, Peng Li, Wei Wei, Jing Shao, Chaochao Lu, Yue Zhang, Xian-Sheng Hua, BoWen Zhou, Yu Cheng
Recent Large Reasoning Models (LRMs), such as DeepSeek-R1 and OpenAI o1, have demonstrated strong performance gains by scaling up the length of Chain-of-Thought (CoT) reasoning during inference.
1 code implementation • 24 Mar 2025 • Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He
DeepSeek-R1 has shown that long chain-of-thought (CoT) reasoning can naturally emerge through a simple reinforcement learning (RL) framework with rule-based rewards, where training may start directly from the base models, a paradigm referred to as zero RL training.
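A minimal sketch of what a rule-based reward can look like in this setting: a fixed string-matching check on the final answer, with no learned reward model. The function name, regex, and exact-match criterion are illustrative assumptions, not the paper's implementation.

```python
import re

def rule_based_reward(response: str, gold_answer: str) -> float:
    """Reward 1.0 if the final \\boxed{...} answer matches the reference,
    0.0 otherwise; no learned reward model is involved."""
    match = re.search(r"\\boxed\{([^}]*)\}", response)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == gold_answer.strip() else 0.0

# The reward looks only at the final answer, not the reasoning chain.
print(rule_based_reward(r"... so the result is \boxed{42}", "42"))  # 1.0
```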
1 code implementation • 24 Mar 2025 • Junteng Liu, Weihao Zeng, Xiwen Zhang, Yijun Wang, Zifei Shan, Junxian He
Chart understanding requires models to effectively analyze and reason about numerical data, textual elements, and complex visual components.
no code implementations • 14 Mar 2025 • Bryan Wilie, Samuel Cahyawijaya, Junxian He, Pascale Fung
We utilize ILO to investigate the impact of single-language fine-tuning on the interlingual representations in multilingual LLMs.
1 code implementation • 3 Mar 2025 • Shiqi Chen, Tongyao Zhu, Ruochen Zhou, Jinghan Zhang, Siyang Gao, Juan Carlos Niebles, Mor Geva, Junxian He, Jiajun Wu, Manling Li
By tracing the attention distribution over the image throughout intermediate layers, we observe that successful spatial reasoning correlates strongly with the model's ability to align its attention with actual object locations, and that this alignment differs notably between familiar and unfamiliar spatial relationships.
1 code implementation • 2 Mar 2025 • Kashun Shum, Yuzhen Huang, Hongjian Zou, Ding Qi, Yixuan Liao, Xiaoxin Chen, Qian Liu, Junxian He
Through comprehensive experiments with 1B and 3B parameter models, we demonstrate that models trained on 30B tokens selected with PreSelect surpass the performance of a vanilla baseline trained on 300B tokens, achieving a 10x reduction in compute requirements.
1 code implementation • 11 Feb 2025 • Junlong Li, Daya Guo, Dejian Yang, Runxin Xu, Yu Wu, Junxian He
Reasoning is a fundamental capability of Large Language Models.
no code implementations • 27 Dec 2024 • Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu
To address these challenges, we propose OS-Genesis, a novel GUI data synthesis pipeline that reverses the conventional trajectory collection process.
1 code implementation • 23 Dec 2024 • Weihao Zeng, Yuzhen Huang, Lulu Zhao, Yijun Wang, Zifei Shan, Junxian He
In the absence of extensive human-annotated data for complex reasoning tasks, self-improvement -- where models are trained on their own outputs -- has emerged as a primary method for enhancing performance.
no code implementations • 23 Dec 2024 • Wei Liu, Junlong Li, Xiwen Zhang, Fan Zhou, Yu Cheng, Junxian He
Our analysis leads to a set of best practices for each factor, aimed at optimizing multimodal reasoning.
1 code implementation • 22 Oct 2024 • Chang Ma, Haiteng Zhao, Junlei Zhang, Junxian He, Lingpeng Kong
Large Language Models have demonstrated remarkable abilities in reasoning and planning by breaking down complex problems into sequential steps.
1 code implementation • 11 Jul 2024 • Junteng Liu, Shiqi Chen, Yu Cheng, Junxian He
In this work, we investigate whether a universal truthfulness hyperplane that distinguishes the model's factually correct and incorrect outputs exists within the model.
1 code implementation • 28 Jun 2024 • Bryan Wilie, Samuel Cahyawijaya, Etsuko Ishii, Junxian He, Pascale Fung
We evaluate ~30 LMs across diverse prompting strategies and find that LMs generally struggle to appropriately revise their beliefs in response to new information.
1 code implementation • 18 Jun 2024 • Yuxuan Tong, Xiwen Zhang, Rui Wang, Ruidong Wu, Junxian He
Solving mathematical problems requires advanced reasoning abilities and presents notable challenges for large language models.
Ranked #4 on Natural Questions on TheoremQA (using extra training data)
2 code implementations • 14 Jun 2024 • Wenxuan Ding, Weiqi Wang, Sze Heng Douglas Kwok, Minghao Liu, Tianqing Fang, Jiaxin Bai, Xin Liu, Changlong Yu, Zheng Li, Chen Luo, Qingyu Yin, Bing Yin, Junxian He, Yangqiu Song
Enhancing Language Models' (LMs) ability to understand purchase intentions in E-commerce scenarios is crucial for their effective assistance in various downstream tasks.
1 code implementation • 15 Apr 2024 • Yuzhen Huang, Jinghan Zhang, Zifei Shan, Junxian He
We open-source our compression datasets as well as our data collection pipelines to facilitate future researchers to assess compression properly.
1 code implementation • 3 Mar 2024 • Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He
Large language models (LLMs) frequently hallucinate and produce factual errors, yet our understanding of why they make these errors remains limited.
1 code implementation • 5 Feb 2024 • Zhiyuan Hu, Chumin Liu, Xidong Feng, Yilun Zhao, See-Kiong Ng, Anh Tuan Luu, Junxian He, Pang Wei Koh, Bryan Hooi
In the face of uncertainty, the ability to *seek information* is of fundamental importance.
2 code implementations • 24 Jan 2024 • Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He
Evaluating Large Language Models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications.
1 code implementation • 31 Dec 2023 • Zhouhan Lin, Cheng Deng, Le Zhou, Tianhang Zhang, Yi Xu, Yutong Xu, Zhongmou He, Yuanyuan Shi, Beiya Dai, Yunchong Song, Boyi Zeng, Qiyuan Chen, Yuxun Miao, Bo Xue, Shu Wang, Luoyi Fu, Weinan Zhang, Junxian He, Yunqiang Zhu, Xinbing Wang, Chenghu Zhou
To the best of our knowledge, it is the largest language model for the geoscience domain.
1 code implementation • 25 Dec 2023 • Wei Liu, Weihao Zeng, Keqing He, Yong Jiang, Junxian He
We present deita (short for Data-Efficient Instruction Tuning for Alignment), a series of models fine-tuned from LLaMA and Mistral models using data samples automatically selected with our proposed approach.
1 code implementation • 17 Dec 2023 • Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng, Yue Wu, Wenhai Wang, Junsong Chen, Zhangyue Yin, Xiaozhe Ren, Jie Fu, Junxian He, Wu Yuan, Qi Liu, Xihui Liu, Yu Li, Hao Dong, Yu Cheng, Ming Zhang, Pheng Ann Heng, Jifeng Dai, Ping Luo, Jingdong Wang, Ji-Rong Wen, Xipeng Qiu, Yike Guo, Hui Xiong, Qun Liu, Zhenguo Li
Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.
1 code implementation • 5 Dec 2023 • Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Shieh, Junxian He
We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompts for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier.
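A minimal sketch of one round of this generator / discriminator / prompt-modifier loop, assuming a generic `call_llm` completion function; the prompts and update rule here are illustrative placeholders rather than adv-ICL's actual templates.

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat/completion API here")

def adv_icl_round(task_prompt: str, example_input: str, reference: str) -> str:
    # Generator: produce an output conditioned on the current task prompt.
    generated = call_llm(f"{task_prompt}\nInput: {example_input}\nOutput:")
    # Discriminator: judge which output looks like the real reference.
    verdict = call_llm(
        f"Which output is the reference?\nA: {generated}\nB: {reference}\nAnswer A or B:"
    )
    # Prompt modifier: rewrite the task prompt so the generator fools the discriminator next round.
    return call_llm(
        f"Current prompt:\n{task_prompt}\nDiscriminator verdict: {verdict}\n"
        "Rewrite the prompt so generated outputs become harder to distinguish from references."
    )
```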
2 code implementations • 31 Oct 2023 • Kaixin Li, Qisheng Hu, Xu Zhao, Hui Chen, Yuxi Xie, Tiedong Liu, Qizhe Xie, Junxian He
In this work, we explore the use of Large Language Models (LLMs) to edit code based on user instructions.
1 code implementation • NeurIPS 2023 • Shiqi Chen, Yiran Zhao, Jinghan Zhang, I-Chun Chern, Siyang Gao, PengFei Liu, Junxian He
In this benchmark, we collect responses generated from LLMs and annotate factuality labels in a fine-grained manner.
2 code implementations • 3 Aug 2023 • Keyu Duan, Qian Liu, Tat-Seng Chua, Shuicheng Yan, Wei Tsang Ooi, Qizhe Xie, Junxian He
More recently, with the rapid development of language models (LMs), researchers have focused on leveraging LMs to facilitate the learning of TGs, either by jointly training them in a computationally intensive framework (merging the two stages), or by designing complex self-supervised training tasks for feature extraction (enhancing the first stage).
Ranked #1 on Node Property Prediction on ogbn-arxiv
3 code implementations • 25 Jul 2023 • I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, PengFei Liu
With the above challenges in mind, in this paper, we propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models (e.g., ChatGPT).
2 code implementations • 26 Jun 2023 • Jinghan Zhang, Shiqi Chen, Junteng Liu, Junxian He
In this paper, we propose to compose these parameter-efficient modules through linear arithmetic operations in the weight space, thereby integrating different module capabilities.
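A minimal sketch of the linear weight-space arithmetic this describes: each parameter-efficient module is a dict of tensors, and composition is an element-wise weighted sum, with negative weights giving negation. Names such as `combine_modules` and `lora_A` are illustrative.

```python
import torch

def combine_modules(modules, weights):
    """Compose parameter-efficient modules by linear arithmetic in weight space:
    theta = sum_i w_i * theta_i. A negative weight negates a module's capability."""
    return {name: sum(w * m[name] for w, m in zip(weights, modules))
            for name in modules[0]}

# Toy example with two modules holding a single tensor each.
m1 = {"lora_A": torch.ones(2, 2)}
m2 = {"lora_A": torch.full((2, 2), 3.0)}
print(combine_modules([m1, m2], [0.5, 0.5])["lora_A"])  # all entries 2.0
```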
1 code implementation • 22 Jun 2023 • Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi
To better break down the problem, we define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency.
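For the sampling and aggregation components, a minimal consistency-based sketch (the majority answer and the fraction of samples that agree with it); eliciting verbalized confidence would additionally ask the model to state a probability in the prompt. Function and variable names are illustrative.

```python
from collections import Counter

def consistency_confidence(sampled_answers):
    """Aggregate several sampled answers into one prediction plus a
    consistency-based confidence (fraction of samples that agree)."""
    answer, freq = Counter(sampled_answers).most_common(1)[0]
    return answer, freq / len(sampled_answers)

# Toy example: five sampled responses, four of which agree.
print(consistency_confidence(["42", "42", "41", "42", "42"]))  # ('42', 0.8)
```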
1 code implementation • 8 Jun 2023 • Cheng Deng, Tianhang Zhang, Zhongmou He, Yi Xu, Qiyuan Chen, Yuanyuan Shi, Luoyi Fu, Weinan Zhang, Xinbing Wang, Chenghu Zhou, Zhouhan Lin, Junxian He
Large language models (LLMs) have achieved great success in general domains of natural language processing.
2 code implementations • 24 May 2023 • Junlei Zhang, Zhenzhong Lan, Junxian He
Contrastive learning has been the dominant approach to train state-of-the-art sentence embeddings.
1 code implementation • GitHub 2023 • Qisheng Hu*, Kaixin Li*, Xu Zhao, Yuxi Xie, Tiedong Liu, Hui Chen, Qizhe Xie, Junxian He
In this work, we explore the use of large language models (LLMs) to edit code based on user instructions, covering a broad range of implicit tasks such as comment insertion, code optimization, and code refactoring.
2 code implementations • 23 May 2023 • Shiqi Chen, Siyang Gao, Junxian He
Detecting factual errors in summaries has been an important and challenging subject in summarization research.
1 code implementation • 23 May 2023 • James Xu Zhao, Yuxi Xie, Kenji Kawaguchi, Junxian He, Michael Qizhe Xie
Chain-of-Thought (CoT) and Program-Aided Language Models (PAL) represent two distinct reasoning methods, each with its own strengths.
Ranked #2 on Math Word Problem Solving on SVAMP
1 code implementation • NeurIPS 2023 • Yuzhen Huang, Yuzhuo Bai, Zhihao Zhu, Junlei Zhang, Jinghan Zhang, Tangjun Su, Junteng Liu, Chuancheng Lv, Yikai Zhang, Jiayi Lei, Yao Fu, Maosong Sun, Junxian He
We present C-Eval, the first comprehensive Chinese evaluation suite designed to assess advanced knowledge and reasoning abilities of foundation models in a Chinese context.
no code implementations • NeurIPS 2023 • Yuxi Xie, Kenji Kawaguchi, Yiran Zhao, Xu Zhao, Min-Yen Kan, Junxian He, Qizhe Xie
Stochastic beam search balances exploitation and exploration of the search space with temperature-controlled randomness.
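A minimal sketch of a temperature-controlled stochastic beam step: candidates are sampled without replacement in proportion to exp(score / temperature) rather than kept deterministically by top-k. Only the exploration mechanism is shown; how the scores themselves are computed (e.g., combining generation probability with self-evaluation) is left abstract.

```python
import math
import random

def stochastic_beam_step(candidates, beam_size, temperature):
    """Sample `beam_size` candidates without replacement, weighted by
    exp(score / temperature). Low temperature approaches ordinary beam search;
    high temperature explores lower-scoring continuations more often."""
    pool = list(candidates)  # list of (sequence, score) pairs
    chosen = []
    for _ in range(min(beam_size, len(pool))):
        weights = [math.exp(score / temperature) for _, score in pool]
        idx = random.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(idx))
    return chosen

print(stochastic_beam_step([("a", 0.9), ("b", 0.5), ("c", 0.1)], beam_size=2, temperature=1.0))
```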
7 code implementations • 21 Sep 2022 • Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer
The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.
Ranked #1 on ListOps
no code implementations • 13 Sep 2022 • FatemehSadat Mireshghallah, Nikolai Vogler, Junxian He, Omar Florez, Ahmed El-Kishky, Taylor Berg-Kirkpatrick
User-generated social media data is constantly changing as new trends influence online discussion and personal information is deleted due to privacy concerns.
1 code implementation • 29 Apr 2022 • Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig
One of the most impressive results of recent NLP history is the ability of pre-trained language models to solve new tasks in a zero-shot setting.
2 code implementations • 28 Jan 2022 • Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig
Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time.
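A minimal sketch of this combination in the style of kNN-LM interpolation: the next-token distribution is a weighted mixture of the parametric LM's distribution and a distribution built from retrieved neighbours. The mixing weight and toy vocabulary are illustrative.

```python
import numpy as np

def retrieval_lm_distribution(p_lm, p_knn, lam=0.25):
    """Interpolate the parametric LM distribution with a distribution formed
    from nearest neighbours retrieved from an external datastore."""
    return lam * p_knn + (1.0 - lam) * p_lm

p_lm = np.array([0.7, 0.2, 0.1])   # parametric LM over a toy 3-token vocabulary
p_knn = np.array([0.1, 0.8, 0.1])  # e.g., softmax over negative neighbour distances
print(retrieval_lm_distribution(p_lm, p_knn))  # [0.55 0.35 0.1 ]
```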
1 code implementation • ICLR 2022 • Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig
Furthermore, our unified framework enables the transfer of design elements across different approaches, and as a result we are able to instantiate new parameter-efficient fine-tuning methods that tune fewer parameters than previous methods while being more effective, achieving comparable results to fine-tuning all parameters on all four tasks.
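One design point expressible in such a unified view is a scaled parallel adapter: a low-rank bottleneck applied in parallel to a frozen sublayer, with its output scaled and added back. This sketch and its hyperparameters (bottleneck size, scaling factor) are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ScaledParallelAdapter(nn.Module):
    """h <- sublayer(x) + scale * W_up(ReLU(W_down(x))); only the adapter is trained."""

    def __init__(self, sublayer: nn.Module, dim: int, bottleneck: int, scale: float = 4.0):
        super().__init__()
        self.sublayer = sublayer  # frozen pretrained sublayer (e.g., an FFN block)
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.sublayer(x) + self.scale * self.up(torch.relu(self.down(x)))

# Wrap an existing feed-forward block and train only `down` / `up`.
block = ScaledParallelAdapter(nn.Linear(16, 16), dim=16, bottleneck=4)
print(block(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
```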
no code implementations • ICLR 2022 • Frank F. Xu, Junxian He, Graham Neubig, Vincent J. Hellendoorn
Structural locality is a ubiquitous feature of real-world datasets, wherein data points are organized into local hierarchies.
1 code implementation • CoNLL (EMNLP) 2021 • Ruisi Su, Shruti Rijhwani, Hao Zhu, Junxian He, Xinyu Wang, Yonatan Bisk, Graham Neubig
Our experiments find that concreteness is a strong indicator for learning dependency grammars, improving the direct attachment score (DAS) by over 50% as compared to state-of-the-art models trained on pure text.
2 code implementations • EMNLP 2021 • Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore, which allows them to learn through explicitly memorizing the training datapoints.
1 code implementation • 8 Dec 2020 • Junxian He, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong
Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or descriptive prompts.
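A minimal sketch of this kind of textual control interface: keywords are plain text joined to the source article with a separator, so the same summarizer can be steered at inference time without retraining. The separator tokens and function name are illustrative, not the system's actual input format.

```python
def build_controlled_input(keywords, source, keyword_sep=" | ", doc_sep=" => "):
    """Prepend user-supplied keywords to the source document as a plain-text
    control signal for a keyword-conditioned summarizer."""
    return keyword_sep.join(keywords) + doc_sep + source

article = "The city council approved the new transit budget after a lengthy debate ..."
print(build_controlled_input(["city council", "transit budget"], article))
```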
3 code implementations • EMNLP 2020 • Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li
Pre-trained contextual representations like BERT have achieved great success in natural language processing.
Ranked #16 on Semantic Textual Similarity on STS16
1 code implementation • NeurIPS 2020 • Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig
While effective, these methods are inefficient at test time as a result of needing to store and index the entire training corpus.
5 code implementations • ICLR 2020 • Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick
Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the state-of-the-art unsupervised machine translation techniques that our approach generalizes.
1 code implementation • ICLR 2020 • Junxian He, Jiatao Gu, Jiajun Shen, Marc'Aurelio Ranzato
In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks.
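A minimal sketch of the classic self-training loop this refers to: fit on labeled pairs, pseudo-label unlabeled inputs with the current model, and retrain on the union. `fit` and `generate` are placeholders for the model's training and decoding routines; the paper's noisy variant additionally perturbs this loop.

```python
def self_train(model, labeled, unlabeled, rounds=3):
    """Classic self-training for sequence generation.
    labeled: list of (source, target) pairs; unlabeled: list of source inputs."""
    model.fit(labeled)  # supervised baseline
    for _ in range(rounds):
        pseudo = [(src, model.generate(src)) for src in unlabeled]  # pseudo-labels
        model.fit(labeled + pseudo)  # retrain on real + pseudo-labeled data
    return model
```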
no code implementations • EACL 2021 • Jiajun Shen, Peng-Jen Chen, Matt Le, Junxian He, Jiatao Gu, Myle Ott, Michael Auli, Marc'Aurelio Ranzato
While we live in an increasingly interconnected world, different places still exhibit strikingly different cultures, and many events we experience in our everyday lives pertain only to the specific place we live in.
1 code implementation • IJCNLP 2019 • Bohan Li, Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick, Yiming Yang
In this paper, we investigate a simple fix for posterior collapse which yields surprisingly effective results.
1 code implementation • ACL 2019 • Junxian He, Zhisong Zhang, Taylor Berg-Kirkpatrick, Graham Neubig
The parameters of source model and target model are softly shared through a regularized log likelihood objective.
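A minimal sketch of one way to softly share parameters through a regularized log-likelihood objective: the target model's negative log-likelihood plus an L2 penalty pulling its parameters toward the source model's. The penalty form and coefficient are illustrative assumptions.

```python
import torch

def soft_sharing_loss(nll_target, target_params, source_params, lam=0.1):
    """Target-model NLL plus an L2 penalty toward the (fixed) source parameters;
    `lam` controls how strongly the two models are tied together."""
    penalty = sum(((t - s.detach()) ** 2).sum()
                  for t, s in zip(target_params, source_params))
    return nll_target + lam * penalty
```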
1 code implementation • ACL 2019 • Yu-Hsiang Lin, Chian-Yu Chen, Jean Lee, Zirui Li, Yuyan Zhang, Mengzhou Xia, Shruti Rijhwani, Junxian He, Zhisong Zhang, Xuezhe Ma, Antonios Anastasopoulos, Patrick Littell, Graham Neubig
Cross-lingual transfer, where a high-resource transfer language is used to improve the accuracy of a low-resource task language, is now an invaluable tool for improving performance of natural language processing (NLP) on low-resource languages.
2 code implementations • ICLR 2019 • Junxian He, Daniel Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick
The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique.
Ranked #1 on Text Generation on Yahoo Questions
4 code implementations • ACL 2019 • Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wangrong Zhu, Devendra Singh Sachan, Eric P. Xing
The versatile toolkit also fosters technique sharing across different text generation tasks.
1 code implementation • EMNLP 2018 • Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick
In this work, we propose a novel generative model that jointly learns discrete syntactic structure and continuous word representations in an unsupervised fashion by cascading an invertible neural network with a structured generative prior.
no code implementations • WS 2018 • Zhiting Hu, Zichao Yang, Tiancheng Zhao, Haoran Shi, Junxian He, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Lianhui Qin, Devendra Singh Chaplot, Bowen Tan, Xingjiang Yu, Eric Xing
The features make Texar particularly suitable for technique sharing and generalization across different text generation applications.
7 code implementations • ACL 2018 • Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig
Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures.
no code implementations • 1 Jul 2017 • Junxian He, Zhiting Hu, Taylor Berg-Kirkpatrick, Ying Huang, Eric P. Xing
Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling.
no code implementations • 2 Oct 2016 • Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang
A text network is a data type in which each vertex is associated with a text document and relationships between documents are represented by edges.