no code implementations • 12 Feb 2025 • Jihoon Tack, Jack Lanchantin, Jane Yu, Andrew Cohen, Ilia Kulikov, Janice Lan, Shibo Hao, Yuandong Tian, Jason Weston, Xian Li
We propose Continuous Concept Mixing (CoCoMix), a novel pretraining framework that combines discrete next token prediction with continuous concepts.
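A minimal sketch of the combined objective, with toy stand-in modules throughout (not the paper's model): a concept head predicts a continuous concept vector per position, the prediction is mixed back into the hidden stream, and training sums the discrete next-token loss with a concept loss. Plain MSE stands in for the concept objective here; the paper derives its concept targets from a pretrained sparse autoencoder, which is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoCoMixSketch(nn.Module):
    """Toy sketch: mix predicted continuous concepts into the token stream.
    All modules and shapes are hypothetical stand-ins."""

    def __init__(self, vocab_size=1000, d_model=64, n_concepts=16):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.backbone = nn.GRU(d_model, d_model, batch_first=True)  # stand-in for a transformer
        self.concept_head = nn.Linear(d_model, n_concepts)  # predicts continuous concepts
        self.concept_proj = nn.Linear(n_concepts, d_model)  # maps them back into the stream
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, target_concepts):
        h, _ = self.backbone(self.tok_emb(tokens))            # (B, T, d_model)
        concepts = self.concept_head(h)                       # (B, T, n_concepts)
        h_mixed = h + self.concept_proj(concepts)             # "mix" concepts into hidden states
        logits = self.lm_head(h_mixed)
        lm_loss = F.cross_entropy(                            # discrete next-token prediction
            logits[:, :-1].reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
        concept_loss = F.mse_loss(concepts, target_concepts)  # continuous concept prediction
        return lm_loss + concept_loss

model = CoCoMixSketch()
tokens = torch.randint(0, 1000, (2, 8))
loss = model(tokens, target_concepts=torch.randn(2, 8, 16))
```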
1 code implementation • 6 Feb 2025 • Letian Peng, Chenyang An, Shibo Hao, Chengyu Dong, Jingbo Shang
The generalization of language models (LMs) is the subject of active debate, contrasting their potential for general intelligence with their struggles with basic knowledge composition (e.g., the reversal/transition curse).
2 code implementations • 20 Dec 2024 • Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu
While Direct Preference Optimization (DPO) has shown promise in aligning LLMs with human preferences, it is less suitable for multi-step reasoning tasks because (1) DPO relies on paired preference data, which is not readily available for such tasks, and (2) it treats all tokens uniformly, making it ineffective for credit assignment in multi-step reasoning, which often comes with sparse rewards.
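For context, the standard DPO loss operates on whole-sequence log-probabilities, which is exactly where the credit-assignment problem arises: every token of a response shares one preference signal. A minimal implementation, with illustrative variable names:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss over whole-sequence log-probs: one scalar preference
    signal per response pair, with no per-token credit assignment."""
    chosen = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected = beta * (policy_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(chosen - rejected).mean()

# Toy usage: each entry is the summed log-prob of one full response.
loss = dpo_loss(torch.tensor([-10.0]), torch.tensor([-12.0]),
                torch.tensor([-11.0]), torch.tensor([-11.5]))
```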
1 code implementation • 9 Dec 2024 • Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason Weston, Yuandong Tian
For example, most word tokens serve textual coherence and are not essential for reasoning, while some critical tokens require complex planning and pose significant challenges for LLMs.
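A minimal sketch of reasoning in a continuous latent space: instead of decoding a token at each step, the model's last hidden state is fed back as the next input embedding. The backbone and shapes below are assumed stand-ins, not the paper's architecture.

```python
import torch
import torch.nn as nn

def continuous_thought_rollout(model, input_embs, n_thoughts=4):
    """Append each step's last hidden state as the next input embedding,
    skipping token decoding entirely. `model` maps (B, T, d) -> (B, T, d)."""
    embs = input_embs
    for _ in range(n_thoughts):
        h = model(embs)                       # (B, T, d)
        thought = h[:, -1:, :]                # continuous thought = last hidden state
        embs = torch.cat([embs, thought], 1)  # feed it back instead of a token
    return embs

# Toy usage with a small transformer encoder as a stand-in backbone.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=2)
out = continuous_thought_rollout(backbone, torch.randn(2, 5, 64))
```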
no code implementations • 12 Jun 2024 • Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
This paper takes a step towards building a general world model by introducing Pandora, a hybrid autoregressive-diffusion model that simulates world states by generating videos and allows real-time control with free-text actions.
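Purely as a structural sketch of that control loop, and explicitly not Pandora's architecture: a hypothetical autoregressive cell carries the latent world state, a stand-in text encoder injects free-text actions, and a toy iterative denoiser plays the role of the diffusion decoder that renders each segment.

```python
import torch
import torch.nn as nn

class HybridWorldModelSketch(nn.Module):
    """Structural sketch only: autoregressive latent state + iterative
    denoising decoder + free-text action conditioning. Not Pandora."""

    def __init__(self, d=128):
        super().__init__()
        self.action_enc = nn.Linear(d, d)    # stand-in for a text-action encoder
        self.backbone = nn.GRUCell(d, d)     # autoregressive world-state transition
        self.denoiser = nn.Linear(2 * d, d)  # stand-in for a diffusion denoiser

    def step(self, state, action_emb, n_denoise=4):
        state = self.backbone(self.action_enc(action_emb), state)  # advance world state
        segment = torch.randn_like(state)                          # start from noise
        for _ in range(n_denoise):                                 # iterative "denoising"
            segment = segment - self.denoiser(torch.cat([segment, state], dim=-1))
        return state, segment

wm = HybridWorldModelSketch()
state, segment = wm.step(torch.zeros(1, 128), action_emb=torch.randn(1, 128))
```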
1 code implementation • 9 Jun 2024 • Fangxu Yu, Lai Jiang, Haoqiang Kang, Shibo Hao, Lianhui Qin
To fill this gap, we propose Flow of Reasoning (FoR), an efficient diversity-seeking LLM finetuning method aimed at improving reasoning quality and diversity with minimal data.
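FoR casts multi-step reasoning as flow-based (GFlowNet-style) finetuning; a trajectory-balance objective of the kind such methods optimize is sketched below. The exact objective and reward shaping in the paper may differ.

```python
import torch

def trajectory_balance_loss(log_z, traj_logprob, log_reward):
    """GFlowNet trajectory balance. For autoregressive generation the backward
    policy is trivial, so the loss reduces to
    (log Z + log P_theta(trajectory) - log R(trajectory))^2."""
    return (log_z + traj_logprob - log_reward).pow(2).mean()

# Toy usage with hypothetical values.
log_z = torch.zeros(1, requires_grad=True)  # learnable log partition function
traj_logprob = torch.tensor([-12.3, -9.8])  # summed token log-probs per sampled chain
log_reward = torch.tensor([-1.0, -0.5])     # task reward in log space
loss = trajectory_balance_loss(log_z, traj_logprob, log_reward)
```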
1 code implementation • 8 Apr 2024 • Shibo Hao, Yi Gu, Haotian Luo, Tianyang Liu, Xiyan Shao, Xinyuan Wang, Shuhua Xie, Haodi Ma, Adithya Samavedhi, Qiyue Gao, Zhen Wang, Zhiting Hu
We develop LLM Reasoners, a library for the standardized, modular implementation of existing and new reasoning algorithms under a unified formulation of the search, reward, and world model components.
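That unified formulation can be pictured as three pluggable interfaces. The class and method names below are hypothetical, not the library's actual API:

```python
from abc import ABC, abstractmethod

class WorldModel(ABC):
    @abstractmethod
    def init_state(self): ...
    @abstractmethod
    def step(self, state, action): ...    # returns the next reasoning state

class RewardFunction(ABC):
    @abstractmethod
    def reward(self, state, action): ...  # scores a candidate step

class SearchAlgorithm(ABC):
    @abstractmethod
    def search(self, world: WorldModel, reward: RewardFunction): ...  # e.g. MCTS, beam search

def run_reasoner(world, reward, algo):
    """A reasoning algorithm = a choice of world model + reward + search."""
    return algo.search(world, reward)
```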
3 code implementations • 24 May 2023 • Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, Daisy Zhe Wang, Zhiting Hu
RAP on LLaMA-33B surpasses CoT on GPT-4 with a 33% relative improvement in a plan-generation setting.
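RAP's search component is Monte Carlo Tree Search over reasoning steps, with the LLM serving as both the agent that proposes actions and the world model that simulates their outcomes. A generic MCTS skeleton follows; `propose_actions`, `simulate_step`, and `reward_fn` are caller-supplied placeholders (LLM calls in RAP):

```python
import math, random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def mcts(root, propose_actions, simulate_step, reward_fn, n_iters=100, depth=5):
    """Generic MCTS skeleton in the spirit of RAP (details simplified)."""
    for _ in range(n_iters):
        node = root
        while node.children:                          # selection
            node = max(node.children, key=ucb)
        for action in propose_actions(node.state):    # expansion
            node.children.append(Node(simulate_step(node.state, action), parent=node))
        state, leaf = node.state, node                # simulation: one random rollout
        for _ in range(depth):
            actions = propose_actions(state)
            if not actions:
                break
            state = simulate_step(state, random.choice(actions))
        r = reward_fn(state)
        while leaf is not None:                       # backpropagation
            leaf.visits += 1
            leaf.value += r
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits) if root.children else root
```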
1 code implementation • NeurIPS 2023 • Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
Augmenting large language models (LLMs) with external tools has emerged as a promising approach to solving complex problems.
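ToolkenGPT's core mechanism, as a sketch: each tool gets a learnable "toolken" embedding scored alongside the frozen LM's word vocabulary, so calling a tool is just predicting one more token. Shapes and the initialization below are illustrative.

```python
import torch
import torch.nn as nn

class ToolkenHead(nn.Module):
    """'Tools as tokens': score learnable per-tool embeddings ('toolkens')
    alongside the frozen word vocabulary; predicting one triggers a tool call."""

    def __init__(self, lm_head: nn.Linear, n_tools: int):
        super().__init__()
        self.lm_head = lm_head
        for p in self.lm_head.parameters():   # LM head stays frozen; only toolkens train
            p.requires_grad = False
        self.toolkens = nn.Parameter(0.02 * torch.randn(n_tools, lm_head.in_features))

    def forward(self, hidden):                            # hidden: (B, T, d_model)
        word_logits = self.lm_head(hidden)                # (B, T, vocab)
        tool_logits = hidden @ self.toolkens.T            # (B, T, n_tools)
        return torch.cat([word_logits, tool_logits], -1)  # extended vocabulary

head = ToolkenHead(nn.Linear(64, 1000), n_tools=8)
logits = head(torch.randn(2, 5, 64))   # (2, 5, 1008)
```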
2 code implementations • 28 Jun 2022 • Shibo Hao, Bowen Tan, Kaiwen Tang, Bin Ni, Xiyan Shao, Hengzhe Zhang, Eric P. Xing, Zhiting Hu
The resulting KGs, as a symbolic interpretation of the source LMs, also reveal new insights into the LMs' knowledge capacities.
1 code implementation • ACL 2022 • Quzhe Huang, Shibo Hao, Yuan Ye, Shengqi Zhu, Yansong Feng, Dongyan Zhao
DocRED is a widely used dataset for document-level relation extraction.
2 code implementations • EMNLP 2021 • Tianqing Fang, Weiqi Wang, Sehyun Choi, Shibo Hao, Hongming Zhang, Yangqiu Song, Bin He
Experimental results show that generalizing commonsense reasoning on unseen assertions is inherently a hard task.