Search Results for author: Trung Quoc Luong

Found 3 papers, 2 papers with code

ReFT: Reasoning with Reinforced Fine-Tuning

1 code implementation • 17 Jan 2024 • Trung Quoc Luong, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li

ReFT first warms up the model with SFT and then employs online reinforcement learning, specifically the PPO algorithm in this paper, to further fine-tune the model: an abundance of reasoning paths is automatically sampled for each question, and the rewards are naturally derived from the ground-truth answers.

GSM8K Math +1
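
The ReFT summary above says rewards are derived from the ground-truth answers. A minimal sketch of what such a reward could look like for math problems with numeric answers follows. The function names, the "last number in the text" extraction rule, and the binary 1.0/0.0 scheme are all assumptions made for illustration; the listing does not specify the paper's actual reward function.

```python
import re
from typing import Optional


def extract_final_answer(reasoning_path: str) -> Optional[str]:
    """Return the last number appearing in a sampled reasoning path, if any.

    Assumption: the final answer is the last numeric token in the text.
    Thousands separators such as "1,000" are stripped before matching.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", reasoning_path.replace(",", ""))
    return numbers[-1] if numbers else None


def answer_reward(reasoning_path: str, ground_truth: str) -> float:
    """Hypothetical binary reward: 1.0 if the extracted answer matches
    the ground-truth answer numerically, 0.0 otherwise."""
    predicted = extract_final_answer(reasoning_path)
    if predicted is None:
        return 0.0  # no answer could be extracted from the path
    return 1.0 if abs(float(predicted) - float(ground_truth)) < 1e-6 else 0.0
```

In an online RL loop, a scalar like this would be computed for each sampled reasoning path and passed to the policy-gradient update (PPO in the paper); that loop is omitted here.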

Design of Chain-of-Thought in Math Problem Solving

1 code implementation • 20 Sep 2023 • Zhanming Jie, Trung Quoc Luong, Xinbo Zhang, Xiaoran Jin, Hang Li

We also find that Python is a better choice of language than Wolfram for program CoTs.

GSM8K Math

Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding

no code implementations • 16 May 2023 • Shuwei Feng, Tianyang Zhan, Zhanming Jie, Trung Quoc Luong, Xiaoran Jin

This paper presents GenDoc, a general sequence-to-sequence document understanding model pre-trained with unified masking across three modalities: text, image, and layout.

Decoder document understanding +2
