no code implementations • ACL 2022 • Junlong Li, Yiheng Xu, Lei Cui, Furu Wei
Multimodal pre-training with text, layout, and image has made significant progress for Visually Rich Document Understanding (VRDU), especially for fixed-layout documents such as scanned document images.
1 code implementation • 19 Feb 2024 • Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, PengFei Liu
This paper explores elevating the quality of existing instruction data to better align it with human values, introducing ReAlign, a simple and effective approach that reformats the responses of instruction data to better match pre-established criteria and collated evidence.
1 code implementation • 17 Feb 2024 • Junlong Li, Fan Zhou, Shichao Sun, Yikai Zhang, Hai Zhao, PengFei Liu
As relative quality comparisons of model responses, human and Large Language Model (LLM) preferences serve as common alignment goals in model fine-tuning and as criteria in evaluation.
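Pairwise preferences like those described above are often aggregated into a win rate over response pairs. A minimal sketch, assuming a placeholder judge function (in practice the judge would be a human annotator or an LLM, not the toy heuristic used here):

```python
from typing import Callable, List, Tuple


def win_rate(pairs: List[Tuple[str, str]],
             prefer: Callable[[str, str], bool]) -> float:
    """Fraction of pairs in which the first response is preferred.

    `prefer(a, b)` returns True if response `a` is judged better
    than response `b`; here it is a stand-in for a human or LLM judge.
    """
    if not pairs:
        return 0.0
    wins = sum(1 for a, b in pairs if prefer(a, b))
    return wins / len(pairs)


# Toy judge for illustration only: prefer the longer response.
pairs = [("a detailed answer", "ok"), ("hi", "a thorough reply")]
rate = win_rate(pairs, lambda a, b: len(a) > len(b))  # 1 win out of 2
```

The judge function is deliberately pluggable: swapping the toy length heuristic for a model-based comparator changes the criterion without changing the aggregation.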
1 code implementation • 13 Jan 2024 • Yikai Zhang, Junlong Li, PengFei Liu
Large Language Models (LLMs) are known to have limited extrapolation ability beyond their pre-trained context window, constraining their application in downstream tasks with lengthy inputs.
1 code implementation • 9 Jan 2024 • Shichao Sun, Junlong Li, Weizhe Yuan, Ruifeng Yuan, Wenjie Li, PengFei Liu
In this paper, we pioneer the critique of critique, termed MetaCritique, a framework that evaluates a critique from two aspects: factuality as a precision score and comprehensiveness as a recall score.
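Since the abstract frames factuality as precision and comprehensiveness as recall, the two can be combined F1-style. A minimal sketch, assuming a harmonic-mean combination (the function name and the exact aggregation are illustrative assumptions, not the paper's full scoring pipeline):

```python
def meta_score(precision: float, recall: float) -> float:
    """Combine factuality (precision) and comprehensiveness (recall)
    into one score via the harmonic mean, as in F1.

    Both inputs are assumed to lie in [0, 1]; returns 0.0 when both
    are zero to avoid division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example: a critique that is fully factual but covers half the issues.
score = meta_score(1.0, 0.5)
```

The harmonic mean penalizes imbalance: a critique that is perfectly factual but shallow, or exhaustive but error-ridden, scores well below one that balances both aspects.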
1 code implementation • 20 Oct 2023 • JinYuan Wang, Junlong Li, Hai Zhao
To extend this task further, we formally introduce open-domain multi-hop reasoning (ODMR), which answers multi-hop questions with explicit reasoning steps in an open-domain setting.
1 code implementation • 9 Oct 2023 • Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Hai Zhao, PengFei Liu
The rapid development of Large Language Models (LLMs) has substantially expanded the range of tasks they can address.
1 code implementation • ICCV 2023 • Junlong Li, Bingyao Yu, Yongming Rao, Jie Zhou, Jiwen Lu
The core of our method consists of a global instance assignment strategy and a spatio-temporal enhancement module, which improve the temporal consistency of the features from two aspects.
1 code implementation • CVPR 2023 • Chengkun Wang, Wenzhao Zheng, Junlong Li, Jie Zhou, Jiwen Lu
Learning a generalizable and comprehensive similarity metric to depict the semantic discrepancies between images is the foundation of many computer vision tasks.
1 code implementation • 16 Dec 2022 • Junlong Li, JinYuan Wang, Zhuosheng Zhang, Hai Zhao
This task becomes notably challenging in a zero-shot setting where no data is available to train tailored retrieval-reader models.
2 code implementations • CVPR 2022 • Tianpei Gu, Guangyi Chen, Junlong Li, Chunze Lin, Yongming Rao, Jie Zhou, Jiwen Lu
Human behavior is inherently indeterminate, which requires pedestrian trajectory prediction systems to model the multi-modality of future motion states.
3 code implementations • 4 Mar 2022 • Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei
We leverage DiT as the backbone network in a variety of vision-based Document AI tasks, including document image classification, document layout analysis, table detection as well as text detection for OCR.
Ranked #1 on Table Detection on ICDAR 2019
1 code implementation • ICCV 2021 • Guangyi Chen, Junlong Li, Jiwen Lu, Jie Zhou
Most existing methods learn to predict future trajectories from behavior cues in historical trajectories and interaction cues from the environment.
1 code implementation • ICCV 2021 • Guangyi Chen, Junlong Li, Nuoxing Zhou, Liangliang Ren, Jiwen Lu
In this paper, we present a distribution discrimination (DisDis) method to predict personalized motion patterns by distinguishing the potential distributions.
no code implementations • 10 Feb 2021 • Zhuosheng Zhang, Junlong Li, Hai Zhao
Experimental results on four dialogue comprehension benchmarks show that our proposed model achieves substantial improvements over the baselines.
1 code implementation • 10 Sep 2020 • Junlong Li, Zhuosheng Zhang, Hai Zhao
Pre-trained language models (PrLMs) have achieved great success on a wide range of natural language processing tasks by virtue of the universal language representation ability obtained by self-supervised learning on a large corpus.
no code implementations • 29 Apr 2020 • Junlong Li, Zhuosheng Zhang, Hai Zhao
In this paper, the relevance of each turn to the question is calculated in order to choose key turns.