no code implementations • 31 Mar 2025 • Ziming Cheng, Zhiyuan Huang, Junting Pan, Zhaohui Hou, Mingjie Zhan
Graphical user interface (GUI) automation agents are emerging as powerful tools, enabling humans to accomplish increasingly complex tasks on smart devices.
no code implementations • 5 Mar 2025 • Zhiyuan Huang, Ziming Cheng, Junting Pan, Zhaohui Hou, Mingjie Zhan
While these vision-based GUI agents generally meet the requirements of compatibility and low latency, they tend to have low accuracy due to their limited ability in element grounding.
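For context, "element grounding" here means mapping a natural-language target to a location on the screen. A minimal sketch of that interface follows; the vision_model call and output format are assumptions for illustration, not this paper's actual system.

    # Hypothetical sketch of the element-grounding step: given a screenshot and an
    # instruction, predict where on screen the agent should act.
    from dataclasses import dataclass

    @dataclass
    class ClickTarget:
        x: float  # normalized horizontal coordinate in [0, 1]
        y: float  # normalized vertical coordinate in [0, 1]

    def ground_element(screenshot_png: bytes, instruction: str, vision_model) -> ClickTarget:
        # Map an instruction such as "tap the Settings icon" to a point on the screenshot.
        x, y = vision_model(screenshot_png, instruction)  # hypothetical grounding model
        return ClickTarget(x=x, y=y)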
1 code implementation • 12 Dec 2024 • Delong Liu, Zhaohui Hou, Mingjie Zhan, Shihao Han, Zhicheng Zhao, Fei Su
Recently, diffusion-based video generation models have achieved significant success.
1 code implementation • 10 Oct 2024 • Zimu Lu, Aojun Zhou, Ke Wang, Houxing Ren, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li
Training several popular base models with this corpus significantly improves their mathematical abilities, leading to the creation of the MathCoder2 family of models.
1 code implementation • 30 Jun 2024 • Zimu Lu, Aojun Zhou, Ke Wang, Houxing Ren, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li
Direct Preference Optimization (DPO) has proven effective at improving the performance of large language models (LLMs) on downstream tasks such as reasoning and alignment.
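For orientation, the standard DPO objective (from the original DPO paper, not this work's specific training recipe) can be written in a few lines; a minimal sketch:

    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        # All inputs are summed log-probabilities of whole responses, shape (batch,).
        chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
        rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
        # Maximize the log-sigmoid margin between preferred and dispreferred responses.
        return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()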
1 code implementation • 27 May 2024 • Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Hongsheng Li
Alternatively, some approaches consider character-level infilling but still rely on predicting sub-tokens at inference time; this strategy degrades character-level infilling ability because the model has high perplexity on sub-tokens.
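A small, hypothetical illustration of the sub-token issue described above (the tokenization shown is made up for clarity and is not taken from the paper):

    # If the infilling boundary falls inside a token, the model must predict token
    # fragments ("sub-tokens") it rarely saw during pre-training, which is where the
    # high perplexity comes from. Hypothetical BPE segmentation for illustration:
    line = "result = math.sqrt(value)"
    #   whole-token view:  ["result", " =", " math", ".sqrt", "(value", ")"]
    hole = (14, 18)                 # the user deletes the characters "sqrt"
    prefix, middle, suffix = line[:hole[0]], line[hole[0]:hole[1]], line[hole[1]:]
    print(prefix)                   # "result = math."  -> ends mid-token (".sqrt" was one token)
    print(middle)                   # "sqrt"            -> a sub-token the model must generate alone
    print(suffix)                   # "(value)"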
1 code implementation • 27 May 2024 • Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li
Inspired by this, we present ReflectionCoder, a novel approach that effectively leverages reflection sequences constructed by integrating compiler feedback to improve one-off code generation performance.
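A rough sketch of what constructing a compiler-feedback reflection sequence could look like; the function names, data format, and use of the plain Python interpreter are assumptions for illustration, not the paper's actual pipeline:

    import subprocess, tempfile

    def run_python(code: str) -> str:
        # Execute a candidate solution and capture interpreter output as feedback.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run(["python", path], capture_output=True, text=True, timeout=10)
        return result.stdout + result.stderr

    def build_reflection_sequence(task: str, attempts: list[str]) -> str:
        # Interleave candidate programs with their execution feedback into one training sequence.
        parts = [f"# Task: {task}"]
        for i, code in enumerate(attempts, start=1):
            feedback = run_python(code)
            parts.append(f"# Attempt {i}:\n{code}\n# Feedback: {feedback.strip()}")
        return "\n\n".join(parts)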
no code implementations • 26 Feb 2024 • Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li
We augment the ground-truth solutions of our seed data and train a back-translation model to translate the augmented solutions back into new questions.
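A schematic of the augment-then-back-translate idea described above; the callables and prompts are placeholders, not the paper's actual models or templates:

    def augment_solution(solution: str, llm, n: int = 3) -> list[str]:
        # Rewrite a ground-truth solution into several varied forms with a general LLM.
        prompt = f"Rewrite this solution with different wording and steps:\n{solution}"
        return [llm(prompt) for _ in range(n)]

    def back_translate(solution: str, back_translator) -> str:
        # A model trained on solution-to-question pairs generates a new question
        # that the augmented solution would answer.
        return back_translator(f"Write a question answered by this solution:\n{solution}")

    def expand_seed_data(seed_pairs, llm, back_translator):
        new_pairs = []
        for _question, solution in seed_pairs:
            for aug in augment_solution(solution, llm):
                new_pairs.append((back_translate(aug, back_translator), aug))
        return new_pairs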
1 code implementation • 22 Feb 2024 • Ke Wang, Junting Pan, Weikang Shi, Zimu Lu, Mingjie Zhan, Hongsheng Li
Recent advancements in Large Multimodal Models (LMMs) have shown promising results in mathematical reasoning within visual contexts, with models approaching human-level performance on existing benchmarks such as MathVista.
Ranked #1 on Multimodal Reasoning on MATH-V (using extra training data)
no code implementations • 25 Jan 2024 • Sichun Luo, Yuxuan Yao, Bowei He, Yinya Huang, Aojun Zhou, Xinyi Zhang, Yuanzhang Xiao, Mingjie Zhan, Linqi Song
Conventional recommendation methods have achieved notable advancements by harnessing collaborative or sequential information from user behavior.
1 code implementation • 26 Dec 2023 • Sichun Luo, Bowei He, Haohan Zhao, Wei Shao, Yanlin Qi, Yinya Huang, Aojun Zhou, Yuxuan Yao, Zongpeng Li, Yuanzhang Xiao, Mingjie Zhan, Linqi Song
Large Language Models (LLMs) have demonstrated remarkable capabilities and have been extensively deployed across various domains, including recommender systems.
no code implementations • 29 Oct 2023 • Nan He, Hanyu Lai, Chenyang Zhao, Zirui Cheng, Junting Pan, Ruoyu Qin, Ruofan Lu, Rui Lu, Yunchen Zhang, Gangming Zhao, Zhaohui Hou, Zhiyuan Huang, Shaoqing Lu, Ding Liang, Mingjie Zhan
Based on TeacherLM-7.1B, we augmented 58 NLP datasets and taught various student models of different parameter sizes from the OPT and BLOOM series in a multi-task setting.
1 code implementation • 5 Oct 2023 • Ke Wang, Houxing Ren, Aojun Zhou, Zimu Lu, Sichun Luo, Weikang Shi, Renrui Zhang, Linqi Song, Mingjie Zhan, Hongsheng Li
In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities.
Ranked #6 on Math Word Problem Solving on SVAMP (using extra training data)
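A minimal worked example of the code-for-math-reasoning style of solution the entry above describes; this particular problem and snippet are illustrative only and not taken from the paper:

    # Problem: A train travels 120 km in 1.5 hours. How far does it travel in 4 hours
    # at the same speed? The model writes code instead of free-form arithmetic:
    speed = 120 / 1.5        # km per hour -> 80.0
    distance = speed * 4     # km covered in 4 hours
    print(distance)          # 320.0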
1 code implementation • 15 Aug 2023 • Aojun Zhou, Ke Wang, Zimu Lu, Weikang Shi, Sichun Luo, Zipeng Qin, Shaoqing Lu, Anya Jia, Linqi Song, Mingjie Zhan, Hongsheng Li
We found that its success can be largely attributed to its powerful skills in generating and executing code, evaluating the output of code execution, and rectifying its solution when receiving unreasonable outputs.
Ranked #6 on Math Word Problem Solving on MATH
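A rough sketch of the generate, execute, evaluate, and rectify loop attributed to GPT-4 Code Interpreter above; the helper callables and stopping rule here are assumptions for illustration, not the paper's exact procedure:

    def solve_with_code(problem: str, llm, executor, max_rounds: int = 3) -> str:
        prompt = f"Solve the problem by writing Python code.\nProblem: {problem}"
        output = ""
        for _ in range(max_rounds):
            code = llm(prompt)
            output = executor(code)  # run the generated code and capture stdout/errors
            verdict = llm(f"Problem: {problem}\nCode output: {output}\n"
                          "Is this a reasonable answer? Reply YES or NO.")
            if verdict.strip().upper().startswith("YES"):
                break
            # Feed the unreasonable output back so the next round can rectify the solution.
            prompt += f"\n\nPrevious code:\n{code}\nIts output:\n{output}\nPlease fix the solution."
        return output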
no code implementations • 13 May 2023 • Haochen Tan, Han Wu, Wei Shao, Xinyun Zhang, Mingjie Zhan, Zhaohui Hou, Ding Liang, Linqi Song
Meetings typically involve multiple participants and lengthy conversations, resulting in redundant and trivial content.
1 code implementation • 9 May 2023 • Han Wu, Mingjie Zhan, Haochen Tan, Zhaohui Hou, Ding Liang, Linqi Song
Compared to news and chat summarization, the development of meeting summarization has been severely hindered by the limited availability of data.
1 code implementation • 29 May 2022 • Han Wu, Haochen Tan, Mingjie Zhan, Gangming Zhao, Shaoqing Lu, Ding Liang, Linqi Song
Existing dialogue modeling methods have achieved promising performance on various dialogue tasks with the aid of Transformers and large-scale pre-trained language models.
no code implementations • 10 May 2021 • Zilong Wang, Mingjie Zhan, Houxing Ren, Zhaohui Hou, Yuwei Wu, Xingyan Zhang, Ding Liang
Forms are a common type of document in real life and carry rich information through their textual content and organizational structure.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Zilong Wang, Mingjie Zhan, Xuebo Liu, Ding Liang
The table detection and handcrafted features used in previous works cannot be applied to all forms because of their format requirements.