Search Results for author: Jipeng Zhang

Found 46 papers, 30 papers with code

ExeSQL: Self-Taught Text-to-SQL Models with Execution-Driven Bootstrapping for SQL Dialects

no code implementations22 May 2025 Jipeng Zhang, Haolin Yang, Kehao Miao, Ruiyuan Zhang, Renjie Pi, Jiahui Gao, Xiaofang Zhou

However, real-world applications require SQL generation across multiple dialects with varying syntax and specialized features, which remains a challenge for current models.

Text-to-SQL

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

1 code implementation15 May 2025 Zhaowei Wang, Wenhao Yu, Xiyu Ren, Jipeng Zhang, Yu Zhao, Rohit Saxena, Liang Cheng, Ginny Wong, Simon See, Pasquale Minervini, Yangqiu Song, Mark Steedman

The rapid extension of context windows in large vision-language models has given rise to long-context vision-language models (LCVLMs), which are capable of handling hundreds of images with interleaved text tokens in a single forward pass.

8k Benchmarking +1

DIDS: Domain Impact-aware Data Sampling for Large Language Model Training

no code implementations17 Apr 2025 Weijie Shi, Jipeng Zhang, Yaguang Wu, Jingzhi Fang, Ruiyuan Zhang, Jiajie Xu, Jia Zhu, Hao Chen, Yao Zhao, Sirui Han, Xiaofang Zhou

Large language models (LLMs) are commonly trained on multi-domain datasets, where domain sampling strategies significantly impact model performance due to varying domain importance across downstream tasks.

Dimensionality Reduction Language Modeling +2

Benchmarking Multi-National Value Alignment for Large Language Models

no code implementations17 Apr 2025 Weijie Shi, Chengyi Ju, Chengzhong Liu, Jiaming Ji, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Yaodong Yang, Sirui Han, Yike Guo

Moreover, we demonstrate that NaVAB can be combined with alignment techniques to effectively reduce value concerns by aligning LLMs' values with the target country.

Benchmarking

MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving

1 code implementation5 Mar 2025 Ruida Wang, Rui Pan, Yuxin Li, Jipeng Zhang, Yizhen Jia, Shizhe Diao, Renjie Pi, Junjie Hu, Tong Zhang

To solve these issues, we propose MA-LoT, a Multi-Agent Lean-based Long Chain-of-Thought framework that is, to the best of our knowledge, the first multi-agent framework for Lean4 theorem proving to balance high-level natural language (NL) reasoning and formal language (FL) verification in Long CoT.

Automated Theorem Proving Transfer Learning

Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training

no code implementations5 Feb 2025 Boyao Wang, Rui Pan, Shizhe Diao, Xingyuan Pan, Jipeng Zhang, Renjie Pi, Tong Zhang

Small language models (SLMs) have attracted considerable attention from both academia and industry due to their broad range of applications in edge devices.

Language Modeling Language Modelling +2

SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation

no code implementations13 Dec 2024 Runtao Liu, Chen I Chieh, Jindong Gu, Jipeng Zhang, Renjie Pi, Qifeng Chen, Philip Torr, Ashkan Khakzar, Fabio Pizzati

Using a custom DPO strategy and this dataset, we train safety experts, in the form of low-rank adaptation (LoRA) matrices, that guide the generation process away from specific safety-related concepts.
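As a rough illustration of what "safety experts as LoRA matrices" can look like in code, the sketch below folds several low-rank adapters into a frozen base weight with per-expert merge scales; the expert names, ranks, and scales are hypothetical and not SafetyDPO's exact procedure.

# Illustrative sketch (not the paper's exact procedure): merging several
# LoRA "safety experts" into one base linear layer. Expert count, rank,
# and merge scales below are hypothetical.
import torch
import torch.nn as nn

class LoRAExpert(nn.Module):
    def __init__(self, d_in, d_out, rank=8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # up-projection

    def delta(self):
        return self.B @ self.A  # low-rank weight update

def merge_safety_experts(base_linear, experts, scales):
    """Fold weighted LoRA deltas into the frozen base weight."""
    with torch.no_grad():
        for expert, s in zip(experts, scales):
            base_linear.weight += s * expert.delta()
    return base_linear

base = nn.Linear(64, 64, bias=False)
experts = [LoRAExpert(64, 64) for _ in range(3)]  # e.g. one expert per unsafe concept group (hypothetical)
merged = merge_safety_experts(base, experts, scales=[0.5, 0.5, 0.5])
print(merged.weight.shape)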

Safety Alignment Text to Image Generation +1

WSI-LLaVA: A Multimodal Large Language Model for Whole Slide Image

no code implementations3 Dec 2024 Yuci Liang, Xinheng Lyu, Meidan Ding, WenTing Chen, Jipeng Zhang, Yuexiang Ren, Xiangjian He, Song Wu, Sen yang, Xiyue Wang, Xiaohan Xing, Linlin Shen

Recent advancements in computational pathology have produced patch-level Multi-modal Large Language Models (MLLMs), but these models are limited by their inability to analyze whole slide images (WSIs) comprehensively and their tendency to bypass crucial morphological features that pathologists rely on for diagnosis.

Diagnostic Language Modeling +6

Fox-1 Technical Report

no code implementations8 Nov 2024 Zijian Hu, Jipeng Zhang, Rui Pan, Zhaozhuo Xu, Shanshan Han, Han Jin, Alay Dilipbhai Shah, Dimitris Stripelis, Yuhang Yao, Salman Avestimehr, Chaoyang He, Tong Zhang

Aiming to improve pre-training efficiency, the Fox-1-1.6B model introduces a novel 3-stage data curriculum across all the training data, with sequence lengths ranging from 2K to 8K.

2k 8k +1

Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs

no code implementations7 Nov 2024 Yide Ran, Zhaozhuo Xu, Yuhang Yao, Zijian Hu, Shanshan Han, Han Jin, Alay Dilipbhai Shah, Jipeng Zhang, Dimitris Stripelis, Tong Zhang, Salman Avestimehr, Chaoyang He

The rapid advancement of Large Language Models (LLMs) has led to their increased integration into mobile devices for personalized assistance, which enables LLMs to call external API functions to enhance their performance.

Bridge-Coder: Unlocking LLMs' Potential to Overcome Language Gaps in Low-Resource Code

no code implementations24 Oct 2024 Jipeng Zhang, Jianshu Zhang, Yuanzhe Li, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, Tong Zhang

The underlying cause of this issue is the gap between natural language and programming language (the NL-PL Gap), which is especially pronounced in LRPLs due to limited aligned data.

General Knowledge In-Context Learning

Personalized Visual Instruction Tuning

1 code implementation9 Oct 2024 Renjie Pi, Jianshu Zhang, Tianyang Han, Jipeng Zhang, Rui Pan, Tong Zhang

In this paper, we introduce Personalized Visual Instruction Tuning (PVIT), a novel data curation and training framework designed to enable MLLMs to identify target individuals within an image and engage in personalized and coherent dialogues.

Image Generation

FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation

1 code implementation22 Aug 2024 Kashun Shum, Minrui Xu, Jianshu Zhang, Zixin Chen, Shizhe Diao, Hanze Dong, Jipeng Zhang, Muhammad Omer Raza

Then we further propose a brand new method named Efficient Trustworthy Distillation (FIRST), which utilizes a small portion of the teacher's knowledge to obtain a reliable language model in a cost-efficient way.
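One plausible reading of "a small portion of the teacher's knowledge" is keeping only the teacher's top-k token probabilities per position; the sketch below implements that idea as a simple distillation loss, which may differ from FIRST's exact formulation.

# Minimal sketch, assuming "a small portion of the teacher's knowledge" means
# keeping only the teacher's top-k token probabilities per position.
# This is an illustrative loss, not necessarily FIRST's exact objective.
import torch
import torch.nn.functional as F

def topk_distill_loss(student_logits, teacher_logits, k=5):
    # student_logits, teacher_logits: (batch, seq, vocab)
    teacher_probs = F.softmax(teacher_logits, dim=-1)
    topk_p, topk_idx = teacher_probs.topk(k, dim=-1)       # keep top-k mass only
    topk_p = topk_p / topk_p.sum(dim=-1, keepdim=True)     # renormalize over the kept tokens
    student_logp = F.log_softmax(student_logits, dim=-1)
    student_topk_logp = student_logp.gather(-1, topk_idx)
    return -(topk_p * student_topk_logp).sum(-1).mean()    # cross-entropy on the top-k slice

loss = topk_distill_loss(torch.randn(2, 4, 100), torch.randn(2, 4, 100))
print(loss.item())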

Language Modeling Language Modelling +1

TensorOpera Router: A Multi-Model Router for Efficient LLM Inference

no code implementations22 Aug 2024 Dimitris Stripelis, Zijian Hu, Jipeng Zhang, Zhaozhuo Xu, Alay Dilipbhai Shah, Han Jin, Yuhang Yao, Salman Avestimehr, Chaoyang He

With the rapid growth of Large Language Models (LLMs) across various domains, numerous new LLMs have emerged, each possessing domain-specific expertise.

TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data

1 code implementation21 Jul 2024 Jipeng Zhang, Yaxuan Qin, Renjie Pi, Weizhong Zhang, Rui Pan, Tong Zhang

Achieving this goal poses non-trivial challenges: 1) data selection requires accurate data representations that reflect the training samples' quality, 2) the diverse nature of instruction datasets must be taken into account, and 3) the coreset selection algorithm must remain efficient for large models.

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

1 code implementation3 Jul 2024 Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

However, due to the scarcity of aligned NL and Formal Language (FL) theorem-proving data, most modern LLMs exhibit suboptimal performance. This scarcity results in a paucity of methodologies for training LLMs and techniques to fully utilize their capabilities in composing formal proofs.

Automated Theorem Proving Code Generation +2

ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting

no code implementations28 Jun 2024 Rui Pan, Dylan Zhang, Hanning Zhang, Xingyuan Pan, Minrui Xu, Jipeng Zhang, Renjie Pi, Xiaoyu Wang, Tong Zhang

Bilevel optimization has shown its utility across various machine learning settings, yet most algorithms in practice require second-order information, making it challenging to scale them up.

Bilevel Optimization Instruction Following +1

Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

1 code implementation11 Jun 2024 Renjie Pi, Jianshu Zhang, Jipeng Zhang, Rui Pan, Zhekai Chen, Tong Zhang

Image description datasets play a crucial role in the advancement of various applications such as image understanding, text-to-image generation, and text-image retrieval.

Hallucination Image Retrieval +2

Process-Driven Autoformalization in Lean 4

2 code implementations4 Jun 2024 Jianqiao Lu, Yingjia Wan, Zhengying Liu, Yinya Huang, Jing Xiong, Chengwu Liu, Jianhao Shen, Hui Jin, Jipeng Zhang, Haiming Wang, Zhicheng Yang, Jing Tang, Zhijiang Guo

Autoformalization, the conversion of natural language mathematics into formal languages, offers significant potential for advancing mathematical reasoning.

Mathematical Reasoning

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

1 code implementation26 Mar 2024 Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang

Attempting to complement this deficiency, we investigate the layerwise properties of LoRA on fine-tuning tasks and observe an unexpected but consistent skewness of weight norms across different layers.
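A minimal sketch of the layerwise-sampling idea, under the assumption that only a small random subset of intermediate layers (plus the embeddings and output head) is unfrozen at any time; the layer counts and sampling interval are illustrative, not LISA's tuned settings.

# Illustrative sketch of layerwise sampling for memory-efficient fine-tuning:
# periodically unfreeze only a random subset of intermediate layers, while the
# embedding and output head stay trainable. Details are assumptions, not LISA's exact recipe.
import random
import torch.nn as nn

def resample_active_layers(model_layers, embed, head, n_active=2):
    for p in embed.parameters():
        p.requires_grad = True
    for p in head.parameters():
        p.requires_grad = True
    active = set(random.sample(range(len(model_layers)), n_active))
    for i, layer in enumerate(model_layers):
        for p in layer.parameters():
            p.requires_grad = (i in active)   # freeze all but the sampled layers
    return active

layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(8)])  # stand-in for transformer blocks
embed, head = nn.Embedding(100, 16), nn.Linear(16, 100)
print(resample_active_layers(layers, embed, head, n_active=2))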

GSM8K Language Modeling +5

Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization

no code implementations13 Mar 2024 Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang

To mitigate this issue, we propose Bootstrapped Preference Optimization (BPO), which conducts preference learning with datasets containing negative responses bootstrapped from the model itself.
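The sketch below illustrates one way such self-bootstrapped negatives can feed a standard DPO-style objective: the ground-truth response is treated as "chosen" and a response sampled from the model itself as "rejected"; how the negatives are generated is an assumption here, not BPO's exact recipe.

# Sketch of preference learning with self-bootstrapped negatives and a
# standard DPO-style loss. Inputs are summed log-probabilities of full
# responses under the policy and a frozen reference model, shape (batch,).
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy numbers: the "rejected" response would be bootstrapped from the model itself.
loss = dpo_loss(torch.tensor([-5.0]), torch.tensor([-4.0]),
                torch.tensor([-5.2]), torch.tensor([-4.1]))
print(loss.item())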

Language Modeling Language Modelling +3

The Instinctive Bias: Spurious Images lead to Illusion in MLLMs

1 code implementation6 Feb 2024 Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang

In this paper, we identify a typical class of inputs that baffles MLLMs, which consist of images that are highly relevant but inconsistent with answers, causing MLLMs to suffer from visual illusion.

Hallucination

PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs

1 code implementation31 Jan 2024 Ying Su, Jipeng Zhang, Yangqiu Song, Tong Zhang

To facilitate the evaluation of pruned subgraphs, we also propose a graph attention network (GAT) based module to reason with the subgraph data.
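As a hedged illustration (assuming PyTorch Geometric is available), the module below encodes a pruned subgraph with two GAT layers and mean-pools node states into a single vector for answer scoring; the layer sizes and pooling choice are assumptions rather than PipeNet's exact design.

# Minimal GAT-based subgraph reasoner: encode nodes of a pruned subgraph and
# pool them into one vector per subgraph. Dimensions are illustrative.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv, global_mean_pool

class SubgraphGAT(nn.Module):
    def __init__(self, in_dim=128, hid_dim=64, heads=4):
        super().__init__()
        self.gat1 = GATConv(in_dim, hid_dim, heads=heads)
        self.gat2 = GATConv(hid_dim * heads, hid_dim, heads=1)
        self.score = nn.Linear(hid_dim, 1)  # answer plausibility score

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.gat1(x, edge_index))
        h = torch.relu(self.gat2(h, edge_index))
        g = global_mean_pool(h, batch)       # one vector per subgraph
        return self.score(g).squeeze(-1)

model = SubgraphGAT()
x = torch.randn(5, 128)                                   # 5 concept nodes
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])   # toy edges
print(model(x, edge_index, torch.zeros(5, dtype=torch.long)))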

Graph Attention Knowledge Graphs +1

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

1 code implementation5 Jan 2024 Renjie Pi, Tianyang Han, Jianshu Zhang, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang

The deployment of multimodal large language models (MLLMs) has brought forth a unique vulnerability: susceptibility to malicious attacks through visual inputs.

Safety Alignment

Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study

no code implementations4 Jan 2024 Ziqiang Zheng, YiWei Chen, Jipeng Zhang, Tuan-Anh Vu, Huimin Zeng, Yue Him Wong Tim, Sai-Kit Yeung

In this study, we carry out a preliminary and comprehensive case study of utilizing GPT-4V for marine analysis.

G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model

3 code implementations18 Dec 2023 Jiahui Gao, Renjie Pi, Jipeng Zhang, Jiacheng Ye, Wanjun Zhong, YuFei Wang, Lanqing Hong, Jianhua Han, Hang Xu, Zhenguo Li, Lingpeng Kong

We first analyze the limitations of current Multimodal Large Language Models (MLLMs) in this area: they struggle to accurately comprehend basic geometric elements and their relationships.

Language Modeling Language Modelling +2

Plum: Prompt Learning using Metaheuristic

1 code implementation14 Nov 2023 Rui Pan, Shuo Xing, Shizhe Diao, Wenhe Sun, Xiang Liu, Kashun Shum, Renjie Pi, Jipeng Zhang, Tong Zhang

Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models.

Image Generation Prompt Learning

PerceptionGPT: Effectively Fusing Visual Perception into LLM

no code implementations CVPR 2024 Renjie Pi, Lewei Yao, Jiahui Gao, Jipeng Zhang, Tong Zhang

In this paper, we present a novel end-to-end framework named PerceptionGPT, which efficiently and effectively equips the VLLMs with visual perception abilities by leveraging the representation power of LLMs' token embedding.

MarineGPT: Unlocking Secrets of Ocean to the Public

1 code implementation20 Oct 2023 Ziqiang Zheng, Jipeng Zhang, Tuan-Anh Vu, Shizhe Diao, Yue Him Wong Tim, Sai-Kit Yeung

Large language models (LLMs), such as ChatGPT/GPT-4, have proven to be powerful tools in promoting the user experience as an AI assistant.

Language Modelling

Non-Autoregressive Sentence Ordering

1 code implementation19 Oct 2023 Yi Bin, Wenhao Shi, Bin Ji, Jipeng Zhang, Yujuan Ding, Yang Yang

Existing sentence ordering approaches generally employ encoder-decoder frameworks with the pointer net to recover the coherence by recurrently predicting each sentence step-by-step.

Decoder Sentence +1

Mitigating the Alignment Tax of RLHF

1 code implementation12 Sep 2023 Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan YAO, Tong Zhang

Building on this analysis and the observation that averaging different layers of the transformer leads to significantly different alignment-forgetting trade-offs, we propose Heterogeneous Model Averaging (HMA), which heterogeneously searches for combination ratios across model layers.
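A minimal sketch of layerwise (heterogeneous) model averaging between a pre-RLHF and a post-RLHF checkpoint, where each parameter group receives its own interpolation ratio; the grouping and ratios below are illustrative assumptions, not HMA's searched values.

# Sketch of layerwise model averaging: each parameter gets its own mixing
# ratio toward the post-RLHF weights. Ratios here are hypothetical.
import torch

def layerwise_average(state_pre, state_post, ratio_fn):
    """ratio_fn maps a parameter name to the mixing weight for the post-RLHF model."""
    merged = {}
    for name, w_pre in state_pre.items():
        r = ratio_fn(name)
        merged[name] = (1 - r) * w_pre + r * state_post[name]
    return merged

def example_ratio(name):
    # Hypothetical schedule: keep early layers closer to the pre-RLHF model.
    return 0.3 if name.startswith("layers.0") or name.startswith("layers.1") else 0.7

pre = {"layers.0.w": torch.ones(2, 2), "layers.5.w": torch.ones(2, 2)}
post = {"layers.0.w": torch.zeros(2, 2), "layers.5.w": torch.zeros(2, 2)}
print(layerwise_average(pre, post, example_ratio))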

Common Sense Reasoning Continual Learning

tdCoxSNN: Time-Dependent Cox Survival Neural Network for Continuous-time Dynamic Prediction

1 code implementation12 Jul 2023 Lang Zeng, Jipeng Zhang, Wei Chen, Ying Ding

In pursuit of constructing a dynamic prediction model for a progressive eye disorder, age-related macular degeneration (AMD), we propose a time-dependent Cox survival neural network (tdCoxSNN) to predict its progression using longitudinal fundus images.

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

1 code implementation21 Jun 2023 Shizhe Diao, Rui Pan, Hanze Dong, Ka Shun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang

As the number of available foundation models and specialized tasks keeps growing, the job of training scientific language models becomes highly nontrivial.

DetGPT: Detect What You Need via Reasoning

1 code implementation23 May 2023 Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang

Overall, our proposed paradigm and DetGPT demonstrate the potential for more sophisticated and intuitive interactions between humans and machines.

Autonomous Driving Object +2

Effective Bilevel Optimization via Minimax Reformulation

no code implementations22 May 2023 Xiaoyu Wang, Rui Pan, Renjie Pi, Jipeng Zhang

To address this issue, we propose a reformulation of bilevel optimization as a minimax problem, effectively decoupling the outer-inner dependency.
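For intuition, one standard route to such a minimax form replaces the inner argmin with a value-function constraint and then dualizes it; this is offered as an illustration and may not be the paper's exact construction:

% Illustrative derivation of a minimax form via a value-function constraint.
\begin{aligned}
&\text{Bilevel:} && \min_{x}\; f\bigl(x, y^{*}(x)\bigr)
  \quad \text{s.t.}\quad y^{*}(x) \in \arg\min_{y} g(x, y) \\
&\text{Constrained form:} && \min_{x, y}\; f(x, y)
  \quad \text{s.t.}\quad g(x, y) \le \min_{z} g(x, z) \\
&\text{Minimax form:} && \min_{x, y}\; \max_{\lambda \ge 0,\, z}\;
  f(x, y) + \lambda \bigl( g(x, y) - g(x, z) \bigr)
\end{aligned}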

Bilevel Optimization Meta-Learning

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

1 code implementation13 Apr 2023 Hanze Dong, Wei Xiong, Deepanshu Goyal, Yihan Zhang, Winnie Chow, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang

Utilizing a reward model and a sufficient number of samples, our approach selects high-quality samples, discards those that exhibit undesired behavior, and subsequently enhances the model by fine-tuning on these filtered samples.
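The sketch below shows one reward-ranked filtering round in this spirit: sample several candidates per prompt, score them with a reward model, keep the best, and return the filtered pairs for supervised fine-tuning; the sampling counts and keep ratio are hypothetical.

# One reward-ranked filtering round (illustrative settings): generate k candidates
# per prompt, score with a reward model, keep the best globally, and return
# (prompt, response) pairs to fine-tune on.
import random

def raft_round(prompts, generate, reward, k=4, keep_ratio=0.25):
    pool = []
    for p in prompts:
        candidates = [generate(p) for _ in range(k)]
        best = max(candidates, key=lambda r: reward(p, r))   # rank within each prompt
        pool.append((p, best, reward(p, best)))
    pool.sort(key=lambda t: t[2], reverse=True)
    kept = pool[: max(1, int(len(pool) * keep_ratio))]       # global filtering
    return [(p, r) for p, r, _ in kept]                      # SFT data for this round

# Toy stand-ins for an LLM sampler and a reward model.
gen = lambda p: p + " -> answer " + str(random.randint(0, 9))
rm = lambda p, r: len(r) + random.random()
print(raft_round(["q1", "q2", "q3", "q4"], gen, rm))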

Ethics

Analogical Math Word Problems Solving with Enhanced Problem-Solution Association

1 code implementation1 Dec 2022 Zhenwen Liang, Jipeng Zhang, Xiangliang Zhang

In this paper, we propose to build a novel MWP solver by leveraging analogical MWPs, which advance the solver's generalization ability across different kinds of MWPs.

Math Question Answering

Generalizing Math Word Problem Solvers via Solution Diversification

1 code implementation1 Dec 2022 Zhenwen Liang, Jipeng Zhang, Lei Wang, Yan Wang, Jie Shao, Xiangliang Zhang

In this paper, we design a new training framework for an MWP solver by introducing a solution buffer and a solution discriminator.

Math

X$^2$-VLM: All-In-One Pre-trained Model For Vision-Language Tasks

2 code implementations22 Nov 2022 Yan Zeng, Xinsong Zhang, Hang Li, Jiawei Wang, Jipeng Zhang, Wangchunshu Zhou

Vision language pre-training aims to learn alignments between vision and language from a large amount of data.

 Ranked #1 on Cross-Modal Retrieval on Flickr30k (using extra training data)

All Cross-Modal Retrieval +8

Execution-based Evaluation for Data Science Code Generation Models

1 code implementation17 Nov 2022 JunJie Huang, Chenglong Wang, Jipeng Zhang, Cong Yan, Haotian Cui, Jeevana Priya Inala, Colin Clement, Nan Duan, Jianfeng Gao

Code generation models can benefit data scientists' productivity by automatically generating code from context and text descriptions.

Code Generation Form +1

Graph-to-Tree Learning for Solving Math Word Problems

1 code implementation ACL 2020 Jipeng Zhang, Lei Wang, Roy Ka-Wei Lee, Yi Bin, Yan Wang, Jie Shao, Ee-Peng Lim

While the recent tree-based neural models have demonstrated promising results in generating solution expression for the math word problem (MWP), most of these models do not capture the relationships and order information among the quantities well.

Decoder Math +1

Template-based math word problem solvers with recursive neural networks

1 code implementation AAAI 2019 Lei Wang, Dongxiang Zhang, Jipeng Zhang, Xing Xu, Lianli Gao, Bing Tian Dai, Heng Tao Shen

Then, we design a recursive neural network that encodes the quantities with a Bi-LSTM and self-attention, and infers the unknown operator nodes in a bottom-up manner.
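A rough sketch in that spirit: a Bi-LSTM over the problem tokens followed by self-attention over quantity positions, with a toy operator classifier on top; the dimensions and classifier head are illustrative assumptions, not the paper's exact architecture.

# Illustrative quantity encoder: Bi-LSTM token states, self-attention among
# quantity positions, and a toy operator head for one pair of quantities.
import torch
import torch.nn as nn

class QuantityEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.bilstm = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.op_head = nn.Linear(2 * dim, 4)  # +, -, *, / for a quantity pair (toy)

    def forward(self, tokens, quantity_pos):
        h, _ = self.bilstm(self.embed(tokens))        # contextual token states
        q = h[:, quantity_pos, :]                      # gather quantity states
        q, _ = self.attn(q, q, q)                      # self-attention among quantities
        pair = torch.cat([q[:, 0], q[:, 1]], dim=-1)   # score an operator for one pair
        return self.op_head(pair)

enc = QuantityEncoder()
tokens = torch.randint(0, 1000, (1, 12))
print(enc(tokens, quantity_pos=[3, 7]).shape)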

Math
