Search Results for author: Zefan Cai

Found 22 papers, 15 papers with code

No Preference Left Behind: Group Distributional Preference Optimization

1 code implementation • 28 Dec 2024 • Binwei Yao, Zefan Cai, Yun-Shiuan Chuang, Shanglin Yang, Ming Jiang, Diyi Yang, Junjie Hu

To address this issue, we propose Group Distributional Preference Optimization (GDPO), a novel framework that aligns language models with the distribution of preferences within a group by incorporating the concept of beliefs that shape individual preferences.

Diversity, Language Modeling +1
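The distributional framing above can be made concrete with a small sketch. This is not the GDPO objective itself; the function names and the KL-based comparison are assumptions chosen only to illustrate how a group's empirical preference distribution can be compared against a model's predicted one:

```python
import math

def group_preference_distribution(votes, options):
    """Empirical distribution of a group's preference votes over options."""
    counts = {o: 0 for o in options}
    for v in votes:
        counts[v] += 1
    return {o: counts[o] / len(votes) for o in options}

def kl_to_group(model_probs, group_probs, eps=1e-9):
    """KL(group || model): large when the model ignores minority preferences."""
    return sum(p * math.log((p + eps) / (model_probs[o] + eps))
               for o, p in group_probs.items() if p > 0)

group = group_preference_distribution(["a", "a", "b", "c"], ["a", "b", "c"])
# A model that collapses onto the majority preference pays a high penalty:
majority_only = {"a": 0.98, "b": 0.01, "c": 0.01}
```

A distribution-matching loss of this general kind penalizes exactly the collapse onto the majority preference that the paper's title alludes to.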

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

1 code implementation • 16 Dec 2024 • Liang Chen, Zekun Wang, Shuhuai Ren, Lei Li, Haozhe Zhao, Yunshui Li, Zefan Cai, Hongcheng Guo, Lei Zhang, Yizhe Xiong, Yichi Zhang, Ruoyu Wu, Qingxiu Dong, Ge Zhang, Jian Yang, Lingwei Meng, Shujie Hu, Yulong Chen, Junyang Lin, Shuai Bai, Andreas Vlachos, Xu Tan, Minjia Zhang, Wen Xiao, Aaron Yee, Tianyu Liu, Baobao Chang

As Large Language Models (LLMs) have advanced to unify understanding and generation tasks within the textual modality, recent research has shown that tasks from different modalities can also be effectively encapsulated within the next-token prediction (NTP) framework, transforming multimodal information into tokens and predicting the next one given the context.

Language Modeling, Language Modelling +1
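The token-unification idea can be sketched in a few lines. The vocabulary sizes and token layout below are invented for illustration (a shared token space where visual codebook indices are offset past the text vocabulary), not the scheme of any particular model surveyed:

```python
# Toy illustration of the next-token-prediction (NTP) view of multimodal
# modeling: every modality is first mapped to discrete tokens, after which
# one autoregressive objective applies to the whole stream.

TEXT_VOCAB = 32_000       # assumed text vocabulary size
IMAGE_CODEBOOK = 8_192    # assumed visual codebook size (e.g. from a VQ encoder)

def image_token(code_index):
    """Map a visual codebook index into the shared token space,
    placed after the text vocabulary."""
    assert 0 <= code_index < IMAGE_CODEBOOK
    return TEXT_VOCAB + code_index

def build_sequence(text_ids, image_codes):
    """Interleave text tokens and image tokens into one NTP stream."""
    return list(text_ids) + [image_token(c) for c in image_codes]

seq = build_sequence([5, 17, 42], [0, 8191])
# Next-token-prediction training pairs: predict seq[t+1] from seq[:t+1].
pairs = [(seq[:t + 1], seq[t + 1]) for t in range(len(seq) - 1)]
```

Once every modality lives in one token stream, a single autoregressive loss over `pairs` trains the model end to end.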

Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning

1 code implementation • 25 Oct 2024 • Yu Fu, Zefan Cai, Abedelkadir Asi, Wayne Xiong, Yue Dong, Wen Xiao

Key-Value (KV) caching is a common technique to enhance the computational efficiency of Large Language Models (LLMs), but its memory overhead grows rapidly with input length.

Computational Efficiency, Question Answering +1
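The linear growth of that memory overhead is easy to see with back-of-the-envelope arithmetic; the model dimensions below are assumed for illustration, not taken from the paper:

```python
# Back-of-the-envelope KV cache memory for a decoder-only transformer.
# Per token, each layer stores one key and one value vector per head,
# so the total grows linearly with sequence length.

def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_elem=2):
    """Memory for keys + values across all layers for one sequence (fp16)."""
    per_token = 2 * n_layers * n_heads * head_dim * bytes_per_elem  # K and V
    return seq_len * per_token

# A 7B-class model (assumed dims) at 4k vs 128k context:
short = kv_cache_bytes(4_096)
long = kv_cache_bytes(131_072)
print(f"{short / 2**30:.1f} GiB vs {long / 2**30:.1f} GiB")
# → 2.0 GiB vs 64.0 GiB
```

At 128k tokens the cache alone dwarfs the weights of a small model, which is the pressure that head-level compression methods like the one above aim to relieve.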

COMMA: A Communicative Multimodal Multi-Agent Benchmark

no code implementations • 10 Oct 2024 • Timothy Ossowski, Jixuan Chen, Danyal Maqbool, Zefan Cai, Tyler Bradshaw, Junjie Hu

Existing agentic benchmarks fail to address key aspects of inter-agent communication and collaboration, particularly in scenarios where agents have unequal access to information and must work together to achieve tasks beyond the scope of individual capabilities.

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models

2 code implementations • 10 Oct 2024 • Bofei Gao, Feifan Song, Zhe Yang, Zefan Cai, Yibo Miao, Qingxiu Dong, Lei Li, Chenghao Ma, Liang Chen, Runxin Xu, Zhengyang Tang, Benyou Wang, Daoguang Zan, Shanghaoran Quan, Ge Zhang, Lei Sha, Yichang Zhang, Xuancheng Ren, Tianyu Liu, Baobao Chang

However, existing benchmarks like GSM8K or MATH are now being solved with high accuracy (e.g., OpenAI o1 achieves 94.8% on the MATH dataset), indicating their inadequacy for truly challenging these models.

GSM8K, Math +1

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Fine-Grained Image Generation

1 code implementation • 2 Oct 2024 • Liang Chen, Sinan Tan, Zefan Cai, Weichu Xie, Haozhe Zhao, Yichi Zhang, Junyang Lin, Jinze Bai, Tianyu Liu, Baobao Chang

This work tackles the information loss bottleneck of vector-quantization (VQ) autoregressive image generation by introducing a novel model architecture called the 2-Dimensional Autoregression (DnD) Transformer.

Image Generation, Quantization

LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

1 code implementation • 20 Jun 2024 • Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang

In recent work, mathematical verifiers have achieved success in mathematical reasoning tasks by validating the correctness of solutions generated by policy models.

Binary Classification, GSM8K +2

PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

2 code implementations • 4 Jun 2024 • Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao

Our experimental evaluations, utilizing the LongBench benchmark, show that PyramidKV matches the performance of models with a full KV cache while retaining only 12% of the KV cache, thus significantly reducing memory usage.
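A minimal sketch of the pyramidal idea, assuming a linear per-layer budget schedule and attention-score-based token selection; the paper's actual allocation rule and scoring differ in detail:

```python
import numpy as np

def pyramidal_budgets(total_budget, n_layers, min_keep=8):
    """Split a total KV-cache token budget across layers so that lower
    layers keep more tokens than higher ones (a linear 'pyramid').

    Illustrative schedule only, not PyramidKV's exact allocation rule.
    """
    # Linearly decreasing weights from layer 0 (widest) to the top layer.
    weights = np.arange(n_layers, 0, -1, dtype=float)
    budgets = np.maximum(min_keep, (weights / weights.sum()) * total_budget)
    return budgets.astype(int)

def compress_layer(keys, values, attn_scores, budget):
    """Keep only the `budget` tokens with the highest accumulated attention."""
    keep = np.argsort(attn_scores)[-budget:]
    keep.sort()  # preserve original sequence order
    return keys[keep], values[keep]

budgets = pyramidal_budgets(total_budget=2048, n_layers=32)
```

The schedule mirrors the "information funneling" intuition: early layers attend broadly and get a wide budget, while deep layers concentrate on a few tokens and can be cached aggressively.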

Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation

1 code implementation • 12 Apr 2024 • Haozhe Zhao, Zefan Cai, Shuzheng Si, Liang Chen, Yufeng He, Kaikai An, Baobao Chang

Therefore, we introduce ALSACE to leverage the learned knowledge from the well-performing languages to guide under-performing ones within the same mPLM, eliminating the need for additional labeled multilingual data.

Improving Event Definition Following For Zero-Shot Event Detection

no code implementations • 5 Mar 2024 • Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng

We hypothesize that a diverse set of event types and definitions is key for models to learn to follow event definitions, while existing event extraction datasets focus on annotating many high-quality examples for a few event types.

Event Detection, Event Extraction

PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain

1 code implementation • 21 Feb 2024 • Liang Chen, Yichi Zhang, Shuhuai Ren, Haozhe Zhao, Zefan Cai, Yuchi Wang, Peiyi Wang, Xiangdi Meng, Tianyu Liu, Baobao Chang

To address this, we introduce Embodied-Instruction-Evolution (EIE), an automatic framework for synthesizing instruction tuning examples in multimodal embodied environments.

Autonomous Driving, Decision Making

ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code

1 code implementation • 16 Nov 2023 • Xiangru Tang, Yuliang Liu, Zefan Cai, Yanjun Shao, Junjie Lu, Yichi Zhang, Zexuan Deng, Helan Hu, Kaikai An, Ruijun Huang, Shuzheng Si, Sheng Chen, Haozhe Zhao, Liang Chen, Yan Wang, Tianyu Liu, Zhiwei Jiang, Baobao Chang, Yin Fang, Yujia Qin, Wangchunshu Zhou, Yilun Zhao, Arman Cohan, Mark Gerstein

Despite Large Language Models (LLMs) like GPT-4 achieving impressive results in function-level code generation, they struggle with repository-scale code understanding (e.g., coming up with the right arguments for calling routines), requiring a deeper comprehension of complex file interactions.

Code Generation, Navigate +1

MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning

2 code implementations • 14 Sep 2023 • Haozhe Zhao, Zefan Cai, Shuzheng Si, Xiaojian Ma, Kaikai An, Liang Chen, Zixuan Liu, Sheng Wang, Wenjuan Han, Baobao Chang

In this paper, we address the limitation above by 1) introducing a vision-language model with Multi-Modal In-Context Learning (MMICL), a new approach to allow the VLM to deal with multi-modal inputs efficiently; 2) proposing a novel context scheme to augment the in-context learning ability of the VLM; and 3) constructing the Multi-modal In-Context Learning (MIC) dataset, designed to enhance the VLM's ability to understand complex multi-modal prompts.

Hallucination, In-Context Learning +4

Human-in-the-Loop through Chain-of-Thought

no code implementations • 10 Jun 2023 • Zefan Cai, Baobao Chang, Wenjuan Han

While the emergence of powerful language models along with Chain-of-thought prompting has made automation more and more omnipresent, it sometimes demonstrates its weakness in long-term or multi-step logical reasoning.

Logical Reasoning

Large Language Models are not Fair Evaluators

1 code implementation • 29 May 2023 • Peiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Qi Liu, Tianyu Liu, Zhifang Sui

In this paper, we uncover a systematic bias in the evaluation paradigm of adopting large language models (LLMs), e.g., GPT-4, as a referee to score and compare the quality of responses generated by candidate models.

Language Modelling, Large Language Model +1
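One common remedy for the positional bias described above is to judge each pair in both presentation orders and average the scores; `judge` below is a stand-in for any LLM scoring call, and its name and signature are assumptions of this sketch rather than the paper's exact calibration procedure:

```python
def balanced_compare(judge, response_a, response_b):
    """Score (a, b) and (b, a), then average per response so that any
    preference tied to position rather than quality cancels out.

    `judge(first, second)` is assumed to return a (score_first,
    score_second) pair, e.g. from an LLM prompted with both responses.
    """
    score_a1, score_b1 = judge(response_a, response_b)  # a shown first
    score_b2, score_a2 = judge(response_b, response_a)  # b shown first
    return (score_a1 + score_a2) / 2, (score_b1 + score_b2) / 2
```

With a judge that gives a fixed bonus to whichever response appears first, equally good responses tie again after averaging, while genuinely better responses still win.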

DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade

no code implementations • 24 May 2023 • Zefan Cai, Xin Zheng, Tianyu Liu, Xu Wang, Haoran Meng, Jiaqi Han, Gang Yuan, Binghuai Lin, Baobao Chang, Yunbo Cao

With each update of a production dialogue system, we need to retrain the natural language understanding (NLU) model as new data from real users is merged into the data accumulated in previous updates.

Intent Detection, Multi-Label Classification +2

DiffCap: Exploring Continuous Diffusion on Image Captioning

no code implementations • 20 May 2023 • Yufeng He, Zefan Cai, Xu Gan, Baobao Chang

Our method transforms discrete tokens in a natural way and applies continuous diffusion on them to successfully fuse extracted image features for diffusion caption generation.

Caption Generation, Diversity +3
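The continuous forward (noising) process the abstract refers to has a standard closed form; the sketch below is the generic DDPM corruption step applied to token embeddings, with shapes and noise schedule as placeholder assumptions rather than DiffCap's exact parameterization:

```python
import numpy as np

def forward_diffuse(x0, t, betas, noise=None):
    """Corrupt continuous token embeddings x0 to diffusion step t using the
    closed form q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I),
    where abar_t is the cumulative product of (1 - beta) up to step t."""
    alpha_bar = np.prod(1.0 - np.asarray(betas)[: t + 1])
    if noise is None:
        noise = np.random.randn(*x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
```

A caption model is then trained to reverse this corruption, denoising the embeddings back toward valid token representations before decoding them to text.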
