Search Results for author: Kaixuan Ji

Found 10 papers, 3 papers with code

P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks

no code implementations • ACL 2022 • Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling

Paper
Add Code

Self-Play Preference Optimization for Language Model Alignment

no code implementations • 1 May 2024 • Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu

Traditional reinforcement learning from human feedback (RLHF) approaches relying on parametric models like the Bradley-Terry model fall short in capturing the intransitivity and irrationality in human preferences.

Language Modelling

Paper
Add Code

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

no code implementations • 15 Feb 2024 • Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu

Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning Large Language Models (LLMs).

Reinforcement Learning (RL) Text-to-Image Generation

Paper
Add Code

Reinforcement Learning from Human Feedback with Active Queries

no code implementations • 14 Feb 2024 • Kaixuan Ji, Jiafan He, Quanquan Gu

Aligning large language models (LLM) with human preference plays a key role in building modern generative models and can be achieved by reinforcement learning from human feedback (RLHF).

Active Learning reinforcement-learning

Paper
Add Code

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

2 code implementations • 2 Jan 2024 • Zixiang Chen, Yihe Deng, Huizhuo Yuan, Kaixuan Ji, Quanquan Gu

In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data.

1,254

Paper
Code

Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment

no code implementations • 16 Oct 2023 • Ji Qi, Kaixuan Ji, Xiaozhi Wang, Jifan Yu, Kaisheng Zeng, Lei Hou, Juanzi Li, Bin Xu

Open Information Extraction (OIE) aims to extract objective structured knowledge from natural texts, which has attracted growing attention to build dedicated models with human experience.

In-Context Learning Open Information Extraction

Paper
Add Code

VidCoM: Fast Video Comprehension through Large Language Models with Multimodal Tools

no code implementations • 16 Oct 2023 • Ji Qi, Kaixuan Ji, Jifan Yu, Duokang Wang, Bin Xu, Lei Hou, Juanzi Li

Building models that comprehends videos and responds specific user instructions is a practical and challenging topic, as it requires mastery of both vision understanding and knowledge reasoning.

Caption Generation Descriptive +3

Paper
Add Code

Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs

no code implementations • 15 May 2023 • Kaixuan Ji, Qingyue Zhao, Jiafan He, Weitong Zhang, Quanquan Gu

Recent studies have shown that episodic reinforcement learning (RL) is no harder than bandits when the total reward is bounded by $1$, and proved regret bounds that have a polylogarithmic dependence on the planning horizon $H$.

Open-Ended Question Answering reinforcement-learning +1

Paper
Add Code

Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers

2 code implementations • 14 Jul 2022 • Weng Lam Tam, Xiao Liu, Kaixuan Ji, Lilong Xue, Xingjian Zhang, Yuxiao Dong, Jiahua Liu, Maodi Hu, Jie Tang

By updating only 0. 1% of the model parameters, the prompt tuning strategy can help retrieval models achieve better generalization performance than traditional methods in which all parameters are updated.

Retrieval Text Retrieval +1

1,910

Paper
Code

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

4 code implementations • 14 Oct 2021 • Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling

2,528

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.