no code implementations • 14 Mar 2025 • Yifang Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Yu Tian
The key-value (KV) cache in autoregressive transformers presents a significant bottleneck during inference, which restricts the context length capabilities of large language models (LLMs).
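For intuition on why the cache becomes the bottleneck, here is a back-of-the-envelope sketch in Python; the model dimensions below are illustrative assumptions (roughly 7B scale), not figures from the paper:

```python
# Rough KV-cache footprint: 2 tensors (K and V) per layer, each of
# shape (kv_heads, seq_len, head_dim), stored per batch element.
def kv_cache_bytes(layers=32, kv_heads=32, head_dim=128,
                   seq_len=128_000, batch=1, dtype_bytes=2):
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

# ~67 GB at a 128k-token context for this illustrative config, which
# is why long-context inference motivates cache compression/eviction.
print(f"{kv_cache_bytes() / 1e9:.1f} GB")
```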
no code implementations • 23 Dec 2024 • Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
The Rotary Position Embedding (RoPE) mechanism has become a powerful enhancement to the Transformer architecture, enabling models to capture token relationships while encoding positional information.
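For reference, a minimal NumPy sketch of standard RoPE; the base of 10000 and the interleaved-pair convention are the common defaults, not specifics of this paper:

```python
import numpy as np

def rope(x, base=10000.0):
    """Rotate channel pairs of x (shape: seq_len x dim) by a
    position-dependent angle; dot products of rotated vectors then
    depend only on relative position."""
    seq_len, dim = x.shape
    inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = np.outer(np.arange(seq_len), inv_freq)         # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```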
no code implementations • 9 Dec 2024 • Yifang Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
In this paper, we analyze the computational limitations of Mamba and State-space Models (SSMs) by using the circuit complexity framework.
no code implementations • 27 Oct 2024 • Yifang Chen, David Zhu, Simon Du, Kevin Jamieson, Yang Liu
Recent advances in large language model (LLM) training have highlighted the need for diverse, high-quality instruction data.
no code implementations • 19 Jul 2024 • Zaiqiao Meng, Hao Zhou, Yifang Chen
Visual Language Models (VLMs) are essential for a range of tasks, particularly visual reasoning, thanks to their robust multi-modal information integration and contextual awareness.
no code implementations • 2 Jul 2024 • Yifang Chen, Shuohang Wang, ZiYi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin Jamieson, Simon Shaolei Du, Yelong Shen
Reinforcement learning with human feedback (RLHF), as a widely adopted approach in current large language model pipelines, is bottlenecked by the size of human preference data.
1 code implementation • 27 Jun 2024 • Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon S. Du
Unlike traditional methods that require careful curation of a mixture of datasets to achieve comprehensive improvement, we can quickly experiment with preference weightings using MOD to find the best combination of models.
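To picture what experimenting with preference weightings looks like at decoding time, here is a generic sketch of mixing per-model next-token log-probabilities as a weighted geometric mean; this illustrates the idea of weighted model combination, not necessarily MOD's exact closed form:

```python
import numpy as np

def mix_next_token_logprobs(logprobs_per_model, weights):
    """Combine next-token log-probs from several models with preference
    weights (a weighted geometric mean of distributions), renormalized.

    logprobs_per_model: (num_models, vocab_size)
    weights: (num_models,), non-negative, summing to 1
    """
    mixed = np.tensordot(weights, logprobs_per_model, axes=1)  # (vocab_size,)
    mixed -= mixed.max()                      # for numerical stability
    mixed -= np.log(np.exp(mixed).sum())      # renormalize in log space
    return mixed
```

Sweeping `weights` then trades off objectives (e.g., helpfulness vs. safety) without retraining or re-curating any dataset.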
2 code implementations • 29 May 2024 • Yiping Wang, Yifang Chen, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du
Three main data selection approaches are: (1) leveraging external non-CLIP models to aid data selection, (2) training new CLIP-style embedding models that are more effective at selecting high-quality data than the original OpenAI CLIP model, and (3) designing better metrics or strategies universally applicable to any CLIP embedding without requiring specific model properties (e.g., CLIPScore is one popular metric).
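As a concrete reference for approach (3), CLIPScore in its standard form (Hessel et al., 2021) is just a rescaled, clipped cosine similarity between CLIP image and text embeddings; a minimal sketch:

```python
import numpy as np

def clip_score(image_emb, text_emb, w=2.5):
    """CLIPScore (Hessel et al., 2021): w * max(cos(image, text), 0)."""
    cos = np.dot(image_emb, text_emb) / (
        np.linalg.norm(image_emb) * np.linalg.norm(text_emb))
    return w * max(cos, 0.0)

# In web-scale filtering, an image-text pair is typically kept only if
# its score clears a chosen threshold (the threshold is a tuning knob).
```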
2 code implementations • 3 Feb 2024 • Yiping Wang, Yifang Chen, Wendan Yan, Kevin Jamieson, Simon Shaolei Du
In recent years, data selection has emerged as a core issue for large-scale visual-language model pretraining, especially on noisy web-curated datasets.
no code implementations • 12 Jan 2024 • Gantavya Bhatt, Yifang Chen, Arnav M. Das, Jifan Zhang, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeffrey Bilmes, Simon S. Du, Kevin Jamieson, Jordan T. Ash, Robert D. Nowak
To mitigate the annotation cost of SFT and circumvent the computational bottlenecks of active learning, we propose using experimental design.
1 code implementation • 16 Jun 2023 • Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Arnav M. Das, Gantavya Bhatt, Yinglun Zhu, Jeffrey Bilmes, Simon Shaolei Du, Kevin Jamieson, Robert D Nowak
Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive.
no code implementations • 5 Jun 2023 • Yiping Wang, Yifang Chen, Kevin Jamieson, Simon S. Du
In addition to our sample complexity results, we also characterize the potential of our $\nu^1$-based strategy in sample-cost-sensitive settings.
no code implementations • 16 Nov 2022 • Jingwen Zhang, Yifang Chen, Amandeep Singh
In this paper, we consider the problem of online learning in linear stochastic contextual bandit problems with endogenous covariates.
no code implementations • 4 Nov 2022 • Yifang Chen, Karthik Sankararaman, Alessandro Lazaric, Matteo Pirotta, Dmytro Karamshuk, Qifan Wang, Karishma Mandyam, Sinong Wang, Han Fang
We design a novel algorithmic template, Weak Labeler Active Cover (WL-AC), that robustly leverages lower-quality weak labelers to reduce query complexity while retaining the desired level of accuracy.
no code implementations • 5 May 2022 • Mingyu Lu, Yifang Chen, Su-In Lee
Learning personalized cancer treatment with machine learning holds great promise for improving cancer patients' chances of survival.
no code implementations • 2 Feb 2022 • Yifang Chen, Simon S. Du, Kevin Jamieson
To leverage the power of big data from source tasks and overcome the scarcity of the target task samples, representation learning based on multi-task pretraining has become a standard approach in many applications.
no code implementations • 26 Jan 2022 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson
We first develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP with sample complexity scaling as $\widetilde{\mathcal{O}}(d^2 H^5/\epsilon^2)$.
no code implementations • 7 Dec 2021 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson
Obtaining first-order regret bounds (regret bounds that scale not with the worst case but with some measure of the optimal policy's performance on a given instance) is a core question in sequential decision-making.
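Schematically, with constants and dimension dependence suppressed (a generic illustration of the contrast, not the paper's exact bound):

```latex
% Worst-case vs. first-order regret over K episodes of horizon H,
% in a cost-minimization formulation where V* <= H is the optimal
% policy's expected cost (schematic only).
\[
  \mathrm{Regret}(K) \;\lesssim\; \sqrt{H^{2}K} \quad \text{(worst-case)}
  \qquad \text{vs.} \qquad
  \mathrm{Regret}(K) \;\lesssim\; \sqrt{V^{\star} H K} \quad \text{(first-order)},
\]
% so when the optimal policy performs well (V* << H), the first-order
% bound is substantially tighter, and it is never worse.
```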
no code implementations • NeurIPS 2021 • Yifang Chen, Simon S. Du, Kevin Jamieson
We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions.
no code implementations • 13 Feb 2021 • Yifang Chen, Simon S. Du, Kevin Jamieson
We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system.
no code implementations • 1 Jun 2020 • Yifang Chen, Xin Wang
This regret bound depends only on the maximum rank $M$ of the measurements rather than on the number of qubits, thereby exploiting the low-rank structure of the measurements.
no code implementations • 13 Dec 2019 • Yifang Chen, Alex Cuellar, Haipeng Luo, Jignesh Modi, Heramb Nemlekar, Stefanos Nikolaidis
We introduce a Multi-Armed Bandit algorithm with fairness constraints, where fairness is defined as a minimum rate that a task or a resource is assigned to a user.
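A minimal sketch of how such a constraint can be enforced, using a generic UCB variant with a minimum per-arm assignment rate; this illustrates the constraint, not the paper's actual algorithm or guarantees:

```python
import numpy as np

def fair_ucb(pull, n_arms, horizon, min_rate):
    """UCB with a fairness floor: each arm is assigned at a rate of at
    least `min_rate` (assumes n_arms * min_rate < 1).

    `pull(arm)` returns an observed reward in [0, 1]."""
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for t in range(1, horizon + 1):
        behind = np.where(counts < min_rate * t)[0]
        if behind.size > 0:
            arm = int(behind[0])           # enforce the fairness floor
        else:
            n = np.maximum(counts, 1)
            ucb = sums / n + np.sqrt(2 * np.log(t) / n)
            ucb[counts == 0] = np.inf      # play each arm once first
            arm = int(np.argmax(ucb))
        sums[arm] += pull(arm)
        counts[arm] += 1
    return counts / horizon                # empirical assignment rates
```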
no code implementations • 3 Dec 2019 • Abhishek Roy, Yifang Chen, Krishnakumar Balasubramanian, Prasant Mohapatra
We establish sub-linear regret bounds on the proposed notions of regret in both the online and bandit setting.
no code implementations • 30 Jun 2019 • Houston Claure, Yifang Chen, Jignesh Modi, Malte Jung, Stefanos Nikolaidis
How should a robot that collaborates with multiple people decide upon the distribution of resources (e.g., social attention, or parts needed for an assembly)?
no code implementations • 3 Feb 2019 • Yifang Chen, Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei
We propose the first contextual bandit algorithm that is parameter-free, efficient, and optimal in terms of dynamic regret.