Search Results for author: Yifang Chen

Found 25 papers, 4 papers with code

Limits of KV Cache Compression for Tensor Attention based Autoregressive Transformers

no code implementations • 14 Mar 2025 • Yifang Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Yu Tian

The key-value (KV) cache in autoregressive transformers presents a significant bottleneck during inference, which restricts the context length capabilities of large language models (LLMs).

Fast Gradient Computation for RoPE Attention in Almost Linear Time

no code implementations • 23 Dec 2024 • Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song

The Rotary Position Embedding (RoPE) mechanism has become a powerful enhancement to the Transformer architecture, which enables models to capture token relationships when encoding positional information.

The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity

no code implementations • 9 Dec 2024 • Yifang Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song

In this paper, we analyze the computational limitations of Mamba and State-space Models (SSMs) by using the circuit complexity framework.

Mamba, State Space Models

Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation

no code implementations • 27 Oct 2024 • Yifang Chen, David Zhu, Simon Du, Kevin Jamieson, Yang Liu

Recent advances in large language model (LLM) training have highlighted the need for diverse, high-quality instruction data.

GSM8K, Language Modeling, +6

I Know About "Up"! Enhancing Spatial Reasoning in Visual Language Models Through 3D Reconstruction

no code implementations • 19 Jul 2024 • Zaiqiao Meng, Hao Zhou, Yifang Chen

Visual Language Models (VLMs) are essential for various tasks, particularly visual reasoning tasks, due to their robust multi-modal information integration, visual reasoning capabilities, and contextual awareness.

3D Reconstruction, Spatial Reasoning, +1

Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning

no code implementations • 2 Jul 2024 • Yifang Chen, Shuohang Wang, ZiYi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin Jamieson, Simon Shaolei Du, Yelong Shen

Reinforcement learning with human feedback (RLHF), as a widely adopted approach in current large language model pipelines, is bottlenecked by the size of human preference data.

Active Learning, Language Modelling, +2

Decoding-Time Language Model Alignment with Multiple Objectives

1 code implementation • 27 Jun 2024 • Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon S. Du

Unlike traditional methods that require careful curation of a mixture of datasets to achieve comprehensive improvement, we can quickly experiment with preference weightings using MOD to find the best combination of models.

Language Modeling, Language Modelling

CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning

2 code implementations • 29 May 2024 • Yiping Wang, Yifang Chen, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du

Three main data selection approaches are: (1) leveraging external non-CLIP models to aid data selection, (2) training new CLIP-style embedding models that are more effective at selecting high-quality data than the original OpenAI CLIP model, and (3) designing better metrics or strategies universally applicable to any CLIP embedding without requiring specific model properties (e.g., CLIPScore is one popular metric).

Contrastive Learning, Language Modelling

Variance Alignment Score: A Simple But Tough-to-Beat Data Selection Method for Multimodal Contrastive Learning

2 code implementations • 3 Feb 2024 • Yiping Wang, Yifang Chen, Wendan Yan, Kevin Jamieson, Simon Shaolei Du

In recent years, data selection has emerged as a core issue for large-scale visual-language model pretraining, especially on noisy web-curated datasets.

Contrastive Learning, Experimental Design, +1

Improved Active Multi-Task Representation Learning via Lasso

no code implementations • 5 Jun 2023 • Yiping Wang, Yifang Chen, Kevin Jamieson, Simon S. Du

In addition to our sample complexity results, we also characterize the potential of our $\nu^1$-based strategy in sample-cost-sensitive settings.

Representation Learning

Causal Bandits: Online Decision-Making in Endogenous Settings

no code implementations • 16 Nov 2022 • Jingwen Zhang, Yifang Chen, Amandeep Singh

To this end, in this paper, we consider the problem of online learning in linear stochastic contextual bandit problems with endogenous covariates.

Decision Making, Multi-Armed Bandits

Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler

no code implementations • 4 Nov 2022 • Yifang Chen, Karthik Sankararaman, Alessandro Lazaric, Matteo Pirotta, Dmytro Karamshuk, Qifan Wang, Karishma Mandyam, Sinong Wang, Han Fang

We design a novel algorithmic template, Weak Labeler Active Cover (WL-AC), that is able to robustly leverage the lower quality weak labelers to reduce the query complexity while retaining the desired level of accuracy.

Active Learning

A Deep Bayesian Bandits Approach for Anticancer Therapy: Exploration via Functional Prior

no code implementations • 5 May 2022 • Mingyu Lu, Yifang Chen, Su-In Lee

Learning personalized cancer treatment with machine learning holds great promise to improve cancer patients' chance of survival.

BIG-bench Machine Learning, Drug Response Prediction

Active Multi-Task Representation Learning

no code implementations • 2 Feb 2022 • Yifang Chen, Simon S. Du, Kevin Jamieson

To leverage the power of big data from source tasks and overcome the scarcity of the target task samples, representation learning based on multi-task pretraining has become a standard approach in many applications.

Active Learning, Multi-Task Learning, +1

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

no code implementations • 26 Jan 2022 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

We first develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP with sample complexity scaling as $\widetilde{\mathcal{O}}(d^2 H^5/\epsilon^2)$.

Reinforcement Learning (RL)

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

no code implementations • 7 Dec 2021 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some measure of the performance of the optimal policy on a given instance -- is a core question in sequential decision-making.

Decision Making, reinforcement-learning, +3

Corruption Robust Active Learning

no code implementations • NeurIPS 2021 • Yifang Chen, Simon S. Du, Kevin Jamieson

We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions.

Active Learning, Binary Classification

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

no code implementations • 13 Feb 2021 • Yifang Chen, Simon S. Du, Kevin Jamieson

We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system.

reinforcement-learning, Reinforcement Learning, +1

More Practical and Adaptive Algorithms for Online Quantum State Learning

no code implementations • 1 Jun 2020 • Yifang Chen, Xin Wang

This regret bound depends only on the maximum rank $M$ of measurements rather than the number of qubits, which takes advantage of low-rank measurements.

Fair Contextual Multi-Armed Bandits: Theory and Experiments

no code implementations • 13 Dec 2019 • Yifang Chen, Alex Cuellar, Haipeng Luo, Jignesh Modi, Heramb Nemlekar, Stefanos Nikolaidis

We introduce a Multi-Armed Bandit algorithm with fairness constraints, where fairness is defined as a minimum rate that a task or a resource is assigned to a user.

Decision Making, Fairness, +1

Multi-Armed Bandits with Fairness Constraints for Distributing Resources to Human Teammates

no code implementations • 30 Jun 2019 • Houston Claure, Yifang Chen, Jignesh Modi, Malte Jung, Stefanos Nikolaidis

How should a robot that collaborates with multiple people decide upon the distribution of resources (e.g., social attention, or parts needed for an assembly)?

Fairness, Multi-Armed Bandits, +1

A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free

no code implementations • 3 Feb 2019 • Yifang Chen, Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei

We propose the first contextual bandit algorithm that is parameter-free, efficient, and optimal in terms of dynamic regret.

Multi-Armed Bandits
