Search Results for author: Ruiqi Zhang

Found 18 papers, 5 papers with code

Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes

no code implementations • 5 Apr 2025 • Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett

We show that after at most $1/\gamma^2$ burn-in steps, GD achieves a risk upper bounded by $\exp(-\Theta(\eta))$, where $\gamma$ is the margin of the dataset.
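The setting above can be pictured with a tiny experiment. The sketch below is not the paper's adaptive stepsize scheme; it just runs full-batch GD on the logistic loss over a linearly separable dataset with a large constant stepsize `eta` (the dataset, dimensions, and stepsize are all made up for illustration) and checks that the empirical risk drops.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = np.sign(X @ w_star)            # labels in {-1, +1}, separable by w_star

def risk(w):
    # empirical logistic risk: mean of log(1 + exp(-y_i <x_i, w>))
    return np.mean(np.logaddexp(0.0, -y * (X @ w)))

w = np.zeros(d)
eta = 2.0                          # a large constant stepsize (illustrative only)
for t in range(200):
    # gradient of the logistic risk: -(1/n) sum_i y_i x_i * sigmoid(-margin_i)
    margins = y * (X @ w)
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / n
    w -= eta * grad
```

On separable data the risk keeps shrinking as `||w||` grows along the max-margin direction, which is the regime the paper's burn-in/convergence analysis refines.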

Fast and Physically-based Neural Explicit Surface for Relightable Human Avatars

no code implementations • 24 Mar 2025 • Jiacheng Wu, Ruiqi Zhang, Jie Chen, Hui Zhang

Efficiently modeling relightable human avatars from sparse-view videos is crucial for AR/VR applications.

Disentanglement

How Do LLMs Perform Two-Hop Reasoning in Context?

no code implementations • 19 Feb 2025 • Tianyu Guo, Hanlin Zhu, Ruiqi Zhang, Jiantao Jiao, Song Mei, Michael I. Jordan, Stuart Russell

We further propose a three-parameter model that connects the causal claims about these mechanisms to the training dynamics of the transformer.

Predicting Organic-Inorganic Halide Perovskite Photovoltaic Performance from Optical Properties of Constituent Films through Machine Learning

no code implementations • 6 Dec 2024 • Ruiqi Zhang, Brandon Motes, Shaun Tan, Yongli Lu, Meng-Chen Shih, Yilun Hao, Karen Yang, Shreyas Srinivasan, Moungi G. Bawendi, Vladimir Bulovic

We demonstrate a machine learning (ML) approach that accurately predicts the current-voltage behavior of 3D/2D-structured (FAMA)Pb(IBr)3/OABr hybrid organic-inorganic halide perovskite (HOIP) solar cells under AM1.5 illumination.

Fast Best-of-N Decoding via Speculative Rejection

1 code implementation • 26 Oct 2024 • Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette

The safe and effective deployment of Large Language Models (LLMs) involves a critical step called alignment, which ensures that the model's responses are in accordance with human preferences.
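For context, the baseline this paper accelerates is plain Best-of-N decoding: sample N candidate responses and return the one a reward model scores highest. The sketch below shows only that baseline; Speculative Rejection (the paper's method) speeds it up by halting low-scoring generations early. The `generate` and `reward` callables here are hypothetical toy stand-ins, not the paper's API.

```python
import random

def best_of_n(prompt, generate, reward, n=8):
    """Plain Best-of-N decoding: draw n candidates, return the highest-reward one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=reward)

# Toy stand-ins: "generation" appends a random suffix; "reward" prefers longer text.
random.seed(0)
gen = lambda p: p + " " + "".join(random.choices("abc", k=random.randint(1, 10)))
best = best_of_n("hello", gen, reward=len, n=8)
```

The cost of the baseline is N full generations per prompt, which is exactly the inefficiency that early rejection targets.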

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

2 code implementations • 9 Oct 2024 • Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu

In this work, we address the problem of large language model (LLM) unlearning, aiming to remove unwanted data influences and associated model capabilities (e.g., copyrighted data or harmful content generation) while preserving essential model utilities, without the need for retraining from scratch.

Language Modeling • Language Modelling +1

Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey

2 code implementations • 19 Aug 2024 • Ruiqi Zhang, Jing Hou, Florian Walter, Shangding Gu, Jiayi Guan, Florian Röhrbein, Yali Du, Panpan Cai, Guang Chen, Alois Knoll

Reinforcement Learning (RL) is a potent tool for sequential decision-making and has achieved performance surpassing human capabilities across many challenging real-world tasks.

Autonomous Driving • Decision Making +6

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

1 code implementation • 8 Apr 2024 • Ruiqi Zhang, Licong Lin, Yu Bai, Song Mei

LLM unlearning aims to eliminate the influence of undesirable data from the pre-trained model while preserving the model's utilities on other tasks.
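For context, the NPO objective penalizes likelihood only on the forget set while keeping gradients bounded. Writing $\pi_\theta$ for the current model and $\pi_{\mathrm{ref}}$ for the pre-trained reference, the loss is, up to notational choices (this is a sketch following common statements of NPO and may differ slightly from the paper's exact notation):

$$\mathcal{L}_{\mathrm{NPO},\beta}(\theta) = \frac{2}{\beta}\,\mathbb{E}_{(x,y)\sim\mathcal{D}_{\mathrm{forget}}}\left[\log\left(1 + \left(\frac{\pi_\theta(y\mid x)}{\pi_{\mathrm{ref}}(y\mid x)}\right)^{\beta}\right)\right],$$

which recovers plain gradient ascent on the forget set in the limit $\beta \to 0$, yet its gradient weight shrinks as $\pi_\theta$ falls below $\pi_{\mathrm{ref}}$, mitigating the catastrophic collapse that unbounded ascent causes.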

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement

no code implementations • 24 Feb 2024 • Ruiqi Zhang, Yuexiang Zhai, Andrea Zanette

Surprisingly, in this work, we demonstrate that even in such a data-starved setting it may still be possible to find a policy competitive with the optimal one.

Decision Making • Multi-Armed Bandits

In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization

no code implementations • 22 Feb 2024 • Ruiqi Zhang, Jingfeng Wu, Peter L. Bartlett

We study the \emph{in-context learning} (ICL) ability of a \emph{Linear Transformer Block} (LTB) that combines a linear attention component and a linear multi-layer perceptron (MLP) component.

In-Context Learning

AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition

no code implementations • 18 Feb 2024 • Zhaorun Chen, Zhuokai Zhao, Zhihong Zhu, Ruiqi Zhang, Xiang Li, Bhiksha Raj, Huaxiu Yao

Recent advancements in large language models (LLMs) have shown promise in multi-step reasoning tasks, yet their reliance on extensive manual labeling to provide procedural feedback remains a significant impediment.

Spreeze: High-Throughput Parallel Reinforcement Learning Framework

no code implementations • 11 Dec 2023 • Jing Hou, Guang Chen, Ruiqi Zhang, Zhijun Li, Shangding Gu, Changjun Jiang

While existing parallel RL frameworks encompass a variety of RL algorithms and parallelization techniques, the excessively burdensome communication frameworks hinder the attainment of the hardware's limit for final throughput and training effects on a single desktop.

reinforcement-learning • Reinforcement Learning +1

Explicifying Neural Implicit Fields for Efficient Dynamic Human Avatar Modeling via a Neural Explicit Surface

no code implementations • 7 Aug 2023 • Ruiqi Zhang, Jie Chen, Qiang Wang

This paper proposes a technique for efficiently modeling dynamic humans by explicifying the implicit neural fields via a Neural Explicit Surface (NES).

Computational Efficiency

Trained Transformers Learn Linear Models In-Context

no code implementations • 16 Jun 2023 • Ruiqi Zhang, Spencer Frei, Peter L. Bartlett

We show that although gradient flow succeeds at finding a global minimum in this setting, the trained transformer is still brittle under mild covariate shifts.

In-Context Learning • regression

NDF: Neural Deformable Fields for Dynamic Human Modelling

1 code implementation • 19 Jul 2022 • Ruiqi Zhang, Jie Chen

However, the learned canonical representation is static and the current design of the deformation fields is not able to represent large movements or detailed geometry changes.

Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory

no code implementations • 10 Feb 2022 • Ruiqi Zhang, Xuezhou Zhang, Chengzhuo Ni, Mengdi Wang

We approach this problem using Z-estimation theory and establish the following results: the FQE estimation error is asymptotically normal, with explicit variance determined jointly by the tangent space of the function class at the ground truth, the reward structure, and the distribution shift due to off-policy learning; the finite-sample FQE error bound is dominated by the same variance term and can also be bounded by a function class-dependent divergence, which measures how the off-policy distribution shift intertwines with the function approximator.
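The estimator being analyzed is fitted Q-evaluation: repeatedly regress a differentiable function class onto Bellman targets built from off-policy data. A minimal sketch follows, using one-hot features (so the linear class is tabular) on a made-up 2-state, 2-action MDP; the MDP, policies, and sample sizes are all illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
nS, nA, gamma = 2, 2, 0.9
R = np.array([[1.0, 0.0],                      # reward r(s, a) (made up)
              [0.0, 1.0]])
P = np.array([[[0.9, 0.1], [0.1, 0.9]],        # P(s' | s=0, a=0..1)
              [[0.5, 0.5], [0.2, 0.8]]])       # P(s' | s=1, a=0..1)
pi = np.array([[1.0, 0.0], [1.0, 0.0]])        # target policy: always action 0

# Offline data from a uniform behavior policy (off-policy w.r.t. pi).
N = 5000
S = rng.integers(nS, size=N)
A = rng.integers(nA, size=N)
Rw = R[S, A]
Sp = np.array([rng.choice(nS, p=P[s, a]) for s, a in zip(S, A)])

def phi(s, a):
    # one-hot feature map, so least squares = tabular regression
    f = np.zeros(nS * nA)
    f[s * nA + a] = 1.0
    return f

X = np.stack([phi(s, a) for s, a in zip(S, A)])
w = np.zeros(nS * nA)
for _ in range(100):
    # FQE iteration: regress onto targets r + gamma * E_{a'~pi}[Q(s', a')]
    Qp = w.reshape(nS, nA)
    targets = Rw + gamma * (pi[Sp] * Qp[Sp]).sum(axis=1)
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)

Q_hat = w.reshape(nS, nA)
```

The paper's contribution is the inference theory around this procedure (asymptotic normality and variance of `Q_hat`-based value estimates), not the iteration itself.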

Off-policy evaluation

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration

no code implementations • 31 Jan 2022 • Chengzhuo Ni, Ruiqi Zhang, Xiang Ji, Xuezhou Zhang, Mengdi Wang

Policy gradient (PG) estimation becomes a challenge when we are not allowed to sample with the target policy but only have access to a dataset generated by some unknown behavior policy.

A paradigm system for strong correlation and charge transfer competition

no code implementations • 4 Mar 2021 • James W Furness, Ruiqi Zhang, Jianwei Sun

In chemistry and condensed matter physics the solution of simple paradigm systems, such as the hydrogen atom and the uniform electron gas, plays a critical role in understanding electron behaviors and developing electronic structure methods.

Chemical Physics
