no code implementations • 18 Oct 2024 • Alberto Del Pia, Dekun Zhou, Yinglun Zhu
Our framework, when integrated with this algorithm, reduces the runtime to $\mathcal{O}\left(\frac{d}{d^\star} \cdot g(k, d^\star) + d^2\right)$, where $d^\star \leq d$ is the largest block size of the block-diagonal matrix.
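A minimal sketch of the block-decomposition idea (the base solver `solve_block` and its interface are assumptions for illustration): recover the blocks as connected components of the matrix's support graph in $\mathcal{O}(d^2)$ time, then run the base solver on each block of size at most $d^\star$. This applies to problems, like sparse PCA, whose optimum over a block-diagonal matrix is attained within a single block.

```python
# Hedged sketch of the block-decomposition framework; `solve_block`
# stands in for any base solver whose runtime g(k, d_star) depends
# only on the size of the block it is given.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def solve_blockwise(A, k, solve_block):
    """A: (d, d) symmetric matrix with hidden block structure.
    k: target sparsity. solve_block(block, k) -> (value, vector)."""
    # Recover the blocks as connected components of the support graph
    # (the O(d^2) scan matches the d^2 term in the stated runtime).
    support = csr_matrix(np.abs(A) > 1e-12)
    n_blocks, labels = connected_components(support, directed=False)
    best_val, best_vec = -np.inf, None
    for b in range(n_blocks):
        idx = np.flatnonzero(labels == b)
        val, vec = solve_block(A[np.ix_(idx, idx)], k)  # cost g(k, |idx|)
        if val > best_val:
            best_val = val
            best_vec = np.zeros(A.shape[0])
            best_vec[idx] = vec
    return best_val, best_vec
```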
no code implementations • 17 Jun 2024 • Dingyang Chen, Qi Zhang, Yinglun Zhu
In this paper, we propose a new approach that leverages online model selection algorithms to efficiently incorporate LLM agents into sequential decision making.
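A minimal sketch of the model-selection layer, assuming a list of candidate agents and a hypothetical `env.run_episode` interface that returns rewards in [0, 1] (neither is the paper's actual API): a UCB-style bandit allocates episodes across the candidates and concentrates on the best one online.

```python
# Illustrative online model selection over candidate agents via UCB1;
# the agent list and environment interface are assumptions.
import math

def ucb_select(agents, env, horizon):
    counts = [0] * len(agents)
    means = [0.0] * len(agents)
    for t in range(1, horizon + 1):
        if t <= len(agents):                  # play each candidate once first
            i = t - 1
        else:                                 # then pick by optimistic index
            i = max(range(len(agents)),
                    key=lambda j: means[j] + math.sqrt(2 * math.log(t) / counts[j]))
        reward = env.run_episode(agents[i])   # assumed API; reward in [0, 1]
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]
    return agents[max(range(len(agents)), key=lambda j: means[j])]
```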
no code implementations • 12 Jan 2024 • Gantavya Bhatt, Yifang Chen, Arnav M. Das, Jifan Zhang, Sang T. Truong, Stephen Mussmann, Yinglun Zhu, Jeffrey Bilmes, Simon S. Du, Kevin Jamieson, Jordan T. Ash, Robert D. Nowak
To mitigate the annotation cost of SFT and circumvent the computational bottlenecks of active learning, we propose using experimental design.
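As a hedged illustration, greedy facility location is one common experimental-design surrogate for selecting which prompts to annotate (the paper studies several designs; this particular objective and the normalized-embedding assumption are illustrative, not necessarily theirs):

```python
# Greedy maximization of the facility-location coverage objective
# f(S) = sum_i max_{j in S} <x_i, x_j> over L2-normalized embeddings.
import numpy as np

def greedy_facility_location(X, budget):
    """X: (n, d) L2-normalized embeddings. Returns `budget` indices."""
    sims = X @ X.T
    covered = np.full(X.shape[0], -1.0)   # cosine similarity lower bound
    selected = []
    for _ in range(budget):
        # Objective value if each candidate row were added next.
        gains = np.maximum(sims, covered[None, :]).sum(axis=1)
        gains[selected] = -np.inf         # never re-select an index
        j = int(np.argmax(gains))
        selected.append(j)
        covered = np.maximum(covered, sims[j])
    return selected
```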
1 code implementation • 16 Jun 2023 • Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Arnav M. Das, Gantavya Bhatt, Yinglun Zhu, Jeffrey Bilmes, Simon Shaolei Du, Kevin Jamieson, Robert D. Nowak
Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive.
1 code implementation • 16 Feb 2023 • Mark Rucker, Yinglun Zhu, Paul Mineiro
For infinite-action contextual bandits, smoothed regret and reduction to regression yield state-of-the-art online performance with computational cost independent of the action set. Unfortunately, the resulting data exhaust does not have well-defined importance weights.
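A cartoon of the smoothing construction on the one-dimensional action set $[0, 1]$ (the window width $h$ and the uniform kernel are illustrative assumptions, not the paper's exact algorithm): playing uniformly in a small window around the regressor's greedy action gives every logged action a bounded density, and hence a well-defined importance weight.

```python
# Illustrative smoothed action selection on [0, 1]; assumes 0 < h <= 1.
import random

def smoothed_action(greedy_action, h=0.1):
    """Play uniformly in a width-h window around the greedy action,
    clipped to stay inside [0, 1]."""
    lo = min(max(greedy_action - h / 2, 0.0), 1.0 - h)
    action = random.uniform(lo, lo + h)
    propensity = 1.0 / h    # density under the smoothing distribution
    return action, propensity
```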
no code implementations • 15 Oct 2022 • Yinglun Zhu, Robert Nowak
Deep neural networks have great representation power, but typically require large numbers of training examples.
1 code implementation • 12 Jul 2022 • Yinglun Zhu, Paul Mineiro
Designing efficient general-purpose contextual bandit algorithms that work with large -- or even continuous -- action spaces would facilitate application to important scenarios such as information retrieval, recommendation systems, and continuous control.
1 code implementation • 12 Jul 2022 • Yinglun Zhu, Dylan J. Foster, John Langford, Paul Mineiro
Focusing on the contextual bandit problem, recent progress provides provably efficient algorithms with strong empirical performance when the number of possible alternatives ("actions") is small. Guarantees for decision making in large, continuous action spaces, however, have remained elusive, leaving a significant gap between theory and practice.
no code implementations • 31 Mar 2022 • Yinglun Zhu, Robert Nowak
Furthermore, our algorithm is guaranteed to only abstain on hard examples (where the true label distribution is close to a fair coin), a novel property we term \emph{proper abstention} that also leads to a host of other desirable characteristics (e.g., recovering minimax guarantees in the standard setting, and avoiding the undesirable ``noise-seeking'' behavior often seen in active learning).
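A minimal sketch of a Chow-style margin rule realizing this behavior (the fixed `margin` stands in for the paper's data-driven threshold): abstain exactly when the estimated class-1 probability is close to 1/2.

```python
# Margin-based abstention: predict only on confidently non-coin-flip
# examples; `margin` is a simplified, fixed stand-in threshold.
def predict_or_abstain(eta_hat, margin=0.05):
    """eta_hat: estimated P(y = 1 | x). Returns 1, 0, or None (abstain)."""
    if abs(eta_hat - 0.5) <= margin:
        return None        # hard example: label distribution near a fair coin
    return int(eta_hat > 0.5)
```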
no code implementations • 10 Sep 2021 • Yinglun Zhu, Julian Katz-Samuels, Robert Nowak
The core of our algorithms is a new optimization problem based on experimental design that leverages the geometry of the action set to identify a near-optimal hypothesis class.
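For flavor, here is the classical G-optimal design computed by Frank-Wolfe (Kiefer-Wolfowitz) iterations over a finite action set; the paper's optimization problem is a new variant, so treat this as background for the kind of geometry-aware design being solved, not as their objective.

```python
# Frank-Wolfe iterations for the classical G-optimal design:
# minimize over simplex weights lam the worst-case ||a||^2_{V(lam)^{-1}}.
import numpy as np

def g_optimal_design(A, iters=1000, reg=1e-6):
    """A: (n, d) rows are actions. Returns sampling weights lam (n,)."""
    n, d = A.shape
    lam = np.full(n, 1.0 / n)
    for t in range(iters):
        V = A.T @ (lam[:, None] * A) + reg * np.eye(d)
        Vinv = np.linalg.inv(V)
        # a_i^T V^{-1} a_i for every action; the max defines the G-criterion.
        norms = np.einsum('ij,jk,ik->i', A, Vinv, A)
        j = int(np.argmax(norms))        # most poorly covered action
        step = 2.0 / (t + 2)             # standard Frank-Wolfe step size
        lam *= (1 - step)
        lam[j] += step
    return lam
```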
no code implementations • NeurIPS 2021 • Yinglun Zhu, Dongruo Zhou, Ruoxi Jiang, Quanquan Gu, Rebecca Willett, Robert Nowak
To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space and carefully deal with the induced model misspecification.
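An illustrative sketch, not the paper's algorithm: project the arm set onto its top-$m$ principal directions and compute ridge-regression UCB scores in the reduced space, with an additive bonus `eps` standing in for a correction to the misspecification the projection induces.

```python
# Low-dimensional embedding + optimistic scoring; the projection rank m,
# bonus scale alpha, and misspecification term eps are all illustrative.
import numpy as np

def projected_ucb_scores(X, played_idx, rewards, m, alpha=1.0, eps=0.1, reg=1.0):
    """X: (K, d) arm features; played_idx, rewards: pull history.
    Returns one optimistic score per arm."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt[:m].T                          # (K, m) embedded arms
    V = reg * np.eye(m) + Z[played_idx].T @ Z[played_idx]
    b = Z[played_idx].T @ np.asarray(rewards)
    Vinv = np.linalg.inv(V)
    theta = Vinv @ b                          # ridge estimate in m dims
    widths = np.sqrt(np.einsum('ij,jk,ik->i', Z, Vinv, Z))
    return Z @ theta + alpha * widths + eps   # eps absorbs misspecification
```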
no code implementations • 12 Feb 2021 • Yinglun Zhu, Robert Nowak
In this paper, we establish the first lower bound for the model selection problem.
1 code implementation • ICML 2020 • Yinglun Zhu, Sumeet Katariya, Robert Nowak
We study the problem of Robust Outlier Arm Identification (ROAI), where the goal is to identify arms whose expected rewards deviate substantially from the majority, by adaptively sampling from their reward distributions.
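A hedged sketch of a median/MAD outlier criterion of this kind, applied to empirical arm means (the multiplier `k` and the adaptive sampling loop around the criterion are simplified away):

```python
# Robust outlier detection via median and median absolute deviation:
# flag arms whose empirical means sit far above the majority.
import numpy as np

def outlier_arms(means, k=3.0):
    """means: empirical mean rewards of all arms. Returns outlier indices."""
    med = np.median(means)
    mad = np.median(np.abs(means - med))
    threshold = med + k * mad
    return np.flatnonzero(means > threshold)
```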
no code implementations • NeurIPS 2020 • Yinglun Zhu, Robert Nowak
With additional knowledge of the expected reward of the best arm, we propose another adaptive algorithm that is minimax optimal, up to polylog factors, over all hardness levels.
no code implementations • 21 Dec 2017 • Jiefeng Chen, Zihang Meng, Changtian Sun, Wei Tang, Yinglun Zhu
Though deep neural networks have achieved great success in recent studies and applications, they remain vulnerable to adversarial perturbations that are imperceptible to humans.