no code implementations • 12 Dec 2024 • Marah Abdin, Jyoti Aneja, Harkirat Behl, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Michael Harrison, Russell J. Hewett, Mojan Javaheripi, Piero Kauffmann, James R. Lee, Yin Tat Lee, Yuanzhi Li, Weishung Liu, Caio C. T. Mendes, Anh Nguyen, Eric Price, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Xin Wang, Rachel Ward, Yue Wu, Dingli Yu, Cyril Zhang, Yi Zhang
We present phi-4, a 14-billion parameter language model developed with a training recipe that is centrally focused on data quality.
no code implementations • 2 Dec 2024 • Audrey Huang, Adam Block, Dylan J. Foster, Dhruv Rohatgi, Cyril Zhang, Max Simchowitz, Jordan T. Ash, Akshay Krishnamurthy
Recent work in language modeling has raised the possibility of self-improvement, where a language model evaluates and refines its own generations to achieve higher performance without external feedback.
no code implementations • 8 Nov 2024 • Kayo Yin, Chinmay Singh, Fyodor O. Minakov, Vanessa Milan, Hal Daumé III, Cyril Zhang, Alex X. Lu, Danielle Bragg
Deaf and hard-of-hearing (DHH) students face significant barriers in accessing science, technology, engineering, and mathematics (STEM) education, notably due to the scarcity of STEM resources in signed languages.
no code implementations • 22 Apr 2024 • Marah Abdin, Jyoti Aneja, Hany Awadalla, Ahmed Awadallah, Ammar Ahmad Awan, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Qin Cai, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Weizhu Chen, Yen-Chun Chen, Yi-Ling Chen, Hao Cheng, Parul Chopra, Xiyang Dai, Matthew Dixon, Ronen Eldan, Victor Fragoso, Jianfeng Gao, Mei Gao, Min Gao, Amit Garg, Allie Del Giorno, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Wenxiang Hu, Jamie Huynh, Dan Iter, Sam Ade Jacobs, Mojan Javaheripi, Xin Jin, Nikos Karampatziakis, Piero Kauffmann, Mahmoud Khademi, Dongwoo Kim, Young Jin Kim, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Xihui Lin, Zeqi Lin, Ce Liu, Liyuan Liu, Mengchen Liu, Weishung Liu, Xiaodong Liu, Chong Luo, Piyush Madan, Ali Mahmoudzadeh, David Majercak, Matt Mazzola, Caio César Teodoro Mendes, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Liliang Ren, Gustavo de Rosa, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Yelong Shen, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Praneetha Vaddamanu, Chunyu Wang, Guanhua Wang, Lijuan Wang, Shuohang Wang, Xin Wang, Yu Wang, Rachel Ward, Wen Wen, Philipp Witte, Haiping Wu, Xiaoxia Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Jilong Xue, Sonali Yadav, Fan Yang, Jianwei Yang, Yifan Yang, ZiYi Yang, Donghan Yu, Lu Yuan, Chenruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.
Ranked #5 on MMR total on MRR-Benchmark (using extra training data)
no code implementations • 22 Mar 2024 • Akshay Krishnamurthy, Keegan Harris, Dylan J. Foster, Cyril Zhang, Aleksandrs Slivkins
We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making.
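To make concrete what "exploration" means here, below is a minimal sketch of an epsilon-greedy agent on a Bernoulli multi-armed bandit, the classic setting where the explore/exploit trade-off is studied. This is an illustration only; the paper prompts LLMs as in-context agents rather than running a loop like this.

```python
# Epsilon-greedy on a Bernoulli multi-armed bandit: with probability
# eps pick a random arm (explore), otherwise the best estimate (exploit).
import random

def epsilon_greedy_bandit(arm_probs, horizon=1000, eps=0.1, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)
    values = [0.0] * len(arm_probs)
    total_reward = 0.0
    for _ in range(horizon):
        if rng.random() < eps:                      # explore: random arm
            arm = rng.randrange(len(arm_probs))
        else:                                       # exploit: best estimate
            arm = max(range(len(arm_probs)), key=lambda a: values[a])
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running mean
        total_reward += reward
    return total_reward

print(epsilon_greedy_bandit([0.2, 0.5, 0.8]))  # close to 0.8 * horizon
```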
no code implementations • 17 Oct 2023 • Adam Block, Dylan J. Foster, Akshay Krishnamurthy, Max Simchowitz, Cyril Zhang
This work studies training instabilities of behavior cloning with deep neural networks.
no code implementations • 7 Sep 2023 • Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
Finally, we show that the synthetic sparse parity task can be useful as a proxy for real problems requiring axis-aligned feature learning.
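For readers unfamiliar with the task: a k-sparse parity labels each n-bit input by the XOR of a hidden size-k subset of its coordinates. A minimal data-generator sketch, assuming the standard formulation (not code from the paper):

```python
# Generate a k-sparse parity dataset: uniform bit strings labeled by the
# XOR of k hidden "support" coordinates.
import numpy as np

def sparse_parity_data(n_samples, n=50, k=3, seed=0):
    rng = np.random.default_rng(seed)
    support = rng.choice(n, size=k, replace=False)  # hidden relevant coordinates
    X = rng.integers(0, 2, size=(n_samples, n))     # uniform n-bit inputs
    y = X[:, support].sum(axis=1) % 2               # XOR of the k support bits
    return X, y, support

X, y, support = sparse_parity_data(1000)
print(support, y[:10])
```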
no code implementations • 28 Feb 2023 • Sham M. Kakade, Akshay Krishnamurthy, Gaurav Mahajan, Cyril Zhang
In this paper, we depart from this setup and consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.
1 code implementation • 2 Nov 2022 • Savya Khosla, Chew Kin Whye, Jordan T. Ash, Cyril Zhang, Kenji Kawaguchi, Alex Lamb
To this end, we demonstrate the catastrophic failure of these active learning algorithms on heteroskedastic distributions and propose a fine-tuning-based approach to mitigate these failures.
no code implementations • 19 Oct 2022 • Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang
Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine.
no code implementations • 1 Sep 2022 • Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang
For example, on parity problems, the NN learns as well as Gaussian elimination, an efficient algorithm that can be succinctly described.
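For context, learning a parity from labeled examples reduces to solving a linear system over GF(2), which Gaussian elimination handles in polynomial time; this is the succinct algorithm the sentence refers to. A minimal sketch of that baseline (illustrative, not the paper's code):

```python
# Gaussian elimination over GF(2): recover w with X @ w = y (mod 2),
# i.e. the support of the parity, from labeled examples.
import numpy as np

def solve_gf2(X, y):
    A = np.concatenate([X % 2, (y % 2).reshape(-1, 1)], axis=1).astype(np.uint8)
    n_rows, n_cols = A.shape
    pivot_cols, r = [], 0
    for c in range(n_cols - 1):
        rows = np.nonzero(A[r:, c])[0]
        if len(rows) == 0:
            continue
        A[[r, r + rows[0]]] = A[[r + rows[0], r]]   # swap a pivot row up
        for i in range(n_rows):
            if i != r and A[i, c]:
                A[i] ^= A[r]                        # eliminate column c (XOR)
        pivot_cols.append(c)
        r += 1
        if r == n_rows:
            break
    w = np.zeros(n_cols - 1, dtype=np.uint8)        # free variables set to 0
    for i, c in enumerate(pivot_cols):
        w[c] = A[i, -1]
    return w

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 20))
w_true = np.zeros(20, dtype=np.uint8)
w_true[[3, 7, 11]] = 1
y = X @ w_true % 2
print(np.nonzero(solve_gf2(X, y))[0])   # recovers [3, 7, 11]
```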
no code implementations • 18 Jul 2022 • Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang
There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times.
no code implementations • 28 Feb 2022 • Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy
Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.
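A minimal sketch of the mechanism described above, using an InfoNCE-style loss in which matched rows of two views are positives and all other rows serve as negatives (a generic formulation, not this paper's code):

```python
# InfoNCE-style contrastive loss: pull matching rows of two views
# together, push mismatched rows apart.
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """z1, z2: (batch, dim) L2-normalized embeddings of two views."""
    logits = z1 @ z2.T / temperature                 # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # positives on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
z /= np.linalg.norm(z, axis=1, keepdims=True)
print(info_nce_loss(z, z))  # identical views -> low loss
```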
no code implementations • 19 Nov 2021 • Daniel Suo, Cyril Zhang, Paula Gradu, Udaya Ghai, Xinyi Chen, Edgar Minasyan, Naman Agarwal, Karan Singh, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan
Mechanical ventilation is one of the most widely used therapies in the ICU.
no code implementations • ICLR 2022 • Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade
Intrinsic rewards play a central role in handling the exploration-exploitation trade-off when designing sequential decision-making algorithms, in both foundational theory and state-of-the-art deep reinforcement learning.
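As one concrete example of such an intrinsic reward, here is the classic count-based bonus beta / sqrt(N(s)), which decays as a state is revisited. Note this is a standard illustration of the object being studied, not the bonus this paper proposes.

```python
# Count-based exploration bonus: reward a state in inverse proportion
# to the square root of its visit count.
from collections import defaultdict
import math

class CountBonus:
    def __init__(self, beta=1.0):
        self.beta = beta
        self.counts = defaultdict(int)

    def __call__(self, state):
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])  # decays with visits

bonus = CountBonus()
print([round(bonus("s0"), 3) for _ in range(4)])  # [1.0, 0.707, 0.577, 0.5]
```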
no code implementations • 19 Oct 2021 • Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang
Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond.
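For reference, the motif in question is scaled dot-product attention, softmax(QK^T / sqrt(d)) V, in which every position mixes information from every other; a single-head sketch:

```python
# Single-head scaled dot-product self-attention.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq, d_model); Wq/Wk/Wv: (d_model, d_head)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # (seq, seq) interactions
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over positions
    return weights @ V                             # mix values by attention

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
W = [rng.normal(size=(8, 4)) for _ in range(3)]
print(self_attention(X, *W).shape)  # (5, 4)
```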
no code implementations • 12 Oct 2021 • Yonathan Efroni, Sham Kakade, Akshay Krishnamurthy, Cyril Zhang
However, in practice, we often encounter systems in which a large set of state variables evolve exogenously and independently of the control inputs; such systems are only partially controllable.
no code implementations • 29 Sep 2021 • Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
In the empirical science of training large neural networks, the learning rate schedule is a notoriously challenging-to-tune hyperparameter, which can depend on all other properties (architecture, optimizer, batch size, dataset, regularization, ...) of the problem.
no code implementations • 1 Mar 2021 • Naman Agarwal, Surbhi Goel, Cyril Zhang
In practical applications of iterative first-order optimization, the learning rate schedule remains notoriously difficult to understand and expensive to tune.
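For concreteness, one widely used schedule family is linear warmup followed by cosine decay; the sketch below is a generic example of the object being tuned, not the schedules this paper analyzes.

```python
# Linear warmup then cosine decay: a common learning rate schedule.
import math

def warmup_cosine(step, total_steps, warmup_steps=100, peak_lr=1e-3):
    if step < warmup_steps:
        return peak_lr * step / warmup_steps               # linear warmup
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))  # cosine decay

print([round(warmup_cosine(s, 1000), 5) for s in (0, 50, 100, 550, 1000)])
```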
1 code implementation • 19 Feb 2021 • Paula Gradu, John Hallman, Daniel Suo, Alex Yu, Naman Agarwal, Udaya Ghai, Karan Singh, Cyril Zhang, Anirudha Majumdar, Elad Hazan
We present an open-source library of natively differentiable physics and robotics environments, accompanied by gradient-based control methods and a benchmarking suite.
2 code implementations • 12 Feb 2021 • Daniel Suo, Naman Agarwal, Wenhan Xia, Xinyi Chen, Udaya Ghai, Alexander Yu, Paula Gradu, Karan Singh, Cyril Zhang, Edgar Minasyan, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan
We consider the problem of controlling an invasive mechanical ventilator for pressure-controlled ventilation: a controller must let air in and out of a sedated patient's lungs according to a trajectory of airway pressures specified by a clinician.
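As a minimal illustration of the control problem, here is a PID controller tracking a square-wave pressure target on a toy first-order lung model; the paper's simulator and learned controllers are substantially more involved, and all constants here are made up for the sketch.

```python
# PID tracking of an airway-pressure trajectory on a toy lung model:
# pressure leaks each step and rises with the (non-negative) valve input.
def pid_track(targets, kp=2.0, ki=0.5, kd=0.1, dt=0.03, leak=0.9):
    pressure, integral, prev_err = 0.0, 0.0, 0.0
    trace = []
    for target in targets:
        err = target - pressure
        integral += err * dt
        deriv = (err - prev_err) / dt
        u = max(0.0, kp * err + ki * integral + kd * deriv)  # valve opening >= 0
        pressure = leak * pressure + u * dt                  # toy dynamics
        prev_err = err
        trace.append(pressure)
    return trace

# Square-wave target: inhale at 35 cmH2O, exhale at 5 cmH2O.
targets = [35.0] * 30 + [5.0] * 30
print([round(p, 1) for p in pid_track(targets)[:5]])
```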
no code implementations • NeurIPS 2020 • Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar, Cyril Zhang
State-of-the-art optimization is steadily shifting towards massively parallel pipelines with extremely large batch sizes.
1 code implementation • 26 Feb 2020 • Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
We investigate several confounding factors in the evaluation of optimization algorithms for deep learning.
no code implementations • 6 Feb 2020 • Udaya Ghai, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
This requires a refined regret analysis, including a structural lemma showing the current state of the system to be a small linear combination of past states, even if the state grows polynomially.
no code implementations • ICLR 2020 • Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
A commonplace belief in the machine learning community is that using adaptive gradient methods hurts generalization.
no code implementations • ICML 2020 • Mark Braverman, Xinyi Chen, Sham M. Kakade, Karthik Narasimhan, Cyril Zhang, Yi Zhang
Building accurate language models that capture meaningful long-term dependencies is a core challenge in natural language processing.
no code implementations • 23 May 2019 • Holden Lee, Cyril Zhang
The optimal predictor for a linear dynamical system (with hidden state and Gaussian noise) takes the form of an autoregressive linear filter, namely the Kalman filter.
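A minimal sketch of this predictor for a scalar system x_{t+1} = a x_t + w_t, y_t = x_t + v_t, showing the filter's measurement/time recursion whose steady state is a fixed autoregressive filter over past observations (illustrative parameters, not the paper's):

```python
# Scalar Kalman filter producing one-step-ahead predictions of y.
import numpy as np

def kalman_predict(ys, a=0.9, q=0.1, r=1.0):
    """Predict y_{t+1} for x_{t+1} = a x_t + w_t (var q), y_t = x_t + v_t (var r)."""
    x_hat, p = 0.0, 1.0                  # state estimate and its variance
    preds = []
    for y in ys:
        k = p / (p + r)                  # Kalman gain (measurement update)
        x_hat += k * (y - x_hat)
        p *= (1 - k)
        x_hat *= a                       # time update: predict x_{t+1}
        p = a * a * p + q
        preds.append(x_hat)              # = E[y_{t+1} | y_1, ..., y_t]
    return preds

rng = np.random.default_rng(0)
x, ys = 0.0, []
for _ in range(200):
    x = 0.9 * x + rng.normal(scale=0.1 ** 0.5)
    ys.append(x + rng.normal())
preds = kalman_predict(ys)               # smoothly tracks the noisy series
```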
no code implementations • ICLR 2020 • Xinyi Chen, Naman Agarwal, Elad Hazan, Cyril Zhang, Yi Zhang
State-of-the-art models are now trained with billions of parameters, reaching hardware limits in terms of memory consumption.
no code implementations • ICLR 2019 • Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, Cyril Zhang, Yi Zhang
Due to the large number of parameters of machine learning problems, full-matrix preconditioning methods are prohibitively expensive.
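A back-of-envelope calculation makes the point: a full preconditioner stores a d x d second-moment matrix, versus d entries for a diagonal one.

```python
# Memory for a full d x d preconditioner vs a diagonal one, in float32.
d = 10**8                          # a 100M-parameter model
full_bytes = 4 * d * d             # full second-moment matrix
diag_bytes = 4 * d                 # diagonal (Adagrad/Adam-style)
print(full_bytes / 1e15, "PB vs", diag_bytes / 1e9, "GB")  # 40.0 PB vs 0.4 GB
```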
no code implementations • NeurIPS 2018 • Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We give a polynomial-time algorithm for learning latent-state linear dynamical systems without system identification, and without assumptions on the spectral radius of the system's transition matrix.
no code implementations • ICLR 2018 • Sanjeev Arora, Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We study the control of symmetric linear dynamical systems with unknown dynamics and a hidden state.
1 code implementation • NeurIPS 2017 • Elad Hazan, Karan Singh, Cyril Zhang
We present an efficient and practical algorithm for the online prediction of discrete-time linear dynamical systems with a symmetric transition matrix.
1 code implementation • ICLR 2018 • Brian Bullins, Cyril Zhang, Yi Zhang
We propose a principled method for kernel learning, which relies on a Fourier-analytic characterization of translation-invariant or rotation-invariant kernels.
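By Bochner's theorem, a translation-invariant positive-definite kernel is the Fourier transform of a nonnegative spectral measure, so sampling frequencies from that measure yields a random-feature approximation. The sketch below fixes a Gaussian spectral distribution; the paper's method learns the spectral distribution instead of fixing it.

```python
# Random Fourier features: Z @ Z.T approximates the Gaussian kernel
# exp(-||x - x'||^2 / (2 * bandwidth^2)).
import numpy as np

def random_fourier_features(X, n_features=500, bandwidth=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))  # spectral samples
    b = rng.uniform(0, 2 * np.pi, size=n_features)               # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
Z = random_fourier_features(X)
print(np.round(Z @ Z.T, 2))   # approximate kernel Gram matrix
```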
no code implementations • ICML 2017 • Elad Hazan, Karan Singh, Cyril Zhang
We consider regret minimization in repeated games with non-convex loss functions.