Search Results for author: Michael Santacroce

Found 6 papers, 1 paper with code

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations22 Apr 2024 Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, ZiYi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

no code implementations4 Apr 2024 Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie

In this paper, we introduce Direct Nash Optimization (DNO), a provable and scalable algorithm that marries the simplicity and stability of contrastive learning with theoretical generality from optimizing general preferences.

Contrastive Learning
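
As a rough, hypothetical illustration of the kind of contrastive preference objective the abstract alludes to (a DPO-style logistic loss over preferred/dispreferred responses), here is a minimal sketch. It is not the authors' exact DNO update, which iteratively optimizes against a general preference function rather than a fixed reward:

```python
# Hypothetical sketch of a contrastive preference loss (DPO-style),
# illustrating the "contrastive learning" ingredient mentioned above.
import torch
import torch.nn.functional as F

def contrastive_preference_loss(logp_pol_win, logp_pol_lose,
                                logp_ref_win, logp_ref_lose, beta=0.1):
    """Logistic loss that pushes the policy toward the preferred response."""
    margin = (logp_pol_win - logp_ref_win) - (logp_pol_lose - logp_ref_lose)
    return -F.logsigmoid(beta * margin).mean()

# Toy batch of per-response sequence log-probabilities (made-up numbers).
lp_w, lp_l = torch.tensor([-12.3, -9.8]), torch.tensor([-14.1, -11.0])
rf_w, rf_l = torch.tensor([-12.0, -10.2]), torch.tensor([-13.5, -10.8])
print(contrastive_preference_loss(lp_w, lp_l, rf_w, rf_l))
```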

Adapting LLM Agents with Universal Feedback in Communication

no code implementations1 Oct 2023 Kuan Wang, Yadong Lu, Michael Santacroce, Yeyun Gong, Chao Zhang, Yelong Shen

To optimize agent interactions for task-specific learning with our universal buffer and pipeline, we introduce diverse communication patterns tailored for both single-agent and multi-agent environments.

Decision Making, GSM8K
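
As a loose illustration of what a shared feedback buffer for single- and multi-agent pipelines could look like, the sketch below uses hypothetical names and fields; the paper's actual data structures may differ:

```python
# Hypothetical sketch of a "universal" feedback buffer for LLM agents: one
# store of (task, trajectory, feedback) records that both single-agent and
# multi-agent pipelines append to and sample from. Names are illustrative.
import random
from dataclasses import dataclass, field

@dataclass
class FeedbackRecord:
    task: str               # e.g. a GSM8K question
    trajectory: list[str]   # the agent's (or agents') messages, in order
    feedback: str           # environment / critic feedback in natural language
    success: bool           # whether the task was solved

@dataclass
class FeedbackBuffer:
    records: list[FeedbackRecord] = field(default_factory=list)

    def add(self, record: FeedbackRecord) -> None:
        self.records.append(record)

    def sample(self, k: int) -> list[FeedbackRecord]:
        """Draw records, e.g. to build fine-tuning or in-context examples."""
        return random.sample(self.records, min(k, len(self.records)))

buf = FeedbackBuffer()
buf.add(FeedbackRecord("2 + 2 = ?", ["The answer is 4."], "Correct.", True))
print(buf.sample(1))
```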

Efficient RLHF: Reducing the Memory Usage of PPO

no code implementations1 Sep 2023 Michael Santacroce, Yadong Lu, Han Yu, Yuanzhi Li, Yelong Shen

To address this issue, we present a comprehensive analysis of the memory usage, performance, and training time of memory-saving techniques for PPO.

Language Modelling
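
For context, two widely used memory-saving levers for PPO-based RLHF are gradient checkpointing and LoRA adapters. The sketch below shows how they are typically enabled with Hugging Face transformers and peft; it is a generic illustration with a placeholder model name, not the specific configuration analyzed in the paper:

```python
# Hypothetical illustration of generic memory-saving levers for PPO fine-tuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

policy = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model name
policy.gradient_checkpointing_enable()  # trade recompute for activation memory

lora_cfg = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
policy = get_peft_model(policy, lora_cfg)  # train only small adapter matrices
policy.print_trainable_parameters()        # most base weights stay frozen
```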

What Matters In The Structured Pruning of Generative Language Models?

1 code implementation7 Feb 2023 Michael Santacroce, Zixin Wen, Yelong Shen, Yuanzhi Li

Auto-regressive large language models such as GPT-3 require enormous computational resources to use.

Text Generation
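
As a generic illustration of structured pruning (removing whole units rather than individual weights), the sketch below drops FFN neurons ranked by weight norm; the criterion and granularity are assumptions for illustration, not the specific choices studied in the paper:

```python
# Hypothetical sketch of one simple structured-pruning criterion for a
# transformer FFN block: keep the intermediate neurons with the largest
# weight norms and slice the two Linear layers accordingly.
import torch
import torch.nn as nn

def prune_ffn(fc_in: nn.Linear, fc_out: nn.Linear, keep_ratio: float = 0.5):
    """Return new, smaller Linear layers keeping the highest-norm neurons."""
    scores = fc_in.weight.norm(dim=1) + fc_out.weight.norm(dim=0)  # one score per neuron
    k = int(keep_ratio * scores.numel())
    idx = scores.topk(k).indices.sort().values

    new_in = nn.Linear(fc_in.in_features, k)
    new_out = nn.Linear(k, fc_out.out_features)
    with torch.no_grad():
        new_in.weight.copy_(fc_in.weight[idx])
        new_in.bias.copy_(fc_in.bias[idx])
        new_out.weight.copy_(fc_out.weight[:, idx])
        new_out.bias.copy_(fc_out.bias)
    return new_in, new_out

fc1, fc2 = nn.Linear(768, 3072), nn.Linear(3072, 768)
p1, p2 = prune_ffn(fc1, fc2, keep_ratio=0.25)
print(p1.weight.shape, p2.weight.shape)  # [768, 768] and [768, 768]
```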

Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features

no code implementations ICLR 2022 Greg Yang, Michael Santacroce, Edward J Hu

While a popular limit of infinite-width neural networks, the Neural Tangent Kernel (NTK) often exhibits performance gaps from finite-width neural networks on standard datasets, due to lack of feature learning.
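
For reference, the NTK entry for inputs x and x' is the inner product of parameter gradients, Theta(x, x') = <∇θ f(x), ∇θ f(x')>. The sketch below computes an empirical (finite-width) NTK entry for a toy MLP with PyTorch autograd; the architecture and sizes are arbitrary assumptions for illustration:

```python
# Hypothetical illustration: empirical (finite-width) NTK of a small MLP,
# Theta(x, x') = <grad_theta f(x), grad_theta f(x')>.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 1))
params = list(net.parameters())

def flat_grad(x):
    """Gradient of the scalar output f(x) w.r.t. all parameters, flattened."""
    out = net(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, params)
    return torch.cat([g.reshape(-1) for g in grads])

x1, x2 = torch.randn(4), torch.randn(4)
print("empirical NTK entry:", torch.dot(flat_grad(x1), flat_grad(x2)).item())
```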
