Search Results for author: Alex Gu

Found 16 papers, 9 papers with code

Mixture of Parrots: Experts improve memorization more than reasoning

no code implementations 24 Oct 2024 Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach

On the other hand, we find that on memory-intensive tasks, MoEs can effectively leverage a small number of active parameters with a large number of experts to memorize the data.

Math Memorization

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

2 code implementations 22 Jun 2024 Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, Zijian Wang, David Lo, Binyuan Hui, Niklas Muennighoff, Daniel Fried, Xiaoning Du, Harm de Vries, Leandro von Werra

Fulfilling both of these characteristics can pose a great challenge for LLMs. To assess how well LLMs can solve challenging and practical tasks via programs, we introduce BigCodeBench, a benchmark that challenges LLMs to invoke multiple function calls as tools from 139 libraries and 7 domains for 1,140 fine-grained tasks.

Benchmarking Code Generation

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

no code implementations 12 Mar 2024 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica

Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry.

Code Generation HumanEval

The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

no code implementations 29 Feb 2024 Alex Gu, Wen-Ding Li, Naman Jain, Theo X. Olausson, Celine Lee, Koushik Sen, Armando Solar-Lezama

In this work, we focus on these counterfeit samples: programs sampled from a language model that 1) have a high enough log-probability to be generated at a moderate temperature and 2) pass weak correctness checks.

Code Generation Language Modelling

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

no code implementations 5 Jan 2024 Alex Gu, Baptiste Rozière, Hugh Leather, Armando Solar-Lezama, Gabriel Synnaeve, Sida I. Wang

The best setup, GPT-4 with chain of thought (CoT), achieves a pass@1 of 75% and 81% on input and output prediction, respectively.

HumanEval

Language Agnostic Code Embeddings

no code implementations 25 Oct 2023 Saiteja Utpala, Alex Gu, Pin-Yu Chen

Recently, code language models have achieved notable advancements in addressing a diverse array of essential code comprehension and generation tasks.

Retrieval

LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

1 code implementation 23 Oct 2023 Theo X. Olausson, Alex Gu, Benjamin Lipkin, Cedegao E. Zhang, Armando Solar-Lezama, Joshua B. Tenenbaum, Roger Levy

Logical reasoning, i.e., deductively inferring the truth value of a conclusion from a set of premises, is an important task for artificial intelligence with wide potential impacts on science, mathematics, and society.

Logical Reasoning

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

3 code implementations NeurIPS 2023 Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library.

Automated Theorem Proving Math +1

StarCoder: may the source be with you!

4 code implementations 9 May 2023 Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention.

8k Code Generation +1

Certified Interpretability Robustness for Class Activation Mapping

no code implementations 26 Jan 2023 Alex Gu, Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, Luca Daniel

Interpreting machine learning models is challenging but crucial for ensuring the safety of deep networks in autonomous driving systems.

Autonomous Driving

ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications

no code implementations 20 Oct 2022 Alex Gu, Tamara Mitrovska, Daniela Velez, Jacob Andreas, Armando Solar-Lezama

We introduce ObSynth, an interactive system leveraging the domain knowledge embedded in large language models (LLMs) to help users design object models from high-level natural language prompts.

Object

Min-Max Bilevel Multi-objective Optimization with Applications in Machine Learning

1 code implementation 3 Mar 2022 Alex Gu, Songtao Lu, Parikshit Ram, Lily Weng

We consider a generic min-max multi-objective bilevel optimization problem with applications in robust machine learning such as representation learning and hyperparameter optimization.

BIG-bench Machine Learning Bilevel Optimization +4

The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective

1 code implementation 3 Feb 2022 Satyapriya Krishna, Tessa Han, Alex Gu, Steven Wu, Shahin Jabbari, Himabindu Lakkaraju

In addition, we carry out an online user study with data scientists to understand how they resolve the aforementioned disagreements.

BIG-bench Machine Learning

Reproducibility Report: La-MAML: Look-ahead Meta Learning for Continual Learning

1 code implementation 11 Feb 2021 Joel Joseph, Alex Gu

The Continual Learning (CL) problem involves performing well on a sequence of tasks under limited compute.

Continual Learning Meta-Learning
