Search Results for author: Alex Gu

Found 14 papers, 6 papers with code

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

no code implementations 12 Mar 2024 Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica

Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry.

Code Generation

The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

no code implementations 29 Feb 2024 Alex Gu, Wen-Ding Li, Naman Jain, Theo X. Olausson, Celine Lee, Koushik Sen, Armando Solar-Lezama

In this work, we focus on these counterfeit samples: programs sampled from a language model that 1) have a high enough log-probability to be generated at a moderate temperature and 2) pass weak correctness checks.

Code Generation, Language Modelling

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

no code implementations 5 Jan 2024 Alex Gu, Baptiste Rozière, Hugh Leather, Armando Solar-Lezama, Gabriel Synnaeve, Sida I. Wang

The best setup, GPT-4 with chain of thought (CoT), achieves a pass@1 of 75% and 81% on input and output prediction, respectively.

Language Agnostic Code Embeddings

no code implementations 25 Oct 2023 Saiteja Utpala, Alex Gu, Pin-Yu Chen

Recently, code language models have achieved notable advancements in addressing a diverse array of essential code comprehension and generation tasks.

Retrieval

LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

1 code implementation 23 Oct 2023 Theo X. Olausson, Alex Gu, Benjamin Lipkin, Cedegao E. Zhang, Armando Solar-Lezama, Joshua B. Tenenbaum, Roger Levy

Logical reasoning, i.e., deductively inferring the truth value of a conclusion from a set of premises, is an important task for artificial intelligence with wide potential impacts on science, mathematics, and society.

Logical Reasoning

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

3 code implementations NeurIPS 2023 Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library.

Automated Theorem Proving, Math

Certified Interpretability Robustness for Class Activation Mapping

no code implementations 26 Jan 2023 Alex Gu, Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, Luca Daniel

Interpreting machine learning models is challenging but crucial for ensuring the safety of deep networks in autonomous driving systems.

Autonomous Driving

ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications

no code implementations 20 Oct 2022 Alex Gu, Tamara Mitrovska, Daniela Velez, Jacob Andreas, Armando Solar-Lezama

We introduce ObSynth, an interactive system leveraging the domain knowledge embedded in large language models (LLMs) to help users design object models from high-level natural language prompts.

Object

Min-Max Bilevel Multi-objective Optimization with Applications in Machine Learning

1 code implementation 3 Mar 2022 Alex Gu, Songtao Lu, Parikshit Ram, Lily Weng

We consider a generic min-max multi-objective bilevel optimization problem with applications in robust machine learning such as representation learning and hyperparameter optimization.

BIG-bench Machine Learning, Bilevel Optimization

The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective

no code implementations 3 Feb 2022 Satyapriya Krishna, Tessa Han, Alex Gu, Javin Pombra, Shahin Jabbari, Steven Wu, Himabindu Lakkaraju

To this end, we first conduct interviews with data scientists to understand what constitutes disagreement between explanations generated by different methods for the same model prediction, and introduce a novel quantitative framework to formalize this understanding.

BIG-bench Machine Learning

Reproducibility Report: La-MAML: Look-ahead Meta Learning for Continual Learning

1 code implementation 11 Feb 2021 Joel Joseph, Alex Gu

The Continual Learning (CL) problem involves performing well on a sequence of tasks under limited compute.

Continual Learning, Meta-Learning
