1 code implementation • 23 Jan 2024 • Mirac Suzgun, Adam Tauman Kalai
This collaborative prompting approach empowers a single LM to simultaneously act as a comprehensive orchestrator and a panel of diverse experts, significantly enhancing its performance across a wide array of tasks.
no code implementations • 24 Nov 2023 • Adam Tauman Kalai, Santosh S. Vempala
For "arbitrary" facts whose veracity cannot be determined from the training data, we show that hallucinations must occur at a certain rate for language models that satisfy a statistical calibration condition appropriate for generative language models.
no code implementations • 17 Nov 2023 • Silen Naihin, David Atkinson, Marc Green, Merwane Hamadi, Craig Swift, Douglas Schonholtz, Adam Tauman Kalai, David Bau
A prerequisite for safe autonomy-in-the-wild is safe testing-in-the-wild.
1 code implementation • 3 Oct 2023 • Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai
In this work, we use a language-model-infused scaffolding program to improve itself.
no code implementations • 20 Jun 2023 • Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li
Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP.
Ranked #41 on Code Generation on HumanEval
no code implementations • 29 May 2023 • Ayush Agrawal, Mirac Suzgun, Lester Mackey, Adam Tauman Kalai
In this work, we focus on hallucinated book and article references and present them as the "model organism" of language model hallucination research, due to their frequent and easy-to-discern nature.
no code implementations • 19 Apr 2023 • Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, Preetum Nakkiran
We show that minimizing the squared loss over all neural nets of size $n$ implies multicalibration for all but a bounded number of unlucky values of $n$.
no code implementations • 1 Sep 2022 • Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang
For example, on parity problems, the NN learns as well as Gaussian elimination, an efficient algorithm that can be succinctly described.
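As a point of reference for the comparison above, here is a minimal sketch of learning a parity function by Gaussian elimination over GF(2). It is not taken from the paper: the function names `parity_label` and `solve_parity`, and the choice to set free variables to 0, are assumptions made for illustration.

```python
def parity_label(s, x):
    """Label of x under the parity defined by the 0/1 vector s."""
    return sum(si & xi for si, xi in zip(s, x)) % 2


def solve_parity(examples, n):
    """Find a 0/1 vector s of length n with parity_label(s, x) == y for
    every (x, y) in examples, using Gaussian elimination over GF(2).

    Returns None if the examples are inconsistent with every parity.
    Free variables are set to 0 (an arbitrary illustrative choice).
    """
    # Augmented matrix: each row is the bits of x followed by the label y.
    rows = [list(x) + [y] for x, y in examples]
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(n):
        # Look for a row at or below r with a 1 in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue  # no pivot here: column c is a free variable
        rows[r], rows[p] = rows[p], rows[r]
        # Clear column c in every other row (addition mod 2 is XOR).
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
    # Remaining rows are all-zero on the left; a 1 on the right means 0 = 1.
    if any(row[n] for row in rows[r:]):
        return None
    s = [0] * n
    for i, c in enumerate(pivot_cols):
        s[c] = rows[i][n]
    return s
```

With enough linearly independent examples the recovered vector is unique; with fewer, the sketch returns one parity consistent with the data.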
no code implementations • NeurIPS 2023 • Elad Hazan, Adam Tauman Kalai, Varun Kanade, Clara Mohri, Y. Jennifer Sun
This work establishes a new framework of partial matrix completion, where the goal is to identify a large subset of the entries that can be completed with high confidence.
2 code implementations • 18 Aug 2022 • Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
We introduce a new type of test, called a Turing Experiment (TE), for evaluating to what extent a given language model, such as GPT models, can simulate different aspects of human behavior.
1 code implementation • 29 Jul 2022 • Patrick Haluptzok, Matthew Bowers, Adam Tauman Kalai
We show that it is possible for an LM to synthesize programming problems and solutions, which are filtered for correctness by a Python interpreter.
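The filtering step described above can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: it assumes each candidate is a pair of source strings, one defining a puzzle function `f` and one defining a zero-argument solution function `g`, and the helper names `passes_check` and `filter_candidates` are hypothetical. Running untrusted generated code with `exec` is unsafe; a real system would presumably sandbox it and enforce a timeout.

```python
def passes_check(puzzle_src, solution_src):
    """Run both snippets in a fresh namespace and test whether f(g()) is True.

    Assumed convention (hypothetical): puzzle_src defines f, solution_src
    defines g. Any error while running the code counts as a failure.
    """
    namespace = {}
    try:
        exec(puzzle_src, namespace)    # defines f
        exec(solution_src, namespace)  # defines g
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        return False


def filter_candidates(candidates):
    """Keep only the (puzzle, solution) pairs the interpreter verifies."""
    return [pair for pair in candidates if passes_check(*pair)]


# Hand-written stand-ins for model-generated candidates.
candidates = [
    ("def f(x):\n    return x * x == 49", "def g():\n    return 7"),   # correct
    ("def f(x):\n    return x * x == 49", "def g():\n    return 6"),   # wrong
    ("def f(x):\n    return x[0] == 'a'", "def g():\n    return 3"),   # raises
    ("def f(x) return True", "def g():\n    return 1"),                # syntax error
]
verified = filter_candidates(candidates)
```

Only the first pair survives here; the others are rejected because the solution is wrong, raises an exception, or the puzzle does not parse.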
no code implementations • 19 May 2022 • David Alvarez-Melis, Vikas Garg, Adam Tauman Kalai
We show that, while it may seem that maximizing likelihood is inherently different than minimizing distinguishability, this distinction is largely artificial and only holds for limited models.
no code implementations • 11 Sep 2021 • Parikshit Gopalan, Adam Tauman Kalai, Omer Reingold, Vatsal Sharan, Udi Wieder
We suggest a rigorous new paradigm for loss minimization in machine learning where the loss function can be ignored at the time of learning and only be taken into account when deciding an action.
no code implementations • 25 Aug 2021 • Myra Cheng, Maria De-Arteaga, Lester Mackey, Adam Tauman Kalai
Many modern machine learning algorithms mitigate bias by enforcing fairness constraints across coarsely-defined groups related to a sensitive attribute like gender or race.
3 code implementations • 10 Jun 2021 • Tal Schuster, Ashwin Kalyan, Oleksandr Polozov, Adam Tauman Kalai
The dataset is comprehensive in that it spans problems of a range of difficulties and domains, ranging from trivial string manipulation problems, to classic programming puzzles (e.g., Tower of Hanoi), to interview/competitive-programming problems (e.g., dynamic programming), to longstanding open problems in algorithms and mathematics (e.g., factoring).
no code implementations • NeurIPS 2021 • Adam Tauman Kalai, Varun Kanade
Our work builds on a recent abstention algorithm of Goldwasser, Kalais, and Montasser (2020) for transductive binary classification.
no code implementations • NeurIPS 2020 • Shafi Goldwasser, Adam Tauman Kalai, Yael Tauman Kalai, Omar Montasser
We present a transductive learning algorithm that takes as input training examples from a distribution $P$ and arbitrary (unlabeled) test examples, possibly chosen by an adversary.
no code implementations • 25 Sep 2019 • Ashwin Kalyan, Oleksandr Polozov, Adam Tauman Kalai
Puzzles are objective in that one can easily test the correctness of a given solution x by seeing whether it satisfies f, unlike the most common representations for program synthesis: given input-output pairs or an English problem description, the correctness of a given solution is not determined and is debatable.
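To make the notion of an objectively checkable puzzle concrete, here is a small sketch. The particular puzzle `f` and the helper `is_solution` are made up for illustration and do not come from the paper; the only idea carried over is that a candidate x is correct exactly when f(x) holds.

```python
def f(x):
    """Hypothetical puzzle: find three distinct positive integers
    whose sum is 12 and whose product is 48."""
    return (
        len(x) == 3
        and len(set(x)) == 3
        and all(isinstance(v, int) and v > 0 for v in x)
        and sum(x) == 12
        and x[0] * x[1] * x[2] == 48
    )


def is_solution(puzzle, candidate):
    """A candidate is correct exactly when the puzzle returns True on it.

    Candidates that make the puzzle raise are treated as incorrect.
    """
    try:
        return puzzle(candidate) is True
    except Exception:
        return False
```

For instance, `is_solution(f, [2, 4, 6])` holds while `is_solution(f, [1, 2, 3])` does not, and no reference answer or human judgment is needed to decide either case.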
no code implementations • 26 Apr 2019 • Daniel Alabi, Adam Tauman Kalai, Katrina Ligett, Cameron Musco, Christos Tzamos, Ellen Vitercik
We present an algorithm that learns to maximally prune the search space on repeated computations, thereby reducing runtime while provably outputting the correct solution each period with high probability.
no code implementations • NAACL 2019 • Alexey Romanov, Maria De-Arteaga, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Anna Rumshisky, Adam Tauman Kalai
In the context of mitigating bias in occupation classification, we propose a method for discouraging correlation between the predicted probability of an individual's true occupation and a word embedding of their name.
2 code implementations • 8 Feb 2019 • Limor Gultchin, Genevieve Patterson, Nancy Baym, Nathaniel Swinger, Adam Tauman Kalai
While humor is often thought to be beyond the reach of Natural Language Processing, we show that several aspects of single-word humor correlate with simple linear directions in Word Embeddings.
4 code implementations • 27 Jan 2019 • Maria De-Arteaga, Alexey Romanov, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Adam Tauman Kalai
We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on people's lives.
no code implementations • 20 Dec 2018 • Nathaniel Swinger, Maria De-Arteaga, Neil Thomas Heffernan IV, Mark DM Leiserson, Adam Tauman Kalai
The inputs to our algorithm are a list of target tokens, e.g., names, and a word embedding.
no code implementations • 11 Apr 2018 • Daniel Alabi, Nicole Immorlica, Adam Tauman Kalai
Most systems and learning algorithms optimize average performance or average loss -- one reason being computational complexity.
no code implementations • 25 Sep 2017 • Konstantina Christakopoulou, Adam Tauman Kalai
Our results show that (i) performing 4 rounds of our framework typically solves about 70% of the target problems, (ii) our framework can improve itself even in domain-agnostic scenarios, and (iii) it can solve problems that would otherwise be too slow to solve with brute-force search.
no code implementations • 20 Jul 2017 • Cynthia Dwork, Nicole Immorlica, Adam Tauman Kalai, Max Leiserson
When it is ethical and legal to use a sensitive attribute (such as gender or race) in machine learning systems, the question remains how to do so.
no code implementations • 29 Dec 2016 • Vikas K. Garg, Adam Tauman Kalai
We introduce a new paradigm to investigate unsupervised learning, reducing unsupervised learning to supervised learning.
no code implementations • 31 Mar 2015 • James Y. Zou, Kamalika Chaudhuri, Adam Tauman Kalai
In addition, we ask the crowd to provide binary labels for the remaining examples based on the discovered features.