Search Results for author: Eric Zelikman

Found 16 papers, 10 papers with code

Generating and Evaluating Tests for K-12 Students with Language Model Simulations: A Case Study on Sentence Reading Efficiency

no code implementations • 10 Oct 2023 • Eric Zelikman, Wanjing Anya Ma, Jasmine E. Tran, Diyi Yang, Jason D. Yeatman, Nick Haber

Developing an educational test can be expensive and time-consuming, as each item must be written by experts and then evaluated by collecting hundreds of student responses.

Language Modelling Sentence +1

ContextRef: Evaluating Referenceless Metrics For Image Description Generation

1 code implementation • 21 Sep 2023 • Elisa Kreiss, Eric Zelikman, Christopher Potts, Nick Haber

None of the methods is successful with ContextRef, but we show that careful fine-tuning yields substantial improvements.

Hypothesis Search: Inductive Reasoning with Language Models

no code implementations • 11 Sep 2023 • Ruocheng Wang, Eric Zelikman, Gabriel Poesia, Yewen Pu, Nick Haber, Noah D. Goodman

Because of the prohibitive cost of generation with state-of-the-art LLMs, we consider a middle step to filter the set of hypotheses that will be implemented into programs: we either ask the LLM to summarize into a smaller set of hypotheses, or ask human annotators to select a subset of the hypotheses.

In-Context Learning

SkyGPT: Probabilistic Short-term Solar Forecasting Using Synthetic Sky Videos from Physics-constrained VideoGPT

1 code implementation • 20 Jun 2023 • Yuhao Nie, Eric Zelikman, Andea Scott, Quentin Paletta, Adam Brandt

Furthermore, we feed the generated future sky images from the video prediction models for 15-minute-ahead probabilistic solar forecasting for a 30-kW roof-top PV system, and compare it with an end-to-end deep learning baseline model SUNSET and a smart persistence model.

Video Prediction

Certified Deductive Reasoning with Language Models

no code implementations • 6 Jun 2023 • Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman

In experiments on PrOntoQA, ProofWriter and Syllogism Validity datasets, LogicGuide significantly improves the performance of GPT-3, GPT-3.5 Turbo and LLaMA (accuracy gains up to 35%), while drastically reducing content effects, the interference between unwanted prior assumptions and reasoning, which humans and language models suffer from.

Logical Reasoning valid

Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions

1 code implementation • 20 Dec 2022 • Eric Zelikman, Qian Huang, Gabriel Poesia, Noah D. Goodman, Nick Haber

Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs.

Automated Theorem Proving Code Generation +4

STaR: Bootstrapping Reasoning With Reasoning

1 code implementation • 28 Mar 2022 • Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman

We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30× larger state-of-the-art language model on CommonsenseQA.

Common Sense Reasoning Language Modelling +1

Evaluating the Disentanglement of Deep Generative Models through Manifold Topology

1 code implementation • ICLR 2021 • Sharon Zhou, Eric Zelikman, Fred Lu, Andrew Y. Ng, Gunnar Carlsson, Stefano Ermon

Learning disentangled representations is regarded as a fundamental task for improving the generalization, robustness, and interpretability of generative models.


CRUDE: Calibrating Regression Uncertainty Distributions Empirically

no code implementations • 26 May 2020 • Eric Zelikman, Christopher Healy, Sharon Zhou, Anand Avati

Calibrated uncertainty estimates in machine learning are crucial to many fields such as autonomous vehicles, medicine, and weather and climate forecasting.

Autonomous Vehicles General Classification +1

Learning as Reinforcement: Applying Principles of Neuroscience for More General Reinforcement Learning Agents

no code implementations • 20 Apr 2020 • Eric Zelikman, William Yin, Kenneth Wang

A significant challenge in developing AI that can generalize well is designing agents that learn about their world without being told what to learn, and apply that learning to challenges with sparse rewards.

Decision Making General Reinforcement Learning +2

Contextual Salience for Fast and Accurate Sentence Vectors

1 code implementation • 22 Mar 2018 • Eric Zelikman, Richard Socher

We introduce contextual salience (CoSal), a measure of word importance that uses the distribution of context vectors to normalize distances and weights.

Document Summarization General Classification +4
