Search Results for author: Osbert Bastani

Found 71 papers, 35 papers with code

Zero-Shot Learning Through Cross-Modal Transfer

2 code implementations • NeurIPS 2013 • Richard Socher, Milind Ganjoo, Hamsa Sridhar, Osbert Bastani, Christopher D. Manning, Andrew Y. Ng

This work introduces a model that can recognize objects in images even if no training data is available for the objects.

Outlier Detection Zero-Shot Learning

Paper
Code

Measuring Neural Net Robustness with Constraints

1 code implementation • NeurIPS 2016 • Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, Antonio Criminisi

Despite having high accuracy, neural nets have been shown to be susceptible to adversarial examples, where a small perturbation to an input can cause it to become mislabeled.

Paper
Code

Synthesizing Program Input Grammars

1 code implementation • 5 Aug 2016 • Osbert Bastani, Rahul Sharma, Alex Aiken, Percy Liang

We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and blackbox access to the program.

Programming Languages

Paper
Code

Interpreting Blackbox Models via Model Extraction

no code implementations • 23 May 2017 • Osbert Bastani, Carolyn Kim, Hamsa Bastani

Interpretability has become incredibly important as machine learning is increasingly used to inform consequential decisions.

Model extraction

Paper
Add Code

Interpretability via Model Extraction

no code implementations • 29 Jun 2017 • Osbert Bastani, Carolyn Kim, Hamsa Bastani

The ability to interpret machine learning models has become increasingly important now that machine learning is used to inform consequential decisions.

BIG-bench Machine Learning Model extraction +2

Paper
Add Code

Verifiable Reinforcement Learning via Policy Extraction

1 code implementation • NeurIPS 2018 • Osbert Bastani, Yewen Pu, Armando Solar-Lezama

While deep reinforcement learning has successfully solved many challenging control tasks, its real-world applicability has been limited by the inability to ensure the safety of learned policies.

Imitation Learning Model Compression +2

Paper
Code

Probabilistic Verification of Fairness Properties via Concentration

1 code implementation • 2 Dec 2018 • Osbert Bastani, Xin Zhang, Armando Solar-Lezama

As machine learning systems are increasingly used to make real world legal and financial decisions, it is of paramount importance that we develop algorithms to verify that these systems do not discriminate against minorities.

BIG-bench Machine Learning Fairness

Paper
Code

Algorithms for Fairness in Sequential Decision Making

1 code implementation • 24 Jan 2019 • Min Wen, Osbert Bastani, Ufuk Topcu

It has recently been shown that if feedback effects of decisions are ignored, then imposing fairness constraints such as demographic parity or equality of opportunity can actually exacerbate unfairness.

Decision Making Fairness

Paper
Code

Learning Interpretable Models with Causal Guarantees

no code implementations • 24 Jan 2019 • Carolyn Kim, Osbert Bastani

We propose a framework for learning interpretable models from observational data that can be used to predict individual treatment effects (ITEs).

BIG-bench Machine Learning Decision Making

Paper
Add Code

Learning Neurosymbolic Generative Models via Program Synthesis

no code implementations • ICLR Workshop drlStructPred 2019 • Halley Young, Osbert Bastani, Mayur Naik

Significant strides have been made toward designing better generative models in recent years.

Program Synthesis

Paper
Add Code

Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems

no code implementations • 24 Jan 2019 • Osbert Bastani

Reinforcement learning is a promising approach to learning robotics controllers.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Safe Reinforcement Learning with Nonlinear Dynamics via Model Predictive Shielding

1 code implementation • 25 May 2019 • Osbert Bastani

Reinforcement learning is a promising approach to synthesizing policies for challenging robotics tasks.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

Robust Model Predictive Shielding for Safe Reinforcement Learning with Stochastic Dynamics

no code implementations • 24 Oct 2019 • Shuo Li, Osbert Bastani

We build on the idea of model predictive shielding (MPS), where a backup controller is used to override the learned policy as needed to ensure safety.

Learning Theory reinforcement-learning +2

Paper
Add Code

MAMPS: Safe Multi-Agent Reinforcement Learning via Model Predictive Shielding

no code implementations • 25 Oct 2019 • Wenbo Zhang, Osbert Bastani, Vijay Kumar

Reinforcement learning is a promising approach to learning control policies for performing complex multi-agent robotics tasks.

Multi-agent Reinforcement Learning reinforcement-learning +1

Paper
Add Code

"How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations

no code implementations • 15 Nov 2019 • Himabindu Lakkaraju, Osbert Bastani

Our work is the first to empirically establish how user trust in black box models can be manipulated via misleading explanations.

Paper
Add Code

PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction

1 code implementation • ICLR 2020 • Sangdon Park, Osbert Bastani, Nikolai Matni, Insup Lee

We propose an algorithm combining calibrated prediction and generalization bounds from learning theory to construct confidence sets for deep neural networks with PAC guarantees---i. e., the confidence set for a given input contains the true label with high probability.

Generalization Bounds Learning Theory +3

Paper
Code

Calibrated Prediction with Covariate Shift via Unsupervised Domain Adaptation

no code implementations • 29 Feb 2020 • Sangdon Park, Osbert Bastani, James Weimer, Insup Lee

Our algorithm uses importance weighting to correct for the shift from the training to the real-world distribution.

Unsupervised Domain Adaptation

Paper
Add Code

Synthesizing Programmatic Policies that Inductively Generalize

no code implementations • ICLR 2020 • Jeevana Priya Inala, Osbert Bastani, Zenna Tavares, Armando Solar-Lezama

We show that our algorithm can be used to learn policies that inductively generalize to novel environments, whereas traditional neural network policies fail to do so.

Imitation Learning Reinforcement Learning (RL)

Paper
Add Code

A Composable Specification Language for Reinforcement Learning Tasks

1 code implementation • NeurIPS 2019 • Kishor Jothimurugan, Rajeev Alur, Osbert Bastani

Reinforcement learning is a promising approach for learning control policies for robot tasks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Abstract Value Iteration for Hierarchical Reinforcement Learning

no code implementations • 29 Oct 2020 • Kishor Jothimurugan, Osbert Bastani, Rajeev Alur

We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces.

Hierarchical Reinforcement Learning reinforcement-learning +1

Paper
Add Code

PAC Confidence Predictions for Deep Neural Network Classifiers

no code implementations • ICLR 2021 • Sangdon Park, Shuo Li, Insup Lee, Osbert Bastani

In our experiments, we demonstrate that our approach can be used to provide guarantees for state-of-the-art DNNs.

Paper
Add Code

Robust and Stable Black Box Explanations

no code implementations • 12 Nov 2020 • Himabindu Lakkaraju, Nino Arsov, Osbert Bastani

To the best of our knowledge, this work makes the first attempt at generating post hoc explanations that are robust to a general class of adversarial perturbations that are of practical interest.

Paper
Add Code

Learning Models for Actionable Recourse

1 code implementation • NeurIPS 2021 • Alexis Ross, Himabindu Lakkaraju, Osbert Bastani

As machine learning models are increasingly deployed in high-stakes domains such as legal and financial decision-making, there has been growing interest in post-hoc methods for generating counterfactual explanations.

counterfactual Decision Making

Paper
Code

Likelihood-Based Diverse Sampling for Trajectory Forecasting

1 code implementation • ICCV 2021 • Yecheng Jason Ma, Jeevana Priya Inala, Dinesh Jayaraman, Osbert Bastani

We propose Likelihood-Based Diverse Sampling (LDS), a method for improving the quality and the diversity of trajectory samples from a pre-trained flow model.

Trajectory Forecasting

Paper
Code

Neurosymbolic Deep Generative Models for Sequence Data with Relational Constraints

no code implementations • 1 Jan 2021 • Halley Young, Maxwell Du, Osbert Bastani

We propose a novel approach for incorporating structure in the form of relational constraints between different subcomponents of an example (e. g., lines of a poem or measures of music).

Program Synthesis

Paper
Add Code

Neurosymbolic Transformers for Multi-Agent Communication

1 code implementation • NeurIPS 2020 • Jeevana Priya Inala, Yichen Yang, James Paulos, Yewen Pu, Osbert Bastani, Vijay Kumar, Martin Rinard, Armando Solar-Lezama

We study the problem of inferring communication structures that can solve cooperative multi-agent planning problems while minimizing the amount of communication.

Paper
Code

Program Synthesis Guided Reinforcement Learning for Partially Observed Environments

1 code implementation • NeurIPS 2021 • Yichen David Yang, Jeevana Priya Inala, Osbert Bastani, Yewen Pu, Armando Solar-Lezama, Martin Rinard

Our results demonstrate that our approach can obtain the benefits of program-guided reinforcement learning without requiring the user to provide a new guiding program for every new task.

Program Synthesis reinforcement-learning +1

Paper
Code

Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

no code implementations • 18 Apr 2021 • Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani

However, learning word embeddings from new domains with limited training data can be challenging, because the meaning/usage may be different in the new domain, e. g., the word ``positive'' typically has positive sentiment, but often has negative sentiment in medical notes since it may imply that a patient tested positive for a disease.

Generalization Bounds Learning Word Embeddings +1

Paper
Add Code

PAC Prediction Sets Under Covariate Shift

1 code implementation • ICLR 2022 • Sangdon Park, Edgar Dobriban, Insup Lee, Osbert Bastani

Our approach focuses on the setting where there is a covariate shift from the source distribution (where we have labeled training examples) to the target distribution (for which we want to quantify uncertainty).

Uncertainty Quantification

Paper
Code

Compositional Reinforcement Learning from Logical Specifications

1 code implementation • NeurIPS 2021 • Kishor Jothimurugan, Suguman Bansal, Osbert Bastani, Rajeev Alur

Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Conservative Offline Distributional Reinforcement Learning

1 code implementation • NeurIPS 2021 • Yecheng Jason Ma, Dinesh Jayaraman, Osbert Bastani

We prove that CODAC learns a conservative return distribution -- in particular, for finite MDPs, CODAC converges to an uniform lower bound on the quantiles of the return distribution; our proof relies on a novel analysis of the distributional Bellman operator.

D4RL Distributional Reinforcement Learning +4

Paper
Code

Improving Human Sequential Decision-Making with Reinforcement Learning

no code implementations • 19 Aug 2021 • Hamsa Bastani, Osbert Bastani, Wichinpong Park Sinchaisri

Workers spend a significant amount of time learning how to make good decisions.

BIG-bench Machine Learning Decision Making +1

Paper
Add Code

Robust Generalization of Quadratic Neural Networks via Function Identification

no code implementations • 22 Sep 2021 • Kan Xu, Hamsa Bastani, Osbert Bastani

We study this problem from the perspective of the statistical concept of parameter identification.

Generalization Bounds Learning Theory +2

Paper
Add Code

Sequential Covariate Shift Detection Using Classifier Two-Sample Tests

no code implementations • 29 Sep 2021 • Sooyong Jang, Sangdon Park, Insup Lee, Osbert Bastani

This problem can naturally be solved using a two-sample test--- i. e., test whether the current test distribution of covariates equals the training distribution of covariates.

Vocal Bursts Valence Prediction

Paper
Add Code

PAC Synthesis of Machine Learning Programs

no code implementations • NeurIPS Workshop AIPLANS 2021 • Osbert Bastani

We study the problem of synthesizing programs that include machine learning components such as deep neural networks (DNNs).

BIG-bench Machine Learning Image Classification +1

Paper
Add Code

Synthesizing Video Trajectory Queries

no code implementations • NeurIPS Workshop AIPLANS 2021 • Stephen Mell, Favyen Bastani, Stephan Zdancewic, Osbert Bastani

A key challenge is that queries are difficult for end users to develop: queries must reason about complex spatial and temporal patterns in object trajectories in order to select trajectories of interest, and predicates often include real-valued parameters (e. g., whether two cars are within a certain distance) that can be tedious to manually tune.

Active Learning Object Tracking

Paper
Add Code

Synthesizing Machine Learning Programs with PAC Guarantees via Statistical Sketching

no code implementations • 11 Oct 2021 • Osbert Bastani

We study the problem of synthesizing programs that include machine learning components such as deep neural networks (DNNs).

BIG-bench Machine Learning Classification +2

Paper
Add Code

Uniformly Conservative Exploration in Reinforcement Learning

1 code implementation • 25 Oct 2021 • Wanqiao Xu, Jason Yecheng Ma, Kan Xu, Hamsa Bastani, Osbert Bastani

A key challenge to deploying reinforcement learning in practice is avoiding excessive (harmful) exploration in individual episodes.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

1 code implementation • 14 Dec 2021 • Yecheng Jason Ma, Andrew Shen, Osbert Bastani, Dinesh Jayaraman

Further, CAP adaptively tunes this penalty during training using true cost feedback from the environment.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching

2 code implementations • 4 Feb 2022 • Yecheng Jason Ma, Andrew Shen, Dinesh Jayaraman, Osbert Bastani

We propose State Matching Offline DIstribution Correction Estimation (SMODICE), a novel and versatile regression-based offline imitation learning (IL) algorithm derived via state-occupancy matching.

Imitation Learning Reinforcement Learning (RL)

Paper
Code

Understanding Robust Generalization in Learning Regular Languages

no code implementations • 20 Feb 2022 • Soham Dan, Osbert Bastani, Dan Roth

Currently, deep neural networks struggle to generalize robustly to such shifts in the data distribution.

Paper
Add Code

Exploring with Sticky Mittens: Reinforcement Learning with Expert Interventions via Option Templates

1 code implementation • 25 Feb 2022 • Souradeep Dutta, Kaustubh Sridhar, Osbert Bastani, Edgar Dobriban, James Weimer, Insup Lee, Julia Parish-Morris

We formulate expert intervention as allowing the agent to execute option templates before learning an implementation.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Towards PAC Multi-Object Detection and Tracking

no code implementations • 15 Apr 2022 • Shuo Li, Sangdon Park, Xiayan Ji, Insup Lee, Osbert Bastani

Accurately detecting and tracking multi-objects is important for safety-critical applications such as autonomous navigation.

Autonomous Navigation Conformal Prediction +3

Paper
Add Code

Counterfactual Explanations for Natural Language Interfaces

1 code implementation • ACL 2022 • George Tolkachev, Stephen Mell, Steve Zdancewic, Osbert Bastani

A key challenge facing natural language interfaces is enabling users to understand the capabilities of the underlying system.

counterfactual Semantic Parsing

Paper
Code

Practical Adversarial Multivalid Conformal Prediction

1 code implementation • 2 Jun 2022 • Osbert Bastani, Varun Gupta, Christopher Jung, Georgy Noarov, Ramya Ramalingam, Aaron Roth

It is computationally lightweight -- comparable to split conformal prediction -- but does not require having a held-out validation set, and so all data can be used for training models from which to derive a conformal score.

Conformal Prediction

Paper
Code

Specification-Guided Learning of Nash Equilibria with High Social Welfare

no code implementations • 6 Jun 2022 • Kishor Jothimurugan, Suguman Bansal, Osbert Bastani, Rajeev Alur

Our empirical evaluation demonstrates that our algorithm computes equilibrium policies with high social welfare, whereas state-of-the-art baselines either fail to compute Nash equilibria or compute ones with comparatively lower social welfare.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via $f$-Advantage Regression

1 code implementation • 7 Jun 2022 • Yecheng Jason Ma, Jason Yan, Dinesh Jayaraman, Osbert Bastani

Offline goal-conditioned reinforcement learning (GCRL) promises general-purpose skill learning in the form of reaching diverse goals from purely offline datasets.

regression reinforcement-learning +1

Paper
Code

PAC Prediction Sets for Meta-Learning

no code implementations • 6 Jul 2022 • Sangdon Park, Edgar Dobriban, Insup Lee, Osbert Bastani

Uncertainty quantification is a key component of machine learning models targeted at safety-critical systems such as in healthcare or autonomous vehicles.

Autonomous Vehicles Meta-Learning +1

Paper
Add Code

VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training

1 code implementation • 30 Sep 2022 • Yecheng Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar, Amy Zhang

Given the inherent cost and scarcity of in-domain, task-specific robot data, learning from large, diverse, offline human videos has emerged as a promising path towards acquiring a generally useful visual representation for control; however, how these human videos can be used for general-purpose reward learning remains an open question.

Offline RL Open-Ended Question Answering +2

119

Paper
Code

Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms

no code implementations • 11 Nov 2022 • Vashist Avadhanula, Omar Abdul Baki, Hamsa Bastani, Osbert Bastani, Caner Gocmen, Daniel Haimovich, Darren Hwang, Dima Karamshuk, Thomas Leeper, Jiayuan Ma, Gregory Macnamara, Jake Mullett, Christopher Palow, Sung Park, Varun S Rajagopal, Kevin Schaeffer, Parikshit Shah, Deeksha Sinha, Nicolas Stier-Moses, Peng Xu

We describe the current content moderation strategy employed by Meta to remove policy-violating content from its platforms.

Paper
Add Code

Decision-Aware Learning for Optimizing Health Supply Chains

no code implementations • 15 Nov 2022 • Tsai-Hsuan Chung, Vahid Rostami, Hamsa Bastani, Osbert Bastani

We apply our framework to optimize the distribution of essential medicines in collaboration with policymakers in Sierra Leone; highly uncertain demand and limited budgets currently result in excessive unmet demand.

Paper
Add Code

ACon$^2$: Adaptive Conformal Consensus for Provable Blockchain Oracles

1 code implementation • 17 Nov 2022 • Sangdon Park, Osbert Bastani, Taesoo Kim

To address the oracle problem, we propose an adaptive conformal consensus (ACon$^2$) algorithm that derives a consensus set of data from multiple oracle contracts via the recent advance in online uncertainty quantification learning.

Uncertainty Quantification

Paper
Code

Angelic Patches for Improving Third-Party Object Detector Performance

1 code implementation • CVPR 2023 • Wenwen Si, Shuo Li, Sangdon Park, Insup Lee, Osbert Bastani

Experiments demonstrate the efficacy of the partial-covering patch in solving the complex bounding box problem.

Adversarial Attack Object +2

Paper
Code

SPARLING: Learning Latent Representations with Extremely Sparse Activations

no code implementations • 3 Feb 2023 • Kavi Gupta, Osbert Bastani, Armando Solar-Lezama

Real-world processes often contain intermediate state that can be modeled as an extremely sparse tensor.

Optical Character Recognition (OCR)

Paper
Add Code

Robust Subtask Learning for Compositional Generalization

1 code implementation • 6 Feb 2023 • Kishor Jothimurugan, Steve Hsu, Osbert Bastani, Rajeev Alur

We formulate the problem as a two agent zero-sum game in which the adversary picks the sequence of subtasks.

Paper
Code

Learning Performance-Improving Code Edits

2 code implementations • 15 Feb 2023 • Alexander Shypula, Aman Madaan, Yimeng Zeng, Uri Alon, Jacob Gardner, Milad Hashemi, Graham Neubig, Parthasarathy Ranganathan, Osbert Bastani, Amir Yazdanbakhsh

Next, we propose a broad range of adaptation strategies for code optimization; for prompting, these include retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play.

Code Generation Code Repair +2

Paper
Code

PAC Prediction Sets for Large Language Models of Code

1 code implementation • 17 Feb 2023 • Adam Khakhar, Stephen Mell, Osbert Bastani

Given a trained code generation model, our algorithm leverages a programming language's abstract syntax tree to generate a set of programs such that the correct program is in the set with high-confidence.

Code Generation Semantic Parsing +1

Paper
Code

TOM: Learning Policy-Aware Models for Model-Based Reinforcement Learning via Transition Occupancy Matching

no code implementations • 22 May 2023 • Yecheng Jason Ma, Kausik Sivakumar, Jason Yan, Osbert Bastani, Dinesh Jayaraman

Standard model-based reinforcement learning (MBRL) approaches fit a transition model of the environment to all past experience, but this wastes model capacity on data that is irrelevant for policy improvement.

Model-based Reinforcement Learning reinforcement-learning

Paper
Add Code

Inverse Protein Folding Using Deep Bayesian Optimization

no code implementations • 25 May 2023 • Natalie Maus, Yimeng Zeng, Daniel Allen Anderson, Phillip Maffettone, Aaron Solomon, Peyton Greenside, Osbert Bastani, Jacob R. Gardner

Furthermore, it is challenging to adapt pure generative approaches to other settings, e. g., when constraints exist.

Bayesian Optimization Protein Folding

Paper
Add Code

Policy Synthesis and Reinforcement Learning for Discounted LTL

no code implementations • 26 May 2023 • Rajeev Alur, Osbert Bastani, Kishor Jothimurugan, Mateo Perez, Fabio Somenzi, Ashutosh Trivedi

The difficulty of manually specifying reward functions has led to an interest in using linear temporal logic (LTL) to express objectives for reinforcement learning (RL).

PAC learning reinforcement-learning +1

Paper
Add Code

LIV: Language-Image Representations and Rewards for Robotic Control

1 code implementation • 1 Jun 2023 • Yecheng Jason Ma, William Liang, Vaidehi Som, Vikash Kumar, Amy Zhang, Osbert Bastani, Dinesh Jayaraman

We present Language-Image Value learning (LIV), a unified objective for vision-language representation and reward learning from action-free videos with text annotations.

Contrastive Learning Imitation Learning

Paper
Code

TRAQ: Trustworthy Retrieval Augmented Question Answering via Conformal Prediction

1 code implementation • 7 Jul 2023 • Shuo Li, Sangdon Park, Insup Lee, Osbert Bastani

To address this challenge, we propose the Trustworthy Retrieval Augmented Question Answering, or $\textit{TRAQ}$, which provides the first end-to-end statistical correctness guarantee for RAG.

Bayesian Optimization Chatbot +4

Paper
Code

Rethinking Fairness for Human-AI Collaboration

no code implementations • 5 Oct 2023 • Haosen Ge, Hamsa Bastani, Osbert Bastani

However, we show that it may be infeasible to design algorithmic recommendations that are simultaneously fair in isolation, compliance-robustly fair, and more accurate than the human policy; thus, if our goal is to improve the equity and accuracy of human-AI collaboration, it may not be desirable to enforce traditional fairness constraints.

Fairness

Paper
Add Code

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

no code implementations • 12 Oct 2023 • Zichen Zhang, Yunshuang Li, Osbert Bastani, Abhishek Gupta, Dinesh Jayaraman, Yecheng Jason Ma, Luca Weihs

Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks.

reinforcement-learning

Paper
Add Code

PAC Prediction Sets Under Label Shift

1 code implementation • 19 Oct 2023 • Wenwen Si, Sangdon Park, Insup Lee, Edgar Dobriban, Osbert Bastani

We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.

Uncertainty Quantification

Paper
Code

Eureka: Human-Level Reward Design via Coding Large Language Models

1 code implementation • 19 Oct 2023 • Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar

The generality of Eureka also enables a new gradient-free in-context learning approach to reinforcement learning from human feedback (RLHF), readily incorporating human inputs to improve the quality and the safety of the generated rewards without model updating.

Decision Making In-Context Learning +1

2,590

Paper
Code

Generative Adversarial Bayesian Optimization for Surrogate Objectives

1 code implementation • 9 Feb 2024 • Michael S. Yao, Yimeng Zeng, Hamsa Bastani, Jacob Gardner, James C. Gee, Osbert Bastani

To address this limitation, we propose generative adversarial Bayesian optimization (GABO) using adaptive source critic regularization, a task-agnostic framework for Bayesian optimization that employs a Lipschitz-bounded source critic model to constrain the optimization trajectory to regions where the surrogate function is reliable.

Bayesian Optimization

Paper
Code

Uncertainty in Language Models: Assessment through Rank-Calibration

no code implementations • 4 Apr 2024 • Xinmeng Huang, Shuo Li, Mengxin Yu, Matteo Sesia, Hamed Hassani, Insup Lee, Osbert Bastani, Edgar Dobriban

Language Models (LMs) have shown promising performance in natural language generation.

Text Generation

Paper
Add Code

Generating Programmatic Referring Expressions via Program Synthesis

1 code implementation • ICML 2020 • Jiani Huang, Calvin Smith, Osbert Bastani, Rishabh Singh, Aws Albarghouthi, Mayur Naik

The policy neural network employs a program interpreter that provides immediate feedback on the consequences of the decisions made by the policy, and also takes into account the uncertainty in the symbolic representation of the image.

Enumerative Search Logical Reasoning

Paper
Code

Robust Black Box Explanations Under Distribution Shift

no code implementations • ICML 2020 • Himabindu Lakkaraju, Nino Arsov, Osbert Bastani

As machine learning black boxes are increasingly being deployed in real-world applications, there has been a growing interest in developing post hoc explanations that summarize the behaviors of these black box models.

Paper
Add Code

Few-Shot Novel Concept Learning for Semantic Parsing

no code implementations • Findings (EMNLP) 2021 • Soham Dan, Osbert Bastani, Dan Roth

This way the concept learning problem is naturally a program synthesis problem and our algorithm learns from a few examples to synthesize a program representing the novel concept.

Novel Concepts Program Synthesis +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.