Search Results for author: Eric Horvitz

Found 59 papers, 15 papers with code

MedFuzz: Exploring the Robustness of Large Language Models in Medical Question Answering

no code implementations 3 Jun 2024 Robert Osazuwa Ness, Katie Matton, Hayden Helm, Sheng Zhang, Junaid Bajwa, Carey E. Priebe, Eric Horvitz

Medical question-answering benchmarks rely on assumptions consistent with quantifying LLM performance but that may not hold in the open world of the clinic.

Question Answering

The Rise of the AI Co-Pilot: Lessons for Design from Aviation and Beyond

no code implementations 16 Nov 2023 Abigail Sellen, Eric Horvitz

This calls for designs for human-AI partnership that cede ultimate control and responsibility to the human user as pilot, with the AI co-pilot acting in a well-defined supporting role.

Frontier AI Regulation: Managing Emerging Risks to Public Safety

no code implementations 6 Jul 2023 Markus Anderljung, Joslyn Barnhart, Anton Korinek, Jade Leung, Cullen O'Keefe, Jess Whittlestone, Shahar Avin, Miles Brundage, Justin Bullock, Duncan Cass-Beggs, Ben Chang, Tantum Collins, Tim Fist, Gillian Hadfield, Alan Hayes, Lewis Ho, Sara Hooker, Eric Horvitz, Noam Kolt, Jonas Schuett, Yonadav Shavit, Divya Siddarth, Robert Trager, Kevin Wolf

To address these challenges, at least three building blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models.

Accurate Measures of Vaccination and Concerns of Vaccine Holdouts from Web Search Logs

1 code implementation 12 Jun 2023 Serina Chang, Adam Fourney, Eric Horvitz

We find that holdouts, compared to early adopters matched on covariates, are 69% more likely to click on untrusted news sites.

When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming

1 code implementation 8 Jun 2023 Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz

Using data from 535 programmers, we perform a retrospective evaluation of CDHF and show that we can avoid displaying a significant fraction of suggestions that would have been rejected.

Recommendation Systems

Ideal Abstractions for Decision-Focused Learning

no code implementations 29 Mar 2023 Michael Poli, Stefano Massaroli, Stefano Ermon, Bryan Wilder, Eric Horvitz

We present a methodology for formulating simplifying abstractions in machine learning systems by identifying and harnessing the utility structure of decisions.

Decision Making, Management

Sparks of Artificial General Intelligence: Early experiments with GPT-4

2 code implementations 22 Mar 2023 Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models.

Arithmetic Reasoning, Math Word Problem Solving

Benchmarking Spatial Relationships in Text-to-Image Generation

1 code implementation 20 Dec 2022 Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image.

Benchmarking, Text-to-Image Generation

Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming

1 code implementation 25 Oct 2022 Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz

However, to fully realize their potential, we must understand how programmers interact with these systems and identify ways to improve that interaction.

Code Completion, Recommendation Systems

On the Horizon: Interactive and Compositional Deepfakes

no code implementations 5 Sep 2022 Eric Horvitz

Over a five-year period, computing methods for generating high-fidelity, fictional depictions of people and events moved from exotic demonstrations by computer science research teams into ongoing use as a tool of disinformation.

Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging

no code implementations 19 May 2022 Riccardo Fogliato, Shreya Chappidi, Matthew Lungren, Michael Fitzke, Mark Parkinson, Diane Wilson, Paul Fisher, Eric Horvitz, Kori Inkpen, Besmira Nushi

A critical aspect of interaction design for AI-assisted human decision making is the set of policies about the display and sequencing of AI inferences within larger decision-making workflows.

Decision Making

A Computational Inflection for Scientific Discovery

no code implementations 4 May 2022 Tom Hope, Doug Downey, Oren Etzioni, Daniel S. Weld, Eric Horvitz

We stand at the foot of a significant inflection in the trajectory of scientific discovery.


Imagined versus Remembered Stories: Quantifying Differences in Narrative Flow

no code implementations 7 Jan 2022 Maarten Sap, Anna Jafarpour, Yejin Choi, Noah A. Smith, James W. Pennebaker, Eric Horvitz

We quantify the differences between autobiographical and imagined stories by introducing sequentiality, a measure of narrative flow of events, drawing probabilistic inferences from a cutting-edge large language model (GPT-3).

Language Modelling, Large Language Model, +2

Ideal Partition of Resources for Metareasoning

no code implementations 18 Oct 2021 Eric Horvitz, John Breese

Thus, it is important to determine the portion of resources we wish to apply to metareasoning and control versus to the execution of a solution plan.

A Search Engine for Discovery of Scientific Challenges and Directions

1 code implementation NeurIPS Workshop AI4Scien 2021 Dan Lahav, Jon Saad Falcon, Bailey Kuehl, Sophie Johnson, Sravanthi Parasa, Noam Shomron, Duen Horng Chau, Diyi Yang, Eric Horvitz, Daniel S. Weld, Tom Hope

To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery.

Bursting Scientific Filter Bubbles: Boosting Innovation via Novel Author Discovery

no code implementations NeurIPS Workshop AI4Scien 2021 Jason Portenoy, Marissa Radensky, Jevin West, Eric Horvitz, Daniel Weld, Tom Hope

We also demonstrate an approach for displaying information about authors, boosting the ability to understand the work of new, unfamiliar scholars.

Platform for Situated Intelligence

1 code implementation 29 Mar 2021 Dan Bohus, Sean Andrist, Ashley Feniello, Nick Saw, Mihai Jalobeanu, Patrick Sweeney, Anne Loomis Thompson, Eric Horvitz

We introduce Platform for Situated Intelligence, an open-source framework created to support the rapid development and study of multimodal, integrative-AI systems.

Formation of Social Ties Influences Food Choice: A Campus-Wide Longitudinal Study

no code implementations 17 Feb 2021 Kristina Gligorić, Ryen W. White, Emre Kiciman, Eric Horvitz, Arnaud Chiolero, Robert West

To estimate causal effects from the passively observed log data, we control confounds in a matched quasi-experimental design: we identify focal users who at first do not have any regular eating partners but then start eating with a fixed partner regularly, and we match focal users into comparison pairs such that paired users are nearly identical with respect to covariates measured before acquiring the partner, where the two focal users' new eating partners diverge in the healthiness of their respective food choice.

Experimental Design, Nutrition

Exploiting structured data for learning contagious diseases under incomplete testing

no code implementations 1 Jan 2021 Maggie Makar, Lauren West, David Hooper, Eric Horvitz, Erica Shenoy, John Guttag

In this work we ask: can we build reliable infection prediction models when the observed data is collected under limited and biased testing that prioritizes testing symptomatic individuals?

Understanding Failures of Deep Networks via Robust Feature Extraction

1 code implementation CVPR 2021 Sahil Singla, Besmira Nushi, Shital Shah, Ece Kamar, Eric Horvitz

Traditional evaluation metrics for learned models that report aggregate scores over a test set are insufficient for surfacing important and informative patterns of failure over features and instances.

Extracting a Knowledge Base of Mechanisms from COVID-19 Papers

3 code implementations NAACL 2021 Tom Hope, Aida Amini, David Wadden, Madeleine van Zuylen, Sravanthi Parasa, Eric Horvitz, Daniel Weld, Roy Schwartz, Hannaneh Hajishirzi

The COVID-19 pandemic has spawned a diverse body of scientific literature that is challenging to navigate, stimulating interest in automated tools to help find useful knowledge.


Population-Scale Study of Human Needs During the COVID-19 Pandemic: Analysis and Implications

no code implementations 17 Aug 2020 Jina Suh, Eric Horvitz, Ryen W. White, Tim Althoff

Most work to date on mitigating the COVID-19 pandemic is focused urgently on biomedicine and epidemiology.


An Empirical Analysis of Backward Compatibility in Machine Learning Systems

no code implementations 11 Aug 2020 Megha Srivastava, Besmira Nushi, Ece Kamar, Shital Shah, Eric Horvitz

In many applications of machine learning (ML), updates are performed with the goal of enhancing model performance.

BIG-bench Machine Learning

SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search

no code implementations EMNLP 2020 Tom Hope, Jason Portenoy, Kishore Vasan, Jonathan Borchardt, Eric Horvitz, Daniel S. Weld, Marti A. Hearst, Jevin West

The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions.

Language Modelling

Learning to Complement Humans

no code implementations 1 May 2020 Bryan Wilder, Eric Horvitz, Ece Kamar

A rising vision for AI in the open world centers on the development of systems that can complement humans for perceptual, diagnostic, and reasoning tasks.

BIG-bench Machine Learning, Medical Diagnosis

Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork

no code implementations 27 Apr 2020 Gagan Bansal, Besmira Nushi, Ece Kamar, Eric Horvitz, Daniel S. Weld

To optimize the team performance for this setting we maximize the team's expected utility, expressed in terms of the quality of the final decision, cost of verifying, and individual accuracies of people and machines.

Decision Making

SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions

no code implementations CVPR 2020 Ramprasaath R. Selvaraju, Purva Tendulkar, Devi Parikh, Eric Horvitz, Marco Ribeiro, Besmira Nushi, Ece Kamar

We quantify the extent to which this phenomenon occurs by creating a new Reasoning split of the VQA dataset and collecting VQA-introspect, a new dataset which consists of 238K new perception questions which serve as sub questions corresponding to the set of perceptual tasks needed to effectively answer the complex reasoning questions in the Reasoning split.

Visual Question Answering (VQA)

Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting

2 code implementations NeurIPS 2019 Aditya Grover, Jiaming Song, Alekh Agarwal, Kenneth Tran, Ashish Kapoor, Eric Horvitz, Stefano Ermon

A standard technique to correct this bias is importance sampling, where samples from the model are weighted by the likelihood ratio under model and true distributions.

Data Augmentation

A Case for Backward Compatibility for Human-AI Teams

no code implementations 4 Jun 2019 Gagan Bansal, Besmira Nushi, Ece Kamar, Dan Weld, Walter Lasecki, Eric Horvitz

We introduce the notion of the compatibility of an AI update with prior user experience and present methods for studying the role of compatibility in human-AI teams.

Decision Making

Efficient Forward Architecture Search

2 code implementations NeurIPS 2019 Hanzhang Hu, John Langford, Rich Caruana, Saurajit Mukherjee, Eric Horvitz, Debadeepta Dey

We propose a neural architecture search (NAS) algorithm, Petridish, to iteratively add shortcut connections to existing network layers.

feature selection, Neural Architecture Search, +1

Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations

no code implementations 12 May 2019 Aditya Modi, Debadeepta Dey, Alekh Agarwal, Adith Swaminathan, Besmira Nushi, Sean Andrist, Eric Horvitz

We address the opportunity to maximize the utility of an overall computing system by employing reinforcement learning to guide the configuration of the set of interacting modules that comprise the system.

Decision Making, reinforcement-learning, +1

Bias Correction of Learned Generative Models via Likelihood-free Importance Weighting

no code implementations ICLR Workshop DeepGenStruct 2019 Aditya Grover, Jiaming Song, Ashish Kapoor, Kenneth Tran, Alekh Agarwal, Eric Horvitz, Stefano Ermon

A standard technique to correct this bias is by importance weighting samples from the model by the likelihood ratio under the model and true distributions.

Data Augmentation

Reverse-Engineering Satire, or "Paper on Computational Humor Accepted Despite Making Serious Advances"

1 code implementation 10 Jan 2019 Robert West, Eric Horvitz

Starting from the observation that satirical news headlines tend to resemble serious news headlines, we build and analyze a corpus of satirical headlines paired with nearly identical but serious headlines.

Humor Detection, Sentence

Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure

no code implementations 19 Sep 2018 Besmira Nushi, Ece Kamar, Eric Horvitz

We present Pandora, a set of hybrid human-machine methods and tools for describing and explaining system failures.

BIG-bench Machine Learning, Image Captioning

Discovering Blind Spots in Reinforcement Learning

no code implementations 23 May 2018 Ramya Ramakrishnan, Ece Kamar, Debadeepta Dey, Julie Shah, Eric Horvitz

Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments.

reinforcement-learning, Reinforcement Learning (RL)

Active Learning amidst Logical Knowledge

1 code implementation 26 Sep 2017 Emmanouil Antonios Platanios, Ashish Kapoor, Eric Horvitz

Structured prediction is ubiquitous in applications of machine learning such as knowledge extraction and natural language processing.

Active Learning, BIG-bench Machine Learning, +1

Identifying Unknown Unknowns in the Open World: Representations and Policies for Guided Exploration

no code implementations 28 Oct 2016 Himabindu Lakkaraju, Ece Kamar, Rich Caruana, Eric Horvitz

Predictive models deployed in the real world may assign incorrect labels to instances with high confidence.

Long-Term Trends in the Public Perception of Artificial Intelligence

no code implementations 16 Sep 2016 Ethan Fast, Eric Horvitz

We find that discussion of AI has increased sharply since 2009, and that these discussions have been consistently more optimistic than pessimistic.

Identifying Dogmatism in Social Media: Signals and Models

no code implementations EMNLP 2016 Ethan Fast, Eric Horvitz

When we use our predictive model to analyze millions of other Reddit posts, we find evidence that suggests dogmatism is a deeper personality trait, present for dogmatic users across many different domains, and that users who engage on dogmatic comments tend to show increases in dogmatic posts themselves.

Learning to Hire Teams

no code implementations 12 Aug 2015 Adish Singla, Eric Horvitz, Pushmeet Kohli, Andreas Krause

Furthermore, we consider an embedding of the tasks and workers in an underlying graph that may arise from task similarities or social ties, and that can provide additional side-observations for faster learning.

Metareasoning for Planning Under Uncertainty

no code implementations 3 May 2015 Christopher H. Lin, Andrey Kolobov, Ece Kamar, Eric Horvitz

Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking.

Information Gathering in Networks via Active Exploration

no code implementations 24 Apr 2015 Adish Singla, Eric Horvitz, Pushmeet Kohli, Ryen White, Andreas Krause

How should we gather information in a network, where each node's visibility is limited to its local neighborhood?

Experimental Design, Informativeness, +1

Inferring and Learning from Neuronal Correspondences

no code implementations 23 Jan 2015 Ashish Kapoor, E. Paxon Frady, Stefanie Jegelka, William B. Kristan, Eric Horvitz

We introduce and study methods for inferring and learning from correspondences among neurons.

Decision Making

Stochastic Privacy

no code implementations 22 Apr 2014 Adish Singla, Eric Horvitz, Ece Kamar, Ryen White

Users may be willing to share private information in return for better quality of service or for incentives, or in return for assurances about the nature and extent of the logging of data.

A Utility-Theoretic Approach to Privacy in Online Services

no code implementations 16 Jan 2014 Andreas Krause, Eric Horvitz

We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service.

Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (1996)

no code implementations 13 Apr 2013 Eric Horvitz, Finn Jensen

This is the Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence, which was held in Portland, OR, August 1-4, 1996.

Patient Risk Stratification for Hospital-Associated C. diff as a Time-Series Classification Task

no code implementations NeurIPS 2012 Jenna Wiens, Eric Horvitz, John V. Guttag

A patient's risk for adverse events is affected by temporal processes including the nature and timing of diagnostic and therapeutic activities, and the overall evolution of the patient's pathophysiology over time.

Classification, General Classification, +3
