Search Results for author: Mert Yuksekgonul

Found 15 papers, 11 papers with code

How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis

1 code implementation • 8 Feb 2024 • Federico Bianchi, Patrick John Chia, Mert Yuksekgonul, Jacopo Tagliabue, Dan Jurafsky, James Zou

We develop NegotiationArena: a flexible framework for evaluating and probing the negotiation abilities of LLM agents.

ChatGPT Exhibits Gender and Racial Biases in Acute Coronary Syndrome Management

no code implementations • 10 Nov 2023 • Angela Zhang, Mert Yuksekgonul, Joshua Guild, James Zou, Joseph C. Wu

One early application has been to medicine, where LLMs have been investigated to streamline clinical workflows and facilitate clinical analysis and decision-making.

Decision Making • Management

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

1 code implementation • 24 Oct 2023 • Marah I Abdin, Suriya Gunasekar, Varun Chandrasekaran, Jerry Li, Mert Yuksekgonul, Rahee Ghosh Peshawaria, Ranjita Naik, Besmira Nushi

Motivated by rising concerns around factual incorrectness and hallucinations of LLMs, we present KITAB, a new dataset for measuring constraint satisfaction abilities of language models.

Information Retrieval • Retrieval

Diversity of Thought Improves Reasoning Abilities of LLMs

no code implementations • 11 Oct 2023 • Ranjita Naik, Varun Chandrasekaran, Mert Yuksekgonul, Hamid Palangi, Besmira Nushi

Large language models (LLMs) are documented to struggle in settings that require complex reasoning.

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

1 code implementation • 26 Sep 2023 • Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi

We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text.

Discover and Cure: Concept-aware Mitigation of Spurious Correlation

1 code implementation • 1 May 2023 • Shirley Wu, Mert Yuksekgonul, Linjun Zhang, James Zou

Deep neural networks often rely on spurious correlations to make predictions, which hinders generalization beyond training environments.

Lesion Classification • Object Recognition +1

GPT detectors are biased against non-native English writers

2 code implementations • 6 Apr 2023 • Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou

In this study, we evaluate the performance of several widely used GPT detectors using writing samples from native and non-native English writers.

Fairness

SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis

no code implementations • 1 Feb 2023 • Roxana Daneshjou, Mert Yuksekgonul, Zhuo Ran Cai, Roberto Novoa, James Zou

To provide a medical dataset densely annotated by domain experts with annotations useful across multiple disease processes, we developed SkinCon: a skin disease dataset densely annotated by dermatologists.

Interpretable Machine Learning

When and why vision-language models behave like bags-of-words, and what to do about it?

1 code implementation • 4 Oct 2022 • Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou

ARO consists of Visual Genome Attribution, to test the understanding of objects' properties; Visual Genome Relation, to test for relational understanding; and COCO & Flickr30k-Order, to test for order sensitivity.

Contrastive Learning • Retrieval +1

Post-hoc Concept Bottleneck Models

no code implementations • 31 May 2022 • Mert Yuksekgonul, Maggie Wang, James Zou

When concept annotations are not available on the training data, we show that PCBM can transfer concepts from other datasets or from natural language descriptions of concepts via multimodal models.

Model Editing

Meaningfully Debugging Model Mistakes using Conceptual Counterfactual Explanations

1 code implementation • 24 Jun 2021 • Abubakar Abid, Mert Yuksekgonul, James Zou

Understanding and explaining the mistakes made by trained models is critical to many machine learning objectives, such as improving robustness, addressing concept drift, and mitigating biases.

Counterfactual

Learning Maximally Predictive Prototypes in Multiple Instance Learning

1 code implementation • 2 Oct 2019 • Mert Yuksekgonul, Ozgur Emre Sivrikaya, Mustafa Gokce Baydogan

In this work, we propose a simple model that provides a permutation-invariant, maximally predictive prototype generator from a given dataset, leading to an interpretable solution and concrete insights into the nature and solution of a problem.

Multiple Instance Learning
