Search Results for author: Mert Yuksekgonul

Found 15 papers, 11 papers with code

How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis

1 code implementation • 8 Feb 2024 • Federico Bianchi, Patrick John Chia, Mert Yuksekgonul, Jacopo Tagliabue, Dan Jurafsky, James Zou

We develop NegotiationArena: a flexible framework for evaluating and probing the negotiation abilities of LLM agents.

ChatGPT Exhibits Gender and Racial Biases in Acute Coronary Syndrome Management

no code implementations • 10 Nov 2023 • Angela Zhang, Mert Yuksekgonul, Joshua Guild, James Zou, Joseph C. Wu

One early application has been to medicine, where LLMs have been investigated to streamline clinical workflows and facilitate clinical analysis and decision-making.

Decision Making • Management

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

1 code implementation • 24 Oct 2023 • Marah I Abdin, Suriya Gunasekar, Varun Chandrasekaran, Jerry Li, Mert Yuksekgonul, Rahee Ghosh Peshawaria, Ranjita Naik, Besmira Nushi

Motivated by rising concerns around factual incorrectness and hallucinations of LLMs, we present KITAB, a new dataset for measuring constraint satisfaction abilities of language models.

Information Retrieval • Retrieval

Diversity of Thought Improves Reasoning Abilities of LLMs

no code implementations • 11 Oct 2023 • Ranjita Naik, Varun Chandrasekaran, Mert Yuksekgonul, Hamid Palangi, Besmira Nushi

Large language models (LLMs) are documented to struggle in settings that require complex reasoning.

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

1 code implementation • 26 Sep 2023 • Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi

We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text.

Discover and Cure: Concept-aware Mitigation of Spurious Correlation

1 code implementation • 1 May 2023 • Shirley Wu, Mert Yuksekgonul, Linjun Zhang, James Zou

Deep neural networks often rely on spurious correlations to make predictions, which hinders generalization beyond training environments.

Lesion Classification • Object Recognition +1

GPT detectors are biased against non-native English writers

2 code implementations • 6 Apr 2023 • Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou

In this study, we evaluate the performance of several widely used GPT detectors using writing samples from native and non-native English writers.

Fairness

SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis

no code implementations • 1 Feb 2023 • Roxana Daneshjou, Mert Yuksekgonul, Zhuo Ran Cai, Roberto Novoa, James Zou

To provide a medical dataset densely annotated by domain experts with annotations useful across multiple disease processes, we developed SkinCon: a skin disease dataset densely annotated by dermatologists.

Interpretable Machine Learning

When and why vision-language models behave like bags-of-words, and what to do about it?

1 code implementation • 4 Oct 2022 • Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou

ARO consists of Visual Genome Attribution, to test the understanding of objects' properties; Visual Genome Relation, to test for relational understanding; and COCO & Flickr30k-Order, to test for order sensitivity.

Contrastive Learning • Retrieval +1

Post-hoc Concept Bottleneck Models

no code implementations • 31 May 2022 • Mert Yuksekgonul, Maggie Wang, James Zou

When concept annotations are not available on the training data, we show that PCBM can transfer concepts from other datasets or from natural language descriptions of concepts via multimodal models.

Model Editing

Meaningfully Debugging Model Mistakes using Conceptual Counterfactual Explanations

1 code implementation • 24 Jun 2021 • Abubakar Abid, Mert Yuksekgonul, James Zou

Understanding and explaining the mistakes made by trained models is critical to many machine learning objectives, such as improving robustness, addressing concept drift, and mitigating biases.

Counterfactual

Learning Maximally Predictive Prototypes in Multiple Instance Learning

1 code implementation • 2 Oct 2019 • Mert Yuksekgonul, Ozgur Emre Sivrikaya, Mustafa Gokce Baydogan

In this work, we propose a simple model that provides a permutation-invariant, maximally predictive prototype generator from a given dataset, leading to an interpretable solution and concrete insights into the nature and solution of a problem.

Multiple Instance Learning
