Search Results for author: Michael Saxon

Found 21 papers, 12 papers with code

Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)

1 code implementation • 5 Apr 2024 • Michael Saxon, Fatima Jahara, Mahsa Khoshnoodi, Yujie Lu, Aditya Sharma, William Yang Wang

With advances in the quality of text-to-image (T2I) models has come interest in benchmarking their prompt faithfulness: the semantic coherence of generated images to the prompts they were conditioned on.

Benchmarking
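Prompt-faithfulness metrics of the kind this paper meta-evaluates are often built on image-text similarity. Below is a minimal, illustrative sketch of such a metric using CLIP cosine similarity via Hugging Face Transformers; it is not the TS2 meta-metric itself, and the model checkpoint, file path, and prompt are placeholders.

```python
# Minimal sketch of a CLIP-similarity prompt-faithfulness metric (one of the
# kinds of metrics TS2 evaluates, NOT the TS2 meta-metric). Placeholders:
# checkpoint name, image path, and prompt.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def faithfulness_score(image_path: str, prompt: str) -> float:
    """Cosine similarity between CLIP embeddings of a generated image and its prompt."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
        text_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                           attention_mask=inputs["attention_mask"])
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    return float((image_emb @ text_emb.T).item())

# Usage (hypothetical file):
# print(faithfulness_score("generated.png", "a red cube on top of a blue sphere"))
```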

Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts

no code implementations • 17 Mar 2024 • Michael Saxon, Yiran Luo, Sharon Levy, Chitta Baral, Yezhou Yang, William Yang Wang

Benchmarks of the multilingual capabilities of text-to-image (T2I) models compare generated images prompted in a test language to an expected image distribution over a concept set.

Translation

Multilingual Conceptual Coverage in Text-to-Image Models

1 code implementation • 2 Jun 2023 • Michael Saxon, William Yang Wang

We propose "Conceptual Coverage Across Languages" (CoCo-CroLa), a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns.

Benchmarking
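As a rough illustration of the kind of cross-lingual parity measurement CoCo-CroLa performs, the sketch below aggregates per-language concept recognition rates into coverage scores relative to the training language. The data, threshold, and scoring rule are invented for the example and do not reproduce the paper's actual benchmark.

```python
# Illustrative sketch (not the paper's implementation): given, for each language
# and tangible noun, the fraction of generated images in which the noun was
# recognized, report per-language coverage relative to the training language.
from typing import Dict

# Hypothetical recognition rates: recognition[language][noun] in [0, 1].
recognition: Dict[str, Dict[str, float]] = {
    "en": {"dog": 0.95, "bridge": 0.90, "spoon": 0.85},
    "es": {"dog": 0.92, "bridge": 0.70, "spoon": 0.40},
    "id": {"dog": 0.60, "bridge": 0.30, "spoon": 0.10},
}

def coverage(lang: str, train_lang: str = "en", threshold: float = 0.5) -> float:
    """Fraction of nouns covered in `lang` among those covered in the training language."""
    covered_in_train = {n for n, r in recognition[train_lang].items() if r >= threshold}
    covered_in_lang = {n for n in covered_in_train
                       if recognition[lang].get(n, 0.0) >= threshold}
    return len(covered_in_lang) / max(len(covered_in_train), 1)

for lang in recognition:
    print(f"{lang}: coverage relative to en = {coverage(lang):.2f}")
```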

Let's Think Frame by Frame with VIP: A Video Infilling and Prediction Dataset for Evaluating Video Chain-of-Thought

1 code implementation • 23 May 2023 • Vaishnavi Himakunthala, Andy Ouyang, Daniel Rose, Ryan He, Alex Mei, Yujie Lu, Chinmay Sonar, Michael Saxon, William Yang Wang

Despite exciting recent results showing vision-language systems' capacity to reason about images using natural language, their capacity for video reasoning remains under-explored.

Descriptive Video Prediction

Data Augmentation for Diverse Voice Conversion in Noisy Environments

no code implementations • 18 May 2023 • Avani Tanna, Michael Saxon, Amr El Abbadi, William Yang Wang

Voice conversion (VC) models have demonstrated impressive few-shot conversion quality on the clean, native speech populations they're trained on.

Data Augmentation • Denoising • +1
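The data-augmentation idea referenced here (training voice conversion systems beyond clean, in-domain speech) can be illustrated with a standard additive-noise augmentation at a target signal-to-noise ratio. This is a generic sketch, not the paper's specific pipeline; the waveforms in the usage note are hypothetical.

```python
# Generic additive-noise augmentation sketch (not the paper's exact pipeline):
# mix a noise recording into clean speech at a target SNR in dB.
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Return the clean waveform with noise added at the requested SNR (in dB)."""
    # Tile or truncate the noise so it matches the clean signal's length.
    if len(noise) < len(clean):
        noise = np.tile(noise, int(np.ceil(len(clean) / len(noise))))
    noise = noise[: len(clean)]

    clean_power = np.mean(clean ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale noise so that 10 * log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10.0)))
    return clean + scale * noise

# Usage (with hypothetical arrays loaded from audio files):
# augmented = mix_at_snr(clean_waveform, cafe_noise, snr_db=10.0)
```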

Users are the North Star for AI Transparency

no code implementations • 9 Mar 2023 • Alex Mei, Michael Saxon, Shiyu Chang, Zachary C. Lipton, William Yang Wang

We conduct a broad literature survey, identifying many clusters of similar conceptions of transparency, tying each back to our north star with analysis of how it furthers or hinders our ideal AI transparency goals.

CausalDialogue: Modeling Utterance-level Causality in Conversations

1 code implementation • 20 Dec 2022 • Yi-Lin Tuan, Alon Albalak, Wenda Xu, Michael Saxon, Connor Pryor, Lise Getoor, William Yang Wang

Despite their widespread adoption, neural conversation models have yet to exhibit natural chat capabilities with humans.

Dialogue Generation

WikiWhy: Answering and Explaining Cause-and-Effect Questions

no code implementations • 21 Oct 2022 • Matthew Ho, Aditya Sharma, Justin Chang, Michael Saxon, Sharon Levy, Yujie Lu, William Yang Wang

As large language models (LLMs) grow larger and more sophisticated, assessing their "reasoning" capabilities in natural language grows more challenging.

Question Answering

Causal Balancing for Domain Generalization

1 code implementation • 10 Jun 2022 • Xinyi Wang, Michael Saxon, Jiachen Li, Hongyang Zhang, Kun Zhang, William Yang Wang

While machine learning models rapidly advance the state-of-the-art on various real-world tasks, out-of-domain (OOD) generalization remains a challenging problem given the vulnerability of these models to spurious correlations.

Domain Generalization
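One simple, generic way to blunt the spurious correlations mentioned above is to reweight training examples so that no environment or group dominates minibatches. The sketch below shows such group-balanced sampling with PyTorch; it is not the causal balancing procedure proposed in the paper, and it assumes group labels are available.

```python
# Generic group-balanced sampling sketch (NOT the paper's causal balancing method):
# weight each example inversely to its group's frequency so a spuriously
# correlated majority group does not dominate training batches.
from collections import Counter
import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical per-example group labels (e.g., background type of each image).
group_labels = ["land", "land", "land", "water", "land", "water"]

counts = Counter(group_labels)
weights = torch.tensor([1.0 / counts[g] for g in group_labels], dtype=torch.double)

# Draws examples with probability proportional to inverse group frequency,
# so each group is approximately equally represented in expectation.
sampler = WeightedRandomSampler(weights, num_samples=len(group_labels), replacement=True)

# Usage: pass `sampler` to a DataLoader, e.g.
# loader = torch.utils.data.DataLoader(dataset, batch_size=32, sampler=sampler)
```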

Self-Supervised Knowledge Assimilation for Expert-Layman Text Style Transfer

1 code implementation • 6 Oct 2021 • Wenda Xu, Michael Saxon, Misha Sra, William Yang Wang

This is a particularly notable issue in the medical domain, where laymen are often confused by medical text online.

Language Modelling • Self-Supervised Learning • +2

End-to-End Spoken Language Understanding for Generalized Voice Assistants

no code implementations • 16 Jun 2021 • Michael Saxon, Samridhi Choudhary, Joseph P. McKenna, Athanasios Mouchtaris

End-to-end (E2E) spoken language understanding (SLU) systems predict utterance semantics directly from speech using a single model.

Ranked #10 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)

Spoken Language Understanding
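As a rough sketch of the single-model E2E SLU setup described above, the toy PyTorch module below maps acoustic features directly to an utterance-level intent label, with no intermediate transcript. The dimensions, feature front-end, and intent count are placeholders, not the architecture from the paper.

```python
# Toy end-to-end SLU sketch (placeholder architecture, not the paper's model):
# acoustic features -> recurrent encoder -> utterance-level intent logits,
# with no intermediate ASR transcript.
import torch
import torch.nn as nn

class TinyE2ESLU(nn.Module):
    def __init__(self, n_mels: int = 80, hidden: int = 256, n_intents: int = 31):
        super().__init__()
        self.encoder = nn.GRU(input_size=n_mels, hidden_size=hidden,
                              num_layers=2, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, n_intents)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, time, n_mels), e.g. log-Mel filterbank frames.
        encoded, _ = self.encoder(features)
        pooled = encoded.mean(dim=1)      # simple mean pooling over time
        return self.classifier(pooled)    # (batch, n_intents) intent logits

# Usage with a dummy batch of 2 utterances, 300 frames each:
model = TinyE2ESLU()
logits = model(torch.randn(2, 300, 80))
print(logits.shape)  # torch.Size([2, 31])
```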

Counterfactual Maximum Likelihood Estimation for Training Deep Networks

1 code implementation • NeurIPS 2021 • Xinyi Wang, Wenhu Chen, Michael Saxon, William Yang Wang

Although deep learning models have driven state-of-the-art performance on a wide array of tasks, they are prone to spurious correlations that should not be learned as predictive clues.

counterfactual • Domain Generalization • +2

Modeling Disclosive Transparency in NLP Application Descriptions

1 code implementation • EMNLP 2021 • Michael Saxon, Sharon Levy, Xinyi Wang, Alon Albalak, William Yang Wang

Broader disclosive transparency (truth and clarity in communication regarding the function of AI systems) is widely considered desirable.

Fairness • Language Modelling • +1

Semantic Complexity in End-to-End Spoken Language Understanding

no code implementations • 6 Aug 2020 • Joseph P. McKenna, Samridhi Choudhary, Michael Saxon, Grant P. Strimel, Athanasios Mouchtaris

We perform experiments where we vary the semantic complexity of a large, proprietary dataset and show that STI model performance correlates with our semantic complexity measures, such that performance increases as complexity values decrease.

Spoken Language Understanding
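The notion of dataset semantic complexity referenced above can be illustrated with a simple entropy-style proxy over semantic labels. This is only an illustrative stand-in, not the specific complexity measures defined in the paper, and the example datasets are invented.

```python
# Illustrative proxy for dataset semantic complexity (not the paper's measures):
# Shannon entropy of the empirical distribution of semantic labels (e.g., intents).
import math
from collections import Counter
from typing import List

def label_entropy(labels: List[str]) -> float:
    """Entropy (in bits) of the label distribution; higher suggests more semantic variety."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Two hypothetical datasets: one dominated by a single intent, one more varied.
simple = ["play_music"] * 90 + ["set_alarm"] * 10
varied = ["play_music", "set_alarm", "get_weather", "add_to_list", "call_contact"] * 20

print(f"simple: {label_entropy(simple):.2f} bits")
print(f"varied: {label_entropy(varied):.2f} bits")
```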

Robust Estimation of Hypernasality in Dysarthria with Acoustic Model Likelihood Features

no code implementations • 26 Nov 2019 • Michael Saxon, Ayush Tripathi, Yishan Jiao, Julie Liss, Visar Berisha

To demonstrate that the features derived from these acoustic models are specific to hypernasal speech, we evaluate them across different dysarthria corpora.

BIG-bench Machine Learning
