Search Results for author: Ali Emami

Found 36 papers, 13 papers with code

Translate With Care: Addressing Gender Bias, Neutrality, and Reasoning in Large Language Model Translations

1 code implementation 31 May 2025 Pardis Sadat Zahraei, Ali Emami

This work emphasizes the need for targeted approaches to gender and semantic coherence in machine translation, particularly for genderless languages, contributing to more equitable and accurate translation systems.

Language Modelling +3

Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition

no code implementations 29 Apr 2025 Tyler McDonald, Ali Emami

Knowledge distillation allows smaller neural networks to emulate the performance of larger, teacher models with reduced computational demands.

GSM8K Knowledge Distillation +2

TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models

no code implementations 10 Apr 2025 Sher Badshah, Ali Emami, Hassan Sajjad

TALE enhances the reliability of LLM evaluations in real-world, dynamic scenarios without relying on static references.

Question Answering

Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books

1 code implementation 7 Feb 2025 Sangmitra Madhusudan, Robert Morabito, Skye Reid, Nikta Gohari Sadr, Ali Emami

Our findings indicate that LLMs trained on decade-specific books manifest biases reflective of their times, with both gradual trends and notable shifts.

Think or Step-by-Step? UnZIPping the Black Box in Zero-Shot Prompts

no code implementations 5 Feb 2025 Nikta Gohari Sadr, Sangmitra Madhusudan, Ali Emami

For instance, while both 'step-by-step' and 'think' show high ZIP scores, which one is more influential depends on the model and task.

Can We Afford The Perfect Prompt? Balancing Cost and Accuracy with the Economical Prompting Index

1 code implementation 2 Dec 2024 Tyler McDonald, Anthony Colosimo, Yifeng Li, Ali Emami

As prompt engineering research rapidly evolves, evaluations beyond accuracy are crucial for developing cost-effective techniques.

Prompt Engineering

NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers

no code implementations 2 Dec 2024 Angel Yahir Loredo Lopez, Tyler McDonald, Ali Emami

Large Language Models (LLMs) have shown impressive performance on various benchmarks, yet their ability to engage in deliberate reasoning remains questionable.

Text Classification

STOP! Benchmarking Large Language Models with Sensitivity Testing on Offensive Progressions

1 code implementation 20 Sep 2024 Robert Morabito, Sangmitra Madhusudan, Tyler McDonald, Ali Emami

Mitigating explicit and implicit biases in Large Language Models (LLMs) has become a critical focus in the field of natural language processing.

Benchmarking

MirrorStories: Reflecting Diversity through Personalized Narrative Generation with Large Language Models

no code implementations 20 Sep 2024 Sarfaroz Yunusov, Hamza Sidat, Ali Emami

This study explores the effectiveness of Large Language Models (LLMs) in creating personalized "mirror stories" that reflect and resonate with individual readers' identities, addressing the significant lack of diversity in literature.

Diversity

Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge

no code implementations 25 May 2024 Brendan Park, Madeline Janecek, Naser Ezzati-Jivan, Yifeng Li, Ali Emami

Utilizing GPT-4 for prompt generation and Diffusion Attentive Attribution Maps (DAAM) for heatmap analysis, we propose a novel evaluation framework that isolates the models' ability in pronoun disambiguation from other visual processing challenges.

Common Sense Reasoning

Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models

1 code implementation 25 May 2024 Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, Ali Emami

Using various datasets and prompting techniques that encourage model introspection, we probe the alignment between models' internal and expressed confidence.

Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models

1 code implementation 23 May 2024 Abhishek Kumar, Sarfaroz Yunusov, Ali Emami

Research on Large Language Models (LLMs) has often neglected subtle biases that, although less apparent, can significantly influence the models' outputs toward particular social narratives.

EvoGrad: A Dynamic Take on the Winograd Schema Challenge with Human Adversaries

no code implementations 20 Feb 2024 Jing Han Sun, Ali Emami

Our results emphasize the challenge posed by EvoGrad: Even the best performing LLM, GPT-3.5, achieves an accuracy of 65.0% with an average error depth of 7.2, a stark contrast to human performance of 92.

Common Sense Reasoning Coreference Resolution

WSC+: Enhancing The Winograd Schema Challenge Using Tree-of-Experts

1 code implementation 31 Jan 2024 Pardis Sadat Zahraei, Ali Emami

The Winograd Schema Challenge (WSC) serves as a prominent benchmark for evaluating machine understanding.


Debiasing should be Good and Bad: Measuring the Consistency of Debiasing Techniques in Language Models

1 code implementation 23 May 2023 Robert Morabito, Jad Kabbara, Ali Emami

Debiasing methods that seek to mitigate the tendency of Language Models (LMs) to occasionally output toxic or inappropriate text have recently gained traction.

Dynamic-Pix2Pix: Noise Injected cGAN for Modeling Input and Target Domain Joint Distributions with Limited Training Data

1 code implementation 15 Nov 2022 Mohammadreza Naderi, Nader Karimi, Ali Emami, Shahram Shirani, Shadrokh Samavi

Helping the cGAN learn the target distribution from noise input results in better model generalization at test time and allows the model to fit almost perfectly to the target domain distribution.

Domain Generalization

An Application of Pseudo-Log-Likelihoods to Natural Language Scoring

no code implementations 23 Jan 2022 Darren Abramson, Ali Emami

We identify a practical cost for our method and model: high GPU-time for natural language evaluation.

Language Modelling +3

Not-so fine-tuning: Measures of Common Sense for Language Models

no code implementations 29 Sep 2021 Darren Abramson, Ali Emami

Language models built using semi-supervised machine learning on large corpora of natural language have very quickly enveloped the fields of natural language generation and understanding.

Language Modelling Text Generation +1

An Analysis of Dataset Overlap on Winograd-Style Tasks

1 code implementation COLING 2020 Ali Emami, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung

The Winograd Schema Challenge (WSC) and variants inspired by it have become important benchmarks for common-sense reasoning (CSR).

Common Sense Reasoning

Localization of Fetal Head in Ultrasound Images by Multiscale View and Deep Neural Networks

no code implementations 3 Nov 2019 Zahra Sobhaninia, Ali Emami, Nader Karimi, Shadrokh Samavi

One of the routine examinations that are used for prenatal care in many countries is ultrasound imaging.

Fetal Ultrasound Image Segmentation for Measuring Biometric Parameters Using Multi-Task Deep Learning

no code implementations 31 Aug 2019 Zahra Sobhaninia, Shima Rafiei, Ali Emami, Nader Karimi, Kayvan Najarian, Shadrokh Samavi, S. M. Reza Soroushmehr

Ultrasound imaging is a standard examination during pregnancy that can be used for measuring specific biometric parameters towards prenatal diagnosis and estimating gestational age.

Image Segmentation Segmentation +1

The Knowref Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution

1 code implementation ACL 2019 Ali Emami, Paul Trichelair, Adam Trischler, Kaheer Suleman, Hannes Schulz, Jackie Chi Kit Cheung

To explain this performance gap, we show empirically that state-of-the-art models often fail to capture context, instead relying on the gender or number of candidate antecedents to make a decision.

Common Sense Reasoning Coreference Resolution +3

ReDMark: Framework for Residual Diffusion Watermarking on Deep Networks

1 code implementation 16 Oct 2018 Mahdi Ahmadi, Alireza Norouzi, S. M. Reza Soroushmehr, Nader Karimi, Kayvan Najarian, Shadrokh Samavi, Ali Emami

Due to the rapid growth of machine learning tools, and specifically deep networks, in various computer vision and image processing areas, applications of Convolutional Neural Networks for watermarking have recently emerged.

A Knowledge Hunting Framework for Common Sense Reasoning

no code implementations EMNLP 2018 Ali Emami, Noelia De La Cruz, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung

We introduce an automatic system that achieves state-of-the-art results on the Winograd Schema Challenge (WSC), a common sense reasoning task that requires diverse, complex forms of inference and knowledge.

Common Sense Reasoning Coreference Resolution

A Generalized Knowledge Hunting Framework for the Winograd Schema Challenge

no code implementations NAACL 2018 Ali Emami, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung

We introduce an automatic system that performs well on two common-sense reasoning tasks, the Winograd Schema Challenge (WSC) and the Choice of Plausible Alternatives (COPA).

Common Sense Reasoning Coreference Resolution +1
