Search Results for author: Eliya Habba

Found 4 papers, 0 papers with code

DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation

no code implementations • 3 Mar 2025 • Eliya Habba, Ofir Arviv, Itay Itzhak, Yotam Perlitz, Elron Bandel, Leshem Choshen, Michal Shmueli-Scheuer, Gabriel Stanovsky

Recent work found that LLMs are sensitive to a wide range of arbitrary prompt dimensions, including the type of delimiters, answer enumerators, instruction wording, and more.

Beyond Benchmarks: On The False Promise of AI Regulation

no code implementations • 26 Jan 2025 • Gabriel Stanovsky, Renana Keydar, Gadi Perl, Eliya Habba

The rapid advancement of artificial intelligence (AI) systems in critical domains like healthcare, justice, and social services has sparked numerous regulatory initiatives aimed at ensuring their safe deployment.

Benchmarking

Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models

no code implementations • 28 Jul 2024 • Nitzan Bitton-Guetta, Aviv Slobodkin, Aviya Maimon, Eliya Habba, Royi Rassin, Yonatan Bitton, Idan Szpektor, Amir Globerson, Yuval Elovici

To study these skills, we present Visual Riddles, a benchmark aimed to test vision and language models on visual riddles requiring commonsense and world knowledge.

World Knowledge

The Perfect Victim: Computational Analysis of Judicial Attitudes towards Victims of Sexual Violence

no code implementations • 9 May 2023 • Eliya Habba, Renana Keydar, Dan Bareket, Gabriel Stanovsky

Second, we curate a manually annotated dataset for judicial assessments of victim's credibility in the Hebrew language, as well as a model that can extract credibility labels from court cases.
