Search Results for author: Royi Rassin

Found 10 papers, 3 papers with code

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

no code implementations • 24 Apr 2025 • Aviv Slobodkin, Hagai Taitelbaum, Yonatan Bitton, Brian Gordon, Michal Sokolik, Nitzan Bitton Guetta, Almog Gueta, Royi Rassin, Itay Laish, Dani Lischinski, Idan Szpektor

Subject-driven text-to-image (T2I) generation aims to produce images that align with a given textual description, while preserving the visual identity from a referenced subject image.

Text-to-Image Generation

GRADE: Quantifying Sample Diversity in Text-to-Image Models

no code implementations • 29 Oct 2024 • Royi Rassin, Aviv Slobodkin, Shauli Ravfogel, Yanai Elazar, Yoav Goldberg

GRADE leverages the world knowledge embedded in large language models and visual question-answering systems to identify relevant concept-specific axes of diversity (e.g., "shape" and "color" for the concept "cookie").

Attribute Diversity +3

How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold

1 code implementation • 19 Oct 2024 • Sahil Verma, Royi Rassin, Arnav Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar

We seek to determine the point at which a model was trained on enough instances to imitate a concept -- the imitation threshold.

Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models

no code implementations • 28 Jul 2024 • Nitzan Bitton-Guetta, Aviv Slobodkin, Aviya Maimon, Eliya Habba, Royi Rassin, Yonatan Bitton, Idan Szpektor, Amir Globerson, Yuval Elovici

To study these skills, we present Visual Riddles, a benchmark designed to test vision and language models on visual riddles requiring commonsense and world knowledge.

World Knowledge

Evaluating D-MERIT of Partial-annotation on Information Retrieval

no code implementations • 23 Jun 2024 • Royi Rassin, Yaron Fairstein, Oren Kalinsky, Guy Kushilevitz, Nachshon Cohen, Alexander Libov, Yoav Goldberg

We show that evaluating on a dataset containing annotations for only a subset of the relevant passages might result in a misleading ranking of the retrieval systems, and that as more relevant texts are included in the evaluation set, the rankings converge.

Passage Retrieval, Text Retrieval

Make It Count: Text-to-Image Generation with an Accurate Number of Objects

1 code implementation • 14 Jun 2024 • Lital Binyamin, Yoad Tewel, Hilit Segev, Eran Hirsch, Royi Rassin, Gal Chechik

Generating object-correct counts is fundamentally challenging because the generative model needs to keep a sense of separate identity for every instance of the object, even if several objects look identical or overlap, and then carry out a global computation implicitly during generation.

Denoising Object +1

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback

no code implementations • 29 Nov 2023 • Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian

Then, it uses two VLMs to select the best generation: a Visual Question Answering model that measures the alignment of generated images to the text, and another that measures the generation's aesthetic quality.

Question Answering, Text-to-Image Generation +1

Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment

2 code implementations • NeurIPS 2023 • Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik

This reflects an impaired mapping between linguistic binding of entities and modifiers in the prompt and visual binding of the corresponding elements in the generated image.

Attribute Sentence +1

Conjunct Resolution in the Face of Verbal Omissions

no code implementations • 26 May 2023 • Royi Rassin, Yoav Goldberg, Reut Tsarfaty

In this work we propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.

Missing Elements Sentence +1

DALLE-2 is Seeing Double: Flaws in Word-to-Concept Mapping in Text2Image Models

no code implementations • 19 Oct 2022 • Royi Rassin, Shauli Ravfogel, Yoav Goldberg

We study the way DALLE-2 maps symbols (words) in the prompt to their references (entities or properties of entities in the generated image).
