Search Results for author: Maitreya Patel

Found 14 papers, 9 papers with code

Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

no code implementations 27 Nov 2024 Maitreya Patel, Song Wen, Dimitris N. Metaxas, Yezhou Yang

In this work, we first develop a theoretical and empirical understanding of the vector field dynamics of RFMs in efficiently guiding the denoising trajectory.

Denoising Image Generation +1

Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model

1 code implementation 7 Nov 2024 Sheng Cheng, Maitreya Patel, Yezhou Yang

Despite advancements in text-to-image models, generating images that precisely align with textual descriptions remains challenging due to misalignment in training data.

Image Captioning Text-to-Image Generation

Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?

1 code implementation 17 Oct 2024 Shailaja Keyur Sampat, Maitreya Patel, Yezhou Yang, Chitta Baral

An ability to learn about new objects from a small amount of visual data and produce convincing linguistic justification about the presence/absence of certain concepts (that collectively compose the object) in novel scenarios is an important characteristic of human cognition.

Language Modeling Language Modelling +4

$λ$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

1 code implementation 7 Feb 2024 Maitreya Patel, Sangmin Jung, Chitta Baral, Yezhou Yang

While LDMs offer distinct advantages, P-T2I methods' reliance on the latent space of these diffusion models significantly escalates resource demands, leading to inconsistent results and necessitating numerous iterations for a single desired image.

Concept Alignment Philosophy

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

no code implementations CVPR 2024 Maitreya Patel, Changhoon Kim, Sheng Cheng, Chitta Baral, Yezhou Yang

The T2I prior model alone adds a billion parameters compared to the Latent Diffusion Models, which increases the computational and high-quality data requirements.

Contrastive Learning Decoder

ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models

1 code implementation 7 Jun 2023 Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang

To quantify the ability of T2I models in learning and synthesizing novel visual concepts (a.k.a.

Concept Alignment

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models

1 code implementation CVPR 2024 Changhoon Kim, Kyle Min, Maitreya Patel, Sheng Cheng, Yezhou Yang

This paper introduces a novel approach to model fingerprinting that assigns responsibility for the generated images, thereby serving as a potential countermeasure to model misuse.

Misinformation

CinC-GAN for Effective F0 prediction for Whisper-to-Normal Speech Conversion

1 code implementation 18 Aug 2020 Maitreya Patel, Mirali Purohit, Jui Shah, Hemant A. Patil

The CycleGAN-based method uses two different models, one for Mel Cepstral Coefficients (MCC) mapping, and another for F0 prediction, where F0 is highly dependent on the pre-trained model of MCC mapping.

Prediction Voice Conversion

AdaGAN: Adaptive GAN for Many-to-Many Non-Parallel Voice Conversion

1 code implementation 25 Sep 2019 Maitreya Patel, Mirali Purohit, Mihir Parmar, Nirmesh J. Shah, Hemant A. Patil

In this paper, we propose a novel style transfer architecture, which can also be extended to generate voices even for target speakers whose data were not used in training (i.e., the zero-shot learning case).

Generative Adversarial Network Style Transfer +2

Precipitation Nowcasting: Leveraging bidirectional LSTM and 1D CNN

no code implementations 24 Oct 2018 Maitreya Patel, Anery Patel, Dr. Ranendu Ghosh

Short-term rainfall forecasting, also known as precipitation nowcasting, has become a potentially fundamental technology impacting significant real-world applications ranging from flight safety and rainstorm alerts to farm irrigation timing.

Deep Learning Time Series +2
