2 code implementations • 13 Aug 2024 • Imagen-Team-Google, :, Jason Baldridge, Jakob Bauer, Mukul Bhutani, Nicole Brichtova, Andrew Bunner, Lluis Castrejon, Kelvin Chan, YiChang Chen, Sander Dieleman, Yuqing Du, Zach Eaton-Rosen, Hongliang Fei, Nando de Freitas, Yilin Gao, Evgeny Gladchenko, Sergio Gómez Colmenarejo, Mandy Guo, Alex Haig, Will Hawkins, Hexiang Hu, Huilian Huang, Tobenna Peter Igwe, Siavash Khodadadeh, Yelin Kim, Ksenia Konyushkova, Karol Langner, Eric Lau, Rory Lawton, Shixin Luo, Soňa Mokrá, Henna Nandwani, Yasumasa Onoe, Aäron van den Oord, Zarana Parekh, Jordi Pont-Tuset, Hang Qi, Rui Qian, Deepak Ramachandran, Poorva Rane, Abdullah Rashwan, Robert Riachi, Hansa Srinivasan, Srivatsan Srinivasan, Robin Strudel, Benigno Uria, Oliver Wang, Su Wang, Austin Waters, Chris Wolff, Auriel Wright, Zhisheng Xiao, Hao Xiong, Keyang Xu, Marc van Zee, Junlin Zhang, Katie Zhang, Wenlei Zhou, Konrad Zolna, Ola Aboubakar, Canfer Akbulut, Oscar Akerlund, Isabela Albuquerque, Nina Anderson, Marco Andreetto, Lora Aroyo, Ben Bariach, David Barker, Sherry Ben, Dana Berman, Courtney Biles, Irina Blok, Pankil Botadra, Jenny Brennan, Karla Brown, John Buckley, Rudy Bunel, Elie Bursztein, Christina Butterfield, Ben Caine, Viral Carpenter, Norman Casagrande, Ming-Wei Chang, Solomon Chang, Shamik Chaudhuri, Tony Chen, John Choi, Dmitry Churbanau, Nathan Clement, Matan Cohen, Forrester Cole, Mikhail Dektiarev, Vincent Du, Praneet Dutta, Tom Eccles, Ndidi Elue, Ashley Feden, Shlomi Fruchter, Frankie Garcia, Roopal Garg, Weina Ge, Ahmed Ghazy, Bryant Gipson, Andrew Goodman, Dawid Górny, Sven Gowal, Khyatti Gupta, Yoni Halpern, Yena Han, Susan Hao, Jamie Hayes, Jonathan Heek, Amir Hertz, Ed Hirst, Emiel Hoogeboom, Tingbo Hou, Heidi Howard, Mohamed Ibrahim, Dirichi Ike-Njoku, Joana Iljazi, Vlad Ionescu, William Isaac, Reena Jana, Gemma Jennings, Donovon Jenson, Xuhui Jia, Kerry Jones, Xiaoen Ju, Ivana Kajic, Christos Kaplanis, Burcu Karagol Ayan, Jacob Kelly, Suraj Kothawade, Christina Kouridi, Ira Ktena, Jolanda Kumakaw, Dana Kurniawan, Dmitry Lagun, Lily Lavitas, Jason Lee, Tao Li, Marco Liang, Maggie Li-Calis, Yuchi Liu, Javier Lopez Alberca, Matthieu Kim Lorrain, Peggy Lu, Kristian Lum, Yukun Ma, Chase Malik, John Mellor, Thomas Mensink, Inbar Mosseri, Tom Murray, Aida Nematzadeh, Paul Nicholas, Signe Nørly, João Gabriel Oliveira, Guillermo Ortiz-Jimenez, Michela Paganini, Tom Le Paine, Roni Paiss, Alicia Parrish, Anne Peckham, Vikas Peswani, Igor Petrovski, Tobias Pfaff, Alex Pirozhenko, Ryan Poplin, Utsav Prabhu, Yuan Qi, Matthew Rahtz, Cyrus Rashtchian, Charvi Rastogi, Amit Raul, Ali Razavi, Sylvestre-Alvise Rebuffi, Susanna Ricco, Felix Riedel, Dirk Robinson, Pankaj Rohatgi, Bill Rosgen, Sarah Rumbley, MoonKyung Ryu, Anthony Salgado, Tim Salimans, Sahil Singla, Florian Schroff, Candice Schumann, Tanmay Shah, Eleni Shaw, Gregory Shaw, Brendan Shillingford, Kaushik Shivakumar, Dennis Shtatnov, Zach Singer, Evgeny Sluzhaev, Valerii Sokolov, Thibault Sottiaux, Florian Stimberg, Brad Stone, David Stutz, Yu-Chuan Su, Eric Tabellion, Shuai Tang, David Tao, Kurt Thomas, Gregory Thornton, Andeep Toor, Cristian Udrescu, Aayush Upadhyay, Cristina Vasconcelos, Alex Vasiloff, Andrey Voynov, Amanda Walker, Luyu Wang, Miaosen Wang, Simon Wang, Stanley Wang, Qifei Wang, Yuxiao Wang, Ágoston Weisz, Olivia Wiles, Chenxia Wu, Xingyu Federico Xu, Andrew Xue, Jianbo Yang, Luo Yu, Mete Yurtoglu, Ali Zand, Han Zhang, Jiageng Zhang, Catherine Zhao, Adilet Zhaxybay, Miao Zhou, Shengqi Zhu, Zhenkai Zhu, Dawn Bloxwich, Mahyar Bordbar, Luis C. Cobo, Eli Collins, Shengyang Dai, Tulsee Doshi, Anca Dragan, Douglas Eck, Demis Hassabis, Sissie Hsiao, Tom Hume, Koray Kavukcuoglu, Helen King, Jack Krawczyk, Yeqing Li, Kathy Meier-Hellstern, Andras Orban, Yury Pinsky, Amar Subramanya, Oriol Vinyals, Ting Yu, Yori Zwols
We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts.
1 code implementation • 16 Jul 2024 • Yanting Miao, William Loh, Suraj Kothawade, Pascal Poupart, Abdullah Rashwan, Yeqing Li
In this work, we present the $\lambda$-Harmonic reward function, which provides a reliable reward signal and enables early stopping for faster training and effective regularization.
no code implementations • 8 Jul 2024 • Emaad Khwaja, Abdullah Rashwan, Ting Chen, Oliver Wang, Suraj Kothawade, Yeqing Li
We present a one-shot text-to-image diffusion model that can generate high-resolution images from natural language descriptions.
no code implementations • 7 Feb 2024 • Shivang Chopra, Suraj Kothawade, Houda Aynaou, Aman Chadha
This paper introduces a novel approach to leverage the generalizability of Diffusion Models for Source-Free Domain Adaptation (DM-SFDA).
Source-Free Domain Adaptation Unsupervised Domain Adaptation
no code implementations • 2 Oct 2023 • Shivang Chopra, Suraj Kothawade, Houda Aynaou, Aman Chadha
Domain Adaptation (DA) is a method for enhancing a model's performance on a target domain with inadequate annotated data by applying the information the model has acquired from a related source domain with sufficient labeled data.
no code implementations • 29 Sep 2023 • Anay Majee, Suraj Kothawade, KrishnaTeja Killamsetty, Rishabh Iyer
In this paper we introduce the SCoRe (Submodular Combinatorial Representation Learning) framework, a novel approach in representation learning that addresses inter-class bias and intra-class variance.
no code implementations • 28 Sep 2023 • Ke Yu, Stephen Albro, Giulia Desalvo, Suraj Kothawade, Abdullah Rashwan, Sasan Tavakkol, Kayhan Batmanghelich, Xiaoqi Yin
Training high-quality instance segmentation models requires an abundance of labeled images with instance masks and classifications, which is often expensive to procure.
no code implementations • 2 Jun 2023 • Nathan Beck, KrishnaTeja Killamsetty, Suraj Kothawade, Rishabh Iyer
Active Learning (AL) is a human-in-the-loop framework to interactively and adaptively label data instances, thereby enabling significant gains in model performance compared to random sampling.
1 code implementation • 18 May 2023 • Nathan Beck, Suraj Kothawade, Pradeep Shenoy, Rishabh Iyer
However, learning unbiased models depends on building a dataset that is representative of a diverse range of realistic scenarios for a given task.
no code implementations • 4 Oct 2022 • Suraj Kothawade, Akshit Srivastava, Venkat Iyer, Ganesh Ramakrishnan, Rishabh Iyer
Avoiding out-of-distribution (OOD) data is critical for training supervised machine learning models in the medical imaging domain.
no code implementations • 4 Oct 2022 • Suraj Kothawade, Atharv Savarkar, Venkat Iyer, Lakshman Tamil, Ganesh Ramakrishnan, Rishabh Iyer
It is often the case that a suboptimal performance is obtained on some classes due to the natural class imbalance issue that comes with medical data.
no code implementations • 5 Jul 2022 • Suraj Kothawade, Donna Roy, Michele Fenzi, Elmar Haussmann, Jose M. Alvarez, Christoph Angerer
Existing semantic image retrieval methods often focus on mining for larger sized geographical landmarks, and/or require extra labeled data, such as images/image-pairs with similar objects, for mining images with generic objects.
no code implementations • 17 Jun 2022 • Suraj Kothawade, Shivang Chopra, Saikat Ghosh, Rishabh Iyer
Most approaches assume access to a seed set of instances which contain these rare data instances.
no code implementations • 10 Mar 2022 • Suraj Kothawade, Pavan Kumar Reddy, Ganesh Ramakrishnan, Rishabh Iyer
This issue is further pronounced in SSL methods, as they would use this biased model to obtain psuedo-labels (on the unlabeled data) during training.
1 code implementation • 30 Jan 2022 • Changbin Li, Suraj Kothawade, Feng Chen, Rishabh Iyer
Meta learning has proven to be able to learn a parametrized model for FSC by training on various other classification tasks.
1 code implementation • 30 Nov 2021 • Suraj Kothawade, Saikat Ghosh, Sumit Shekhar, Yu Xiang, Rishabh Iyer
We propose TALISMAN, a novel framework for Targeted Active Learning or object detectIon with rare slices using Submodular MutuAl iNformation.
no code implementations • 17 Oct 2021 • Suraj Kothawade, Vinaya Khandelwal, Kinjal Basu, Huaduo Wang, Gopal Gupta
That is, while machine learning technology is good for observing and automatically understanding the surroundings of an automobile, driving decisions are better automated via commonsense reasoning rather than machine learning.
no code implementations • 10 Oct 2021 • Suraj Kothawade, Anmol Mekala, Chandra Sekhara D, Mayank Kothyari, Rishabh Iyer, Ganesh Ramakrishnan, Preethi Jyothi
To address this problem, we propose DITTO (Data-efficient and faIr Targeted subseT selectiOn) that uses Submodular Mutual Information (SMI) functions as acquisition functions to find the most informative set of utterances matching a target accent within a fixed budget.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • NeurIPS 2021 • Suraj Kothawade, Nathan Beck, KrishnaTeja Killamsetty, Rishabh Iyer
Active learning has proven to be useful for minimizing labeling costs by selecting the most informative samples.
no code implementations • 30 Apr 2021 • Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer
With the rapid growth of data, it is becoming increasingly difficult to train or improve deep learning models with the right subset of data.
1 code implementation • 27 Feb 2021 • Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer
Examples of such problems include: i)targeted learning, where the goal is to find subsets with rare classes or rare attributes on which the model is underperforming, and ii)guided summarization, where data (e. g., image collection, text, document or video) is summarized for quicker human consumption with specific additional user intent.
no code implementations • 26 Jan 2021 • Vishal Kaushal, Suraj Kothawade, Anshul Tomar, Rishabh Iyer, Ganesh Ramakrishnan
For long videos, human reference summaries necessary for supervised video summarization techniques are difficult to obtain.
no code implementations • 16 Oct 2020 • Suraj Kothawade, Jiten Girdhar, Chandrashekhar Lavania, Rishabh Iyer
Unfortunately, these models only learn the relative importance of the different submodular functions (such as diversity, representation or importance), but cannot learn more complex feature representations, which are often required for state-of-the-art performance.
no code implementations • 12 Oct 2020 • Vishal Kaushal, Suraj Kothawade, Ganesh Ramakrishnan, Jeff Bilmes, Himanshu Asnani, Rishabh Iyer
We study submodular information measures as a rich framework for generic, query-focused, privacy sensitive, and update summarization tasks.
no code implementations • 29 Jul 2020 • Vishal Kaushal, Suraj Kothawade, Rishabh Iyer, Ganesh Ramakrishnan
Thirdly, we demonstrate that in the presence of multiple ground truth summaries (due to the highly subjective nature of the task), learning from a single combined ground truth summary using a single loss function is not a good idea.
no code implementations • 3 Jan 2019 • Vishal Kaushal, Rishabh Iyer, Khoshrav Doctor, Anurag Sahoo, Pratik Dubal, Suraj Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan
This paper addresses automatic summarization of videos in a unified manner.
2 code implementations • 3 Jan 2019 • Vishal Kaushal, Rishabh Iyer, Suraj Kothawade, Rohan Mahadev, Khoshrav Doctor, Ganesh Ramakrishnan
Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry.
no code implementations • 26 Sep 2018 • Suraj Kothawade, Kunjan Mhaske, Sahil Sharma, Furkhan Shaikh
Satellite Remote Sensing Technology is becoming a major milestone in the prediction of weather anomalies, natural disasters as well as finding alternative resources in proximity using multiple multi-spectral sensors emitting electromagnetic waves at distinct wavelengths.
no code implementations • 24 Sep 2018 • Vishal Kaushal, Sandeep Subramanian, Suraj Kothawade, Rishabh Iyer, Ganesh Ramakrishnan
We propose a novel framework for domain specific video summarization.
1 code implementation • 24 Sep 2018 • Rishabh Iyer, Pratik Dubal, Kunal Dargan, Suraj Kothawade, Rohan Mahadev, Vishal Kaushal
With increasing amounts of visual data being created in the form of videos and images, visual data selection and summarization are becoming ever increasing problems.
no code implementations • 27 May 2018 • Pratik Dubal, Rohan Mahadev, Suraj Kothawade, Kunal Dargan, Rishabh Iyer
To our knowledge, this is the first work which provides a comprehensive evaluation of different deep learning models on various real-world customer deployment scenarios of surveillance video analytics.