Search Results for author: Sugato Basu

Found 16 papers, 8 papers with code

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models

1 code implementation NeurIPS 2023 Weixi Feng, Wanrong Zhu, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang

When combined with a downstream image generation model, LayoutGPT outperforms text-to-image models/systems by 20-40% and achieves performance comparable to that of human users in designing visual layouts for numerical and spatial correctness.

Indoor Scene Synthesis, Text-to-Image Generation

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

1 code implementation 9 Dec 2022 Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang

In this work, we improve the compositional skills of T2I models, specifically more accurate attribute binding and better image compositions.

Attribute, Image Generation

CPL: Counterfactual Prompt Learning for Vision and Language Models

no code implementations 19 Oct 2022 Xuehai He, Diji Yang, Weixi Feng, Tsu-Jui Fu, Arjun Akula, Varun Jampani, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang

Prompt tuning is a new few-shot transfer learning technique that only tunes the learnable prompt for pre-trained vision and language models such as CLIP.

counterfactual, Visual Question Answering

A Framework for Deep Constrained Clustering

1 code implementation 7 Jan 2021 Hongjing Zhang, Tianyang Zhan, Sugato Basu, Ian Davidson

A fundamental strength of deep learning is its flexibility, and here we explore a deep learning framework for constrained clustering and in particular explore how it can extend the field of constrained clustering.

Constrained Clustering

Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations

no code implementations EMNLP 2020 Wanrong Zhu, Xin Eric Wang, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang

A major challenge in visually grounded language generation is to build robust benchmark datasets and models that can generalize well in real-world settings.

Text Generation

Leveraging Organizational Resources to Adapt Models to New Data Modalities

no code implementations 23 Aug 2020 Sahaana Suri, Raghuveer Chanda, Neslihan Bulut, Pradyumna Narayana, Yemao Zeng, Peter Bailis, Sugato Basu, Girija Narlikar, Christopher Re, Abishek Sethi

As applications in large organizations evolve, the machine learning (ML) models that power them must adapt the same predictive tasks to newly arising data modalities (e.g., a new video content launch in a social media application requires existing text or image models to extend to video).

Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation

1 code implementation EACL 2021 Wanrong Zhu, Xin Eric Wang, Tsu-Jui Fu, An Yan, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang

Outdoor vision-and-language navigation (VLN) is a task in which an agent follows natural language instructions and navigates a real-life urban environment.

Ranked #4 on Vision and Language Navigation on Touchdown Dataset (using extra training data)

Style Transfer, Text Style Transfer, +1

HUSE: Hierarchical Universal Semantic Embeddings

8 code implementations 14 Nov 2019 Pradyumna Narayana, Aniket Pednekar, Abishek Krishnamoorthy, Kazoo Sone, Sugato Basu

The works in the domain of visual semantic embeddings address this problem by first constructing a semantic embedding space based on some external knowledge and projecting image embeddings onto this fixed semantic embedding space.

General Classification, Representation Learning, +1

A Framework for Deep Constrained Clustering -- Algorithms and Advances

1 code implementation 29 Jan 2019 Hongjing Zhang, Sugato Basu, Ian Davidson

The area of constrained clustering has been extensively explored by researchers and used by practitioners.

Constrained Clustering

Micro-Browsing Models for Search Snippets

no code implementations 18 Oct 2018 Muhammad Asiful Islam, Ramakrishnan Srikant, Sugato Basu

CTR of a result has two core components: (a) the probability of examination of a result by a user, and (b) the perceived relevance of the result given that it has been examined by the user.

Graphical RNN Models

no code implementations 15 Dec 2016 Ashish Bora, Sugato Basu, Joydeep Ghosh

Many time series are generated by a set of entities that interact with one another over time.

Time Series, Time Series Analysis
