Search Results for author: Svetlana Lazebnik

Found 46 papers, 14 papers with code

Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry...for now

no code implementations • 28 Nov 2023 • Ayush Sarkar, Hanlin Mai, Amitabh Mahapatra, Svetlana Lazebnik, D. A. Forsyth, Anand Bhattad

All three classifiers are denied access to image pixels, and look only at derived geometric features.

Street TryOn: Learning In-the-Wild Virtual Try-On from Unpaired Person Images

1 code implementation • 27 Nov 2023 • Aiyu Cui, Jay Mahajan, Viraj Shah, Preeti Gomathinayagam, Chang Liu, Svetlana Lazebnik

By contrast, it is hard to collect paired data for in-the-wild scenes, and therefore, virtual try-on for casual images of people with more diverse poses against cluttered backgrounds is rarely studied.

Ranked #1 on Virtual Try-on on StreetTryOn

Image Generation • Semantic Segmentation • +4

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs

1 code implementation • 22 Nov 2023 • Viraj Shah, Nataniel Ruiz, Forrester Cole, Erika Lu, Svetlana Lazebnik, Yuanzhen Li, Varun Jampani

Experiments on a wide range of subject and style combinations show that ZipLoRA can generate compelling results with meaningful improvements over baselines in subject and style fidelity while preserving the ability to recontextualize.

JoIN: Joint GANs Inversion for Intrinsic Image Decomposition

no code implementations • 18 May 2023 • Viraj Shah, Svetlana Lazebnik, Julien Philip

In this work, we propose to solve ill-posed inverse imaging problems using a bank of Generative Adversarial Networks (GAN) as a prior and apply our method to the case of Intrinsic Image Decomposition for faces and materials.

Image Relighting • Intrinsic Image Decomposition

One-Shot Stylization for Full-Body Human Images

no code implementations • 14 Apr 2023 • Aiyu Cui, Svetlana Lazebnik

Since body shape deformation is an essential component of an art character's style, we incorporate a novel skeleton deformation module to reshape the pose of the input person and modify the DiOr pose-guided person generator to be more robust to the rescaled poses falling outside the distribution of the realistic poses that the generator is originally trained on.

Robust Online Video Instance Segmentation with Track Queries

1 code implementation • 16 Nov 2022 • Zitong Zhan, Daniel McKee, Svetlana Lazebnik

We propose a fully online transformer-based video instance segmentation model that performs comparably to top offline methods on the YouTube-VIS 2019 benchmark and considerably outperforms them on UVO and OVIS.

Ranked #13 on Video Instance Segmentation on OVIS validation

Image Segmentation • Instance Segmentation • +6

MultiStyleGAN: Multiple One-shot Image Stylizations using a Single GAN

no code implementations • 8 Oct 2022 • Viraj Shah, Ayush Sarkar, Sudharsan Krishnakumar Anitha, Svetlana Lazebnik

Recent approaches for one-shot stylization such as JoJoGAN fine-tune a pre-trained StyleGAN2 generator on a single style reference image.

One-Shot Face Stylization

Transfer of Representations to Video Label Propagation: Implementation Factors Matter

no code implementations • 10 Mar 2022 • Daniel McKee, Zitong Zhan, Bing Shuai, Davide Modolo, Joseph Tighe, Svetlana Lazebnik

This work studies feature representations for dense label propagation in video, with a focus on recently proposed methods that learn video correspondence using self-supervised signals such as colorization or temporal cycle consistency.

Colorization

Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents

no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang

We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.

Multi-Object Tracking with Hallucinated and Unlabeled Videos

no code implementations • 19 Aug 2021 • Daniel McKee, Bing Shuai, Andrew Berneshawi, Manchen Wang, Davide Modolo, Svetlana Lazebnik, Joseph Tighe

Next, to tackle harder tracking cases, we mine hard examples across an unlabeled pool of real videos with a tracker trained on our hallucinated video data.

Multi-Object Tracking • Object

Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing

1 code implementation • ICCV 2021 • Aiyu Cui, Daniel McKee, Svetlana Lazebnik

We propose a flexible person generation framework called Dressing in Order (DiOr), which supports 2D pose transfer, virtual try-on, and several fashion editing tasks.

Ranked #1 on Pose Transfer on Deep-Fashion

Fashion Synthesis • Pose Transfer • +1

GridToPix: Training Embodied Agents with Minimal Supervision

no code implementations • ICCV 2021 • Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing

While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.

PointGoal Navigation • Reinforcement Learning (RL) • +1

Bridging the Imitation Gap by Adaptive Insubordination

no code implementations • NeurIPS 2021 • Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.

Imitation Learning • Memorization • +2

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

no code implementations • ECCV 2020 • Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

Autonomous agents must learn to collaborate.

Memory-Efficient Incremental Learning Through Feature Adaptation

no code implementations • ECCV 2020 • Ahmet Iscen, Jeffrey Zhang, Svetlana Lazebnik, Cordelia Schmid

We assume that the model is updated incrementally for new classes as new data becomes available sequentially. This requires adapting the previously stored feature vectors to the updated feature space without having access to the corresponding original training images.

Incremental Learning

Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation

no code implementations • 28 May 2019 • Zih-Siou Hung, Arun Mallya, Svetlana Lazebnik

The previous VTransE model maps entities and predicates into a low-dimensional embedding vector space where the predicate is interpreted as a translation vector between the embedded features of the bounding box regions of the subject and the object.

Graph Generation • Object • +4

Two Body Problem: Collaborative Visual Task Completion

no code implementations • CVPR 2019 • Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi

Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.

Task 2 Vocal Bursts Valence Prediction

Revisiting Image-Language Networks for Open-ended Phrase Detection

3 code implementations • 17 Nov 2018 • Bryan A. Plummer, Kevin J. Shih, Yichen Li, Ke Xu, Svetlana Lazebnik, Stan Sclaroff, Kate Saenko

Most existing work that grounds natural language phrases in images starts with the assumption that the phrase in question is relevant to the image.

object-detection • Object Detection • +1

Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering

no code implementations • NeurIPS 2018 • Medhini Narasimhan, Svetlana Lazebnik, Alexander G. Schwing

Given a question-image pair, deep network techniques have been employed to successively reduce the large set of facts until one of the two entities of the final remaining fact is predicted as the answer.

Factual Visual Question Answering • General Knowledge • +2

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

no code implementations • CVPR 2018 • Unnat Jain, Svetlana Lazebnik, Alexander Schwing

In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.

Ranked #7 on Visual Dialog on VisDial v0.9 val

Image Captioning • Question Answering • +4

Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights

1 code implementation • ECCV 2018 • Arun Mallya, Dillon Davis, Svetlana Lazebnik

This work presents a method for adapting a single, fixed deep neural network to multiple tasks without affecting performance on already learned tasks.

Ranked #1 on Continual Learning on ImageNet (Fine-grained 6 Tasks)

Continual Learning • Quantization

Conditional Image-Text Embedding Networks

1 code implementation • ECCV 2018 • Bryan A. Plummer, Paige Kordas, M. Hadi Kiapour, Shuai Zheng, Robinson Piramuthu, Svetlana Lazebnik

This paper presents an approach for grounding phrases in images which jointly learns multiple text-conditioned embeddings in a single end-to-end model.

Phrase Grounding

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

no code implementations • NeurIPS 2017 • Liwei Wang, Alexander G. Schwing, Svetlana Lazebnik

This paper explores image caption generation using conditional variational auto-encoders (CVAEs).

Caption Generation

PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning

4 code implementations • CVPR 2018 • Arun Mallya, Svetlana Lazebnik

This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting.

Ranked #5 on Continual Learning on ImageNet (Fine-grained 6 Tasks)

Continual Learning • Network Pruning

Enhancing Video Summarization via Vision-Language Embedding

no code implementations • CVPR 2017 • Bryan A. Plummer, Matthew Brown, Svetlana Lazebnik

This paper addresses video summarization, or the problem of distilling a raw video into a shorter form while still capturing the original story.

Video Summarization

Learning Two-Branch Neural Networks for Image-Text Matching Tasks

1 code implementation • 11 Apr 2017 • Liwei Wang, Yin Li, Jing Huang, Svetlana Lazebnik

Image-language matching tasks have recently attracted a lot of attention in the computer vision field.

Image-text matching • Retrieval • +4

Recurrent Models for Situation Recognition

no code implementations • ICCV 2017 • Arun Mallya, Svetlana Lazebnik

This work proposes Recurrent Neural Network (RNN) models to predict structured 'image situations' -- actions and noun entities fulfilling semantic roles related to the action.

Ranked #10 on Grounded Situation Recognition on SWiG

Grounded Situation Recognition • Human-Object Interaction Detection • +1

Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues

1 code implementation • ICCV 2017 • Bryan A. Plummer, Arun Mallya, Christopher M. Cervantes, Julia Hockenmaier, Svetlana Lazebnik

This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues.

Attribute • Position • +2

Combining Multiple Cues for Visual Madlibs Question Answering

no code implementations • 1 Nov 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper presents an approach for answering fill-in-the-blank multiple choice questions from the Visual Madlibs dataset.

Attribute • General Classification • +3

Solving Visual Madlibs with Multiple Cues

no code implementations • 11 Aug 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.

Activity Prediction • Attribute • +4

Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering

no code implementations • 16 Apr 2016 • Arun Mallya, Svetlana Lazebnik

This paper proposes deep convolutional network models that utilize local and global context to make human activity label predictions in still images, achieving state-of-the-art performance on two recent datasets with hundreds of labels each.

Ranked #6 on Human-Object Interaction Detection on HICO

General Classification • Human-Object Interaction Detection • +4

Adaptive Object Detection Using Adjacency and Zoom Prediction

1 code implementation • CVPR 2016 • Yongxi Lu, Tara Javidi, Svetlana Lazebnik

Compared to methods based on fixed anchor locations, our approach naturally adapts to cases where object instances are sparse and small.

Object • object-detection • +1

Where to Buy It: Matching Street Clothing Photos in Online Shops

no code implementations • ICCV 2015 • M. Hadi Kiapour, Xufeng Han, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg

In this paper, we define a new task, Exact Street to Shop, where our goal is to match a real-world example of a garment item to the same item in an online shop.

Retrieval

Learning Informative Edge Maps for Indoor Scene Layout Prediction

no code implementations • ICCV 2015 • Arun Mallya, Svetlana Lazebnik

We learn to predict 'informative edge' probability maps using two recent methods that exploit local and global context, respectively: structured edge detection forests, and a fully convolutional network for pixelwise labeling.

Edge Detection

Learning Deep Structure-Preserving Image-Text Embeddings

no code implementations • CVPR 2016 • Liwei Wang, Yin Li, Svetlana Lazebnik

This paper proposes a method for learning joint embeddings of images and text using a two-branch neural network with multiple layers of linear projections followed by nonlinearities.

Ranked #15 on Image Retrieval on Flickr30K 1K test

Image Retrieval • Metric Learning • +2

Active Object Localization with Deep Reinforcement Learning

3 code implementations • ICCV 2015 • Juan C. Caicedo, Svetlana Lazebnik

We present an active detection model for localizing objects in scenes.

Active Object Localization • Object • +2

Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models

2 code implementations • ICCV 2015 • Bryan A. Plummer, Li-Wei Wang, Chris M. Cervantes, Juan C. Caicedo, Julia Hockenmaier, Svetlana Lazebnik

The Flickr30k dataset has become a standard benchmark for sentence-based image description.

Ranked #17 on Image Retrieval on Flickr30K 1K test

Retrieval • Sentence

Training Deeper Convolutional Networks with Deep Supervision

1 code implementation • 11 May 2015 • Liwei Wang, Chen-Yu Lee, Zhuowen Tu, Svetlana Lazebnik

One of the most promising ways of improving the performance of deep convolutional neural networks is by increasing the number of convolutional layers.

General Classification

Scene Parsing with Object Instances and Occlusion Ordering

no code implementations • CVPR 2014 • Joseph Tighe, Marc Niethammer, Svetlana Lazebnik

This work proposes a method to interpret a scene by assigning a semantic label at every pixel and inferring the spatial extent of individual object instances together with their occlusion relationships.

Object • Scene Parsing • +1

Multi-scale Orderless Pooling of Deep Convolutional Activation Features

no code implementations • 7 Mar 2014 • Yunchao Gong, Li-Wei Wang, Ruiqi Guo, Svetlana Lazebnik

Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition.

Classification • General Classification • +2

Finding Things: Image Parsing with Regions and Per-Exemplar Detectors

no code implementations • CVPR 2013 • Joseph Tighe, Svetlana Lazebnik

This paper presents a system for image parsing, or labeling each pixel in an image with its semantic category, aimed at achieving broad coverage across hundreds of object categories, many of them sparsely sampled.

Object

Learning Binary Codes for High-Dimensional Data Using Bilinear Projections

no code implementations • CVPR 2013 • Yunchao Gong, Sanjiv Kumar, Henry A. Rowley, Svetlana Lazebnik

Recent advances in visual recognition indicate that to achieve good retrieval and classification accuracy on large-scale datasets like ImageNet, extremely high-dimensional visual descriptors, e.g., Fisher Vectors, are needed.

Classification • Code Generation • +4

A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics

no code implementations • 18 Dec 2012 • Yunchao Gong, Qifa Ke, Michael Isard, Svetlana Lazebnik

This paper investigates the problem of modeling Internet images and associated text or tags for tasks such as image-to-image search, tag-to-image search, and image-to-tag search (image annotation).

Clustering • Image Retrieval • +2

Angular Quantization-based Binary Codes for Fast Similarity Search

no code implementations • NeurIPS 2012 • Yunchao Gong, Sanjiv Kumar, Vishal Verma, Svetlana Lazebnik

Such data typically arises in a large number of vision and text applications where counts or frequencies are used as features.

Quantization • Retrieval • +1

Locality-sensitive binary codes from shift-invariant kernels

no code implementations • NeurIPS 2009 • Maxim Raginsky, Svetlana Lazebnik

This paper addresses the problem of designing binary codes for high-dimensional data such that vectors that are similar in the original space map to similar binary strings.

Near-minimax recursive density estimation on the binary hypercube

no code implementations • NeurIPS 2008 • Maxim Raginsky, Svetlana Lazebnik, Rebecca Willett, Jorge Silva

This paper describes a recursive estimation procedure for multivariate binary densities using orthogonal expansions.

Density Estimation
